MathGroup Archive 2011

random line in a very large file

  • To: mathgroup at smc.vnet.net
  • Subject: [mg116980] random line in a very large file
  • From: Ramiro <ramiro.barrantes at gmail.com>
  • Date: Sun, 6 Mar 2011 05:44:01 -0500 (EST)

Hi everyone,

I have very large files (close to 1 GB) and want to read a random line
from one of them. I would like to compare Mathematica's native commands
with calling a Unix command such as sed.  For example:

file = "example";
n = 1000000;
Export[file, Range[n], "List"];
i = RandomInteger[{1, n}];

str = OpenRead[file];
Skip[str, String, i - 1];  (* skip i - 1 lines so the next read returns line i *)
sample1 = Read[str, Expression];
Print[sample1];
Close[str];

QUESTIONS:
1) Is this the most efficient way to do it in Mathematica? It is taking
too long for my purposes on my files (note that the sample file above is
small compared with the real data).
2) How can I call a command such as:

sed '52q;d' example

I am trying to do:

cmd= "!sed '52q;d' example"

and then

<< cmd

but it is not working. I am not sure how to use << in this way, or
how to mix quote characters such as ' and ".
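
One possible approach, as a sketch rather than a tested recipe for large files: as far as I can tell, << cmd treats cmd as a literal file name instead of using the string stored in the variable, so the command is never run. ReadList accepts a string beginning with "!" and reads the output of that shell command. The snippet assumes a Unix-like system with sed on the path; the small file size and the names lineNumber and result are only for illustration.

```mathematica
(* Illustrative sketch: assumes a Unix-like system with sed on the PATH. *)
file = "example";
Export[file, Range[1000], "List"];   (* small test file, one integer per line *)

lineNumber = 52;   (* hypothetical target line *)

(* Single quotes can appear inside a Mathematica string unescaped;
   only a double quote would need a backslash. *)
cmd = "!sed '" <> ToString[lineNumber] <> "q;d' " <> file;

(* A leading "!" makes ReadList run the rest as a shell command
   and read its output, here one String per line. *)
result = ReadList[cmd, String];
First[result]
```

For a random line, lineNumber could be set from RandomInteger[{1, n}] as in the code above.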

Any suggestions?

Thanks in advance, any help appreciated,
Ramiro
P.S. The sed command above _seems_ to be faster on my files when run
from the command line, which is why I am considering it as an option.

