random line in a very large file
- To: mathgroup at smc.vnet.net
- Subject: [mg116980] random line in a very large file
- From: Ramiro <ramiro.barrantes at gmail.com>
- Date: Sun, 6 Mar 2011 05:44:01 -0500 (EST)
Hi everyone,

I have very large files (close to 1 GB) and I want to read a random line from one of them. I would like to compare Mathematica's native commands against calling a Unix command such as sed. For example:

    file = "example";
    n = 1000000;
    Export[file, Range[n], "List"];
    i = RandomInteger[{1, n}];
    str = OpenRead[file];
    Skip[str, "String", i];
    sample1 = Read[str, Expression];
    Print[sample1];
    Close[str];

QUESTIONS:

1) Is this the most efficient way to do it in Mathematica? It is taking too long for my purposes on my files (note that the sample file above is small compared with the real data).

2) How can I call a command such as

    sed '52q;d' example

from Mathematica? I am trying

    cmd = "!sed '52q;d' example"

and then

    << cmd

but it's not working. I am not sure how to use << in this way, nor how to mix characters such as ' and ".

Any suggestions? Thanks in advance, any help appreciated,

Ramiro

P.S. The sed command above _seems_ to be faster on my files when run from the command line, which is why I am considering it as an option.
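
For reference, the two approaches in the question might be sketched as follows. This is only a minimal sketch, not a tested answer: it assumes a Unix-like system with sed on the PATH, reuses the file name "example" from the post, and fixes i = 52 for illustration. The names native and viaSed are hypothetical. It skips i - 1 lines rather than i, on the assumption that line i itself is the one wanted.

```mathematica
file = "example";   (* sample file name from the post *)
i = 52;             (* 1-based line number to fetch; fixed here for illustration *)

(* Native approach: skip the first i - 1 lines, then read line i as a string. *)
str = OpenRead[file];
Skip[str, String, i - 1];
native = Read[str, String];
Close[str];

(* External approach: a leading "!" makes ReadList run the rest of the
   string as a shell command and read its output line by line.
   Single quotes need no escaping inside a double-quoted Mathematica string. *)
cmd = "!sed '" <> ToString[i] <> "q;d' " <> file;
viaSed = ReadList[cmd, String];   (* a list of output lines *)
```

Note that << cmd does not evaluate the variable cmd; it looks for a file literally named "cmd". Passing the string to a function such as ReadList[cmd, String] or Get[cmd] avoids that. Which of the two approaches is faster on a 1 GB file would have to be measured, for example with AbsoluteTiming.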