MathGroup Archive 2013

Re: Obtaining Random LIne from A file

  • To: mathgroup at smc.vnet.net
  • Subject: [mg129823] Re: Obtaining Random LIne from A file
  • From: David Bailey <dave at removedbailey.co.uk>
  • Date: Sun, 17 Feb 2013 04:07:53 -0500 (EST)
  • References: <kfn7nt$qaj$1@smc.vnet.net>

On 16/02/2013 06:07, Ramiro Barrantes wrote:
> Hello,
>
> I would like to get a random line from a file. I know this can be done
> with Mathematica, but I am playing with sed to see if it goes
> faster. Say I want to get line 1000.
>
> In Mathematica it would be:
>
> <<"! sed -n 1000p filename.txt"
>
> However, I am trying to put the filename as a variable, say
>
> filename="hugefile.txt"
>
> cmd="! sed -n 1000p "<>filename
> <<cmd
>
> does not work.
>
> How can I do this?
>
> Lastly, I am getting a random line in Mathematica with:
>
> getRandomLine[file_, n_] :=
>    Block[{i = RandomInteger[{1, n}], str = OpenRead[file], res},
>     Skip[str, "String", i];
>     res = Read[str, Expression];
>     Close[str];
>     res[[2]]
>     ]
>
> However, it is very slow, so I was going to try sed. Any suggestions?
>
> Thanks in advance,
> Ramiro
>
>
I would stick with Mathematica for this job! How big is the file
(number of lines and number of bytes)? If it fits comfortably in
memory, I'd try reading it all in as a list of strings and picking
the one you want:

xx=ReadList["C:\\some file",String];//Timing

Then you have a list of strings, and you can pick out any line
directly by its index, e.g. xx[[1000]].
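
As a rough sketch of that approach, reading once and then choosing at
random might look something like the following. The file name and the
sample data are made up here purely so the example runs on its own;
substitute your real file.

```mathematica
(* Hypothetical sample file, written only so this sketch is self-contained. *)
demoFile = FileNameJoin[{$TemporaryDirectory, "random_line_demo.txt"}];
out = OpenWrite[demoFile];
Do[WriteString[out, "line ", k, "\n"], {k, 2000}];
Close[out];

(* Read the whole file once, one string per line. *)
lines = ReadList[demoFile, String];

(* After that, a random line or a specific line is just list access. *)
randomLine = RandomChoice[lines];
line1000 = lines[[1000]];
```

The cost of reading the file is paid once; every later lookup is a
plain list access, so this should pay off if you need many random
lines from the same file.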

Remember, the basic problem with reading at an arbitrary position in a
text file is that, if the line lengths are not all the same, any
algorithm has to read every line before the one you want! If you create
this file yourself, you should consider padding the lines to make them
all the same length - then you could jump straight to the line you want
very efficiently (but with a little more coding!)
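
To illustrate the fixed-length idea, here is a minimal sketch rather
than a finished implementation. The names pad, writeFixed and getRecord,
the record width, and the demo file are all my own assumptions. It also
assumes single-byte characters and that every line is padded with
spaces to the same width w, so record i starts at byte (i - 1)(w + 1).

```mathematica
(* Pad s with trailing spaces to exactly w characters
   (assumes StringLength[s] <= w). *)
pad[s_String, w_Integer] :=
  s <> StringJoin[Table[" ", {w - StringLength[s]}]];

(* Write the lines as fixed-width records, each followed by one newline
   byte. BinaryFormat -> True avoids any newline translation. *)
writeFixed[file_String, lines_List, w_Integer] := Module[{strm},
  strm = OpenWrite[file, BinaryFormat -> True];
  Scan[BinaryWrite[strm, pad[#, w] <> "\n"] &, lines];
  Close[strm]];

(* Jump straight to record i without reading the records before it.
   StringTrim strips whitespace from both ends of the record. *)
getRecord[file_String, i_Integer, w_Integer] := Module[{strm, chars},
  strm = OpenRead[file, BinaryFormat -> True];
  SetStreamPosition[strm, (i - 1) (w + 1)];
  chars = BinaryReadList[strm, "Character8", w];
  Close[strm];
  StringTrim[StringJoin[chars]]];

(* Hypothetical demo data and file name. *)
width = 20;
fixedFile = FileNameJoin[{$TemporaryDirectory, "fixed_width_demo.txt"}];
writeFixed[fixedFile, Table["line " <> ToString[k], {k, 2000}], width];

(* The record count follows from the file size alone. *)
nRecords = FileByteCount[fixedFile]/(width + 1);
getRecord[fixedFile, RandomInteger[{1, nRecords}], width]
```

The price is a larger file and the need to choose a width that fits
the longest line; in return each lookup costs one seek and one short
read, regardless of where the line sits in the file.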

David Bailey
http://www.dbaileyconsultancy.co.uk