MathGroup Archive 2007

Re: ReadList -- file size limits?

  • To: mathgroup at smc.vnet.net
  • Subject: [mg82878] Re: ReadList -- file size limits?
  • From: Aranthon <a.dwarf at gmail.com>
  • Date: Fri, 2 Nov 2007 03:27:33 -0500 (EST)
  • References: <fg9pau$moq$1@smc.vnet.net><fgcafp$9ml$1@smc.vnet.net>

This gets back to a question I asked a few weeks ago - does
Mathematica have any hard-coded file limits?  I have an enormous file
(results of a simulation - basically a 104 x 108 x 21 x 5500 block of
32-bit floats in binary format) that I'd like to analyze in
Mathematica.  Obviously, since the file weighs in at about 5 GB, I
won't be reading it in whole.  But when I try to just open the file,
Mathematica immediately tells me that I'm at the end of the file, even
before I start reading.
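
For scale, the size implied by those dimensions can be checked directly. The sketch below just does the arithmetic; the comparison against 32-bit offset limits is only a guess at one possible cause, not something confirmed for Mathematica.

```mathematica
(* Dimensions from the post: 104 x 108 x 21 values per step, 5500 steps *)
dims = {104, 108, 21};
nSteps = 5500;
bytesPerValue = 4;   (* one 32-bit float *)

bytesPerStep = bytesPerValue*Times @@ dims   (* 943488 *)
totalBytes = bytesPerStep*nSteps             (* 5189184000, about 5.2 GB *)

(* Assumption: a file offset held in a 32-bit integer would overflow here *)
totalBytes > 2^31 - 1   (* True: past the signed 32-bit limit *)
totalBytes > 2^32 - 1   (* True: past the unsigned 32-bit limit as well *)
```

If a 32-bit offset really were the limit, any file beyond roughly 2 GB (signed) or 4 GB (unsigned) would be affected, which would be consistent with a 5 GB file failing at open.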

I know that the file isn't corrupt, since I can read it perfectly well
using another CAS package.  It's not a show-stopping point, since I
wrote a little C++ program that will let me extract a certain range of
steps and save them in a new file.  I'm just curious about what's
causing the problem, and how large you can go before Mathematica won't
even look at the file.
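
For a file that Mathematica will open (the extracted one, say), pulling out a range of steps might look like the sketch below. readSteps is a hypothetical helper, not something from this thread, and it assumes the values are stored step after step as Real32 in the machine's native byte order.

```mathematica
(* Hypothetical helper. Assumes a flat sequence of Real32 values, stored one
   step after another in native byte order; BinaryReadList accepts a
   ByteOrdering option if the file was written otherwise. *)
readSteps[file_String, dims_List, first_Integer, count_Integer] :=
 Module[{stream, valuesPerStep, raw},
  valuesPerStep = Times @@ dims;
  stream = OpenRead[file, BinaryFormat -> True];
  (* skip the steps before `first`: 4 bytes per Real32 value *)
  SetStreamPosition[stream, 4*valuesPerStep*(first - 1)];
  raw = BinaryReadList[stream, "Real32", count*valuesPerStep];
  Close[stream];
  (* one flat list per step; the axis order inside a step is not stated
     in the post, so no further reshaping is attempted *)
  Partition[raw, valuesPerStep]]
```

A call such as readSteps["sim.bin", {104, 108, 21}, 100, 10] (file name made up) would return 10 lists of 235872 values each, about 9.4 MB of raw data, without holding the rest of the file in memory.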

Cheers,

Greg

On Nov 1, 6:40 am, David Bailey <dave at Remove_Thisdbailey.co.uk> wrote:
> david.sedar... at forbrf.lth.se wrote:
> > Is ReadList limited in some way regarding the amount of data it can read
> > in?
>
> > I have a large data file (~450M text file) and I'm doing something like
> > the following:
>
> > stream = OpenRead[fullname]; (*open file for reading*)
> > header = ReadList[stream, String, 1](*read header strings*)
> > header = Flatten[StringSplit[header]];
> > data = ReadList[stream, Number, 8(1×10^6)];(*read 8 columns of number
> > data*)
> > Close[stream];
>
> > Apparently ReadList doesn't read in the whole file.  The code shown above
> > runs fine, but data has max length of: 3278320, which corresponds to
> > 409790 lines of the data file.  Can anyone clue me in as to why this is?
> > Is this a suitable approach for reading numbers from a very large ASCII
> > file?
>
> > thanks,
>
> > DS
>
> First, I would check that your data file is not corrupted in some way. I
> have seen data files like this generated by equipment of various sorts
> that threw the occasional glitch!
>
> Anyway, 450M is a really large file, and if Mathematica reads it all in
> before processing, that is a very large chunk of memory before it even
> starts processing. If you are using a 32-bit operating system (such as
> 32-bit Windows) that will have already consumed a fair bit of your
> addressable memory. There may also be some hard-coded limits in Mathematica.
>
> Since ReadList can take a stream argument, which stays open after the
> call, you could read the data in chunks, and assemble it afterwards - or
> even process it in chunks and avoid holding it all in memory at the same
> time.
>
> An ultimate solution would be to open and read the file in Java (using
> J/Link).
>
> It is hard to be specific without seeing the file - which would probably
> be a bit large to append to your message:)
>
> David Bailey
> http://www.dbaileyconsultancy.co.uk
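
The chunked reading David suggests could be sketched roughly as follows. readInChunks and its arguments are hypothetical names, and the sketch assumes one header line followed by rows that each hold exactly ncols whitespace-separated numbers.

```mathematica
(* Hypothetical helper, not from the thread. Assumes one header line followed
   by rows of exactly ncols whitespace-separated numbers. *)
readInChunks[file_String, ncols_Integer, rowsPerChunk_Integer, process_] :=
 Module[{stream, header, chunk, results = {}},
  stream = OpenRead[file];
  header = StringSplit[Read[stream, String]];   (* header line -> column names *)
  (* ReadList resumes where the previous call stopped and returns {}
     once the stream is exhausted *)
  While[(chunk = ReadList[stream, Table[Number, {ncols}], rowsPerChunk]) =!= {},
   AppendTo[results, process[chunk]]];
  Close[stream];
  {header, results}]
```

For example, readInChunks["data.txt", 8, 100000, Total] (file name made up) would return the header together with the column sums of each 100000-row block, so only one block of rows is in memory at a time.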



