MathGroup Archive 2008

Re: Efficient way of reading matrices

  • To: mathgroup at smc.vnet.net
  • Subject: [mg85237] Re: [mg85229] Efficient way of reading matrices
  • From: "Szabolcs Horvát" <szhorvat at gmail.com>
  • Date: Mon, 4 Feb 2008 03:06:09 -0500 (EST)
  • References: <200802030432.XAA11313@smc.vnet.net>

On Feb 3, 2008 8:13 PM, Sseziwa Mukasa <mukasa at jeol.com> wrote:
>
> On Feb 2, 2008, at 11:32 PM, Szabolcs Horvát wrote:
>
> >
> > I need to read in some very big integer matrices from text files, the
> > biggest of which have a dimension of approx 200*250000.
> >
> > What is the most efficient way of doing this?
> >
> > The problem is that ReadList[#, Number, RecordLists->True] cannot do
> > this without swapping (it takes more than 10 minutes to finish), but
> > after the matrix has been read in and compacted with ToPackedArray, it
> > only takes up 200 MBs, and I can work with it comfortably.
> >
> > Are there more efficient ways to read in such big files?  Is there any
> > way to tell Mathematica that all the integers are machine-size and the
> > matrix is rectangular (i.e. not a ragged array)?
> >
> > A last resort would be to try to write a MathLink program for reading
> > the data but I would like to avoid this ...
>
> If you are reading machine numbers then BinaryReadList seems more
> appropriate.

By "machine-size number" I meant that the numbers are guaranteed to
fit into 32 bits, therefore they can be stored in a packed array.
Mathematica can deal with numbers of arbitrary size, and the fact that
it does not know beforehand how big the numbers in the file are going
to be is probably one of the reasons why it uses so much memory while
reading the file.
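
One possible way to act on this, sketched here only as an illustration and not as a tested solution for files of this size: read the text file in fixed-size chunks and pack each chunk right away, so that only one chunk at a time sits in memory in unpacked form. The helper name readPackedMatrix, the known column count ncols, and the default chunk size are all assumptions made for this example; they do not come from the thread.

```mathematica
(* Hypothetical helper: read a rectangular integer matrix from a text file
   in chunks, packing each chunk before the next one is read.
   Assumes the number of columns (ncols) is known in advance and that every
   entry is a machine-size integer. *)
readPackedMatrix[file_String, ncols_Integer, chunkRows_Integer: 1000] :=
 Module[{stream, chunk, chunks = {}},
  stream = OpenRead[file];
  While[
   (* read at most chunkRows*ncols numbers from the current position *)
   chunk = ReadList[stream, Number, chunkRows*ncols];
   chunk =!= {},
   (* ToPackedArray leaves the list unchanged if it cannot be packed *)
   AppendTo[chunks, Developer`ToPackedArray[chunk]]
   ];
  Close[stream];
  (* join the flat chunks and reshape into rows of length ncols *)
  Partition[Join @@ chunks, ncols]
  ]
```

It could be called as readPackedMatrix["data.txt", 200], where the file name is a placeholder, and Developer`PackedArrayQ can be applied to the result to confirm that it is packed. Whether this is actually faster than a single ReadList call would have to be measured on the real data.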


