MathGroup Archive: June 2010 [00145]

Re: Export

To: mathgroup at smc.vnet.net
Subject: [mg110250] Re: Export
From: Albert Retey <awnl at gmx-topmail.de>
Date: Thu, 10 Jun 2010 08:08:48 -0400 (EDT)
References: <huntgi$bph$1@smc.vnet.net>

Hi,
> 
> Thank you for your response. I just read this post and in the
> meantime I found this workaround:
> 
> Read a file and skip a number of records (here I skip the records
> of the earlier iterations):
> 
> strm = OpenRead[ToFileName[{NotebookDirectory[]}, "1laserGrid.txt"]];
> If[iter != 1, Skip[strm, Record, (iter - 1)*Nz]];
> EL = ReadList[strm, Expression];
> Close[strm];
> 
> Append to a file:
> 
> strm = OpenAppend[ToFileName[{NotebookDirectory[]}, "1laserGrid.txt"],
>   FormatType -> OutputForm];
> WriteString[strm, "\n" <> StringDrop[StringDrop[StringReplace[
>   ToString[ELNewNew, InputForm], "," -> "\n"], 1], -1]];
> Close[strm];
> 
> Both your solution and this one seem to give me the same problem
> which I will describe below:
> 
> The reason for using export and import is that I max out the 32 GB
> RAM on my PC. So to keep RAM down I use file I/O.
> 
> While the "write" stream takes no time at all and doesn't depend on
> the current file size, the read command gets bogged down as the file
> gets larger and larger. I naively thought that using Skip in the
> read would prevent the processor from reading in the entire file,
> which starts to take a very long time as the file size approaches
> 100,000 KB.

I think your expectation was wrong: to use something like Skip, the
program still has to read the file from the start to find where the
records begin and end. If you want to avoid rereading everything, you
need to tell the program explicitly where to read and write by giving it
a position in the file, as you would in other programming languages.
StreamPosition and SetStreamPosition exist for that purpose, and I think
they should make it possible to read from and write to a large file
with little or no overhead. The catch is that you now have to keep track
of where in the file your records are. Writing to a wrong position can
of course destroy data in the file, so you'd better test that code well
before using it in production.
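
A minimal sketch of the position-based idea, under some assumptions: the
file name, the helper names appendBlock and readBlock, and the offsets
list are all made up for illustration, and the start of each block is
taken from FileByteCount just before appending, assuming stream positions
are byte offsets into the file. It is untested on very large files.

```mathematica
(* Hypothetical demo file; the real data file would go here instead. *)
file = FileNameJoin[{$TemporaryDirectory, "laserGridDemo.txt"}];
If[FileExistsQ[file], DeleteFile[file]];

(* Byte offset at which each iteration's block starts (kept in memory). *)
offsets = {};

(* Append one iteration, one expression per line, remembering where it starts. *)
appendBlock[data_List] := Module[{strm},
  AppendTo[offsets, If[FileExistsQ[file], FileByteCount[file], 0]];
  strm = OpenAppend[file];
  Scan[Write[strm, #] &, data];
  Close[strm];
];

(* Jump straight to block k and read n expressions, without Skip. *)
readBlock[k_Integer, n_Integer] := Module[{strm, result},
  strm = OpenRead[file];
  SetStreamPosition[strm, offsets[[k]]];
  result = ReadList[strm, Expression, n];
  Close[strm];
  result
];

appendBlock[{1.0, 2.0, 3.0}];
appendBlock[{4.0, 5.0, 6.0}];
readBlock[2, 3]  (* reads only the second block *)
```

Only the list of offsets has to stay in memory, one integer per
iteration, so the bookkeeping stays small even when the file is large.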

> This is making simulations virtually impossible to run, since they are
> taking 10 hours, primarily because I keep looping through this read
> command every time I propagate my PDE, and each read is taking a
> minute or so...
>
> I'm at a loss on what to do here.....
>
> I have no choice but to use file I/O due to RAM limitations, but I
> don't see a way around my problem there either :(
>
> One way I thought of was to maintain two data files: one where I
> store all my data, and a second one where I just store the last 2
> iterations of data, since that is all I need to propagate my PDE
> forward in time anyway. However, I thought maybe I would not have
> to do that if I could use non-sequential data pulling from a file in
> Mathematica, but I guess that isn't possible?

If I understand your problem correctly, you will only need to remember
the last two write positions, so that shouldn't be a large burden.
Basically I think your idea with two files isn't bad; it will probably
be simpler and just as fast as the one-file version with stream
positions. There is a third option for handling things like this: you
could store the data in a database and read and write to it using
DatabaseLink. I have done that with good success, but of course it needs
a little knowledge about databases and some experimentation with the
DatabaseLink` package.
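
The two-file idea could look roughly like the sketch below. The file
names and the helper saveStep are hypothetical, and it assumes each
iteration is a flat list of numbers: the archive file only ever gets
appended to, while the small second file is overwritten each step with
the last two iterations, so reading it back never touches the big file.

```mathematica
(* Hypothetical file names, chosen only for this illustration. *)
archiveFile = FileNameJoin[{$TemporaryDirectory, "allIterationsDemo.txt"}];
recentFile = FileNameJoin[{$TemporaryDirectory, "lastTwoDemo.m"}];
If[FileExistsQ[archiveFile], DeleteFile[archiveFile]];

(* Append the newest iteration to the archive and overwrite the
   small file with the two iterations needed for the next step. *)
saveStep[prev_List, curr_List] := Module[{strm},
  strm = OpenAppend[archiveFile];
  Scan[Write[strm, #] &, curr];
  Close[strm];
  Put[{prev, curr}, recentFile];
];

saveStep[{1., 2., 3.}, {4., 5., 6.}];

(* Later: load just the last two iterations from the small file. *)
{prev, curr} = Get[recentFile];
```

The read cost then depends only on the size of two iterations, not on
how long the simulation has been running.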


hth,

albert
