MathGroup Archive 2008


Processing large data sets

  • To: mathgroup at smc.vnet.net
  • Subject: [mg94176] Processing large data sets
  • From: Ravi Balasubramanian <bravi at cs.washington.edu>
  • Date: Sat, 6 Dec 2008 06:17:07 -0500 (EST)

Hello friends,

I need to process ten large data sets (each ~150 MB, 500K rows, 19
columns) of numbers with 4 significant digits, using operations like
Principal Component Analysis (on all the files at the same time!).  The
large data sets are filling up memory quickly.

Admittedly, the data can still be downsampled along the rows, and I will
do that.  I have also maxed out the RAM on my machine (OS X 10.4, MacBook
Pro, 3 GB RAM).

But are there any other ways to do this, for example, by reducing the
amount of memory each entry of the table takes up?  Currently, I use
Import to load the table into memory.  Help is appreciated.

Ravi
University of Washington
Seattle, WA.
