MathGroup Archive 2008

Re: ByteCount of imported machine-precision data matrix three times

  • To: mathgroup at smc.vnet.net
  • Subject: [mg92191] Re: ByteCount of imported machine-precision data matrix three times
  • From: Szabolcs Horvát <szhorvat at gmail.com>
  • Date: Mon, 22 Sep 2008 07:10:27 -0400 (EDT)
  • Organization: University of Bergen
  • References: <gb7odj$nij$1@smc.vnet.net>

Gareth Russell wrote:
> Hi,
> 
> I am encountering some strange memory-related behavior when importing 
> numerical data from a file. If anyone is interested, a (small) example 
> file is here:
> 
> http://web.njit.edu/~russell/Mathematica.html
> 
> It's a simple 2D array of numbers. The issue is that when imported, 
> ByteCount[] indicates that the resultant expression takes up more than 
> three times as much memory as an equivalent machine-precision matrix 
> generated within Mathematica. All diagnostics that I can think of 
> indicate that the imported expression is equivalent in precision. And 
> indeed, ByteCount applied to individual elements of each matrix returns 
> 16 as an answer. It's only the overall ByteCount which is hugely 
> different.
> 
> I discovered a workaround: if I generate a dummy matrix of 0. elements 
> (which has the smaller ByteCount), and add it to the imported matrix, 
> the result, while appearing identical (as it should), now also has the 
> smaller ByteCount.
> 
> Does anyone know what is going on here? Until I discovered the 
> workaround it was a problem, as I need to read in a large number of 
> much larger matrices all together, and was encountering memory issues.
> 

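In fact not all the numbers are machine precision, and a single 
non-machine number is enough to prevent Mathematica from storing the 
matrix as a packed array.  Try checking each number in the array: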

Map[MachineNumberQ, data, {-1}]
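For a large matrix the table of True/False values is hard to scan, so a 
summary may be more useful.  A minimal sketch, assuming `data` holds the 
imported matrix as above (what the offending entries turn out to be 
depends on the file; exact integers are one possibility):

```mathematica
(* Count the entries that are not machine-precision numbers *)
Count[data, x_ /; ! MachineNumberQ[x], {-1}]

(* List the distinct offending entries so their type can be inspected *)
Union @ Cases[data, x_ /; ! MachineNumberQ[x], {-1}]
```

A count of 0 would mean every entry is already a machine number.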

To reduce the byte count we can first convert all numbers to machine 
precision, then pack the array:

In:= ByteCount[Developer`ToPackedArray@N[data]]
Out= 2320
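The effect can be reproduced without the file.  The sketch below uses a 
made-up matrix `mixed` (not the poster's data) in which two entries are 
exact integers, and compares it with its packed machine-precision 
version:

```mathematica
(* Hypothetical 2x3 matrix: the entries 2 and 6 are exact integers *)
mixed = {{1., 2, 3.}, {4., 5., 6}};

(* Convert everything to machine reals, then pack *)
packed = Developer`ToPackedArray @ N[mixed];

Developer`PackedArrayQ[mixed]    (* False: mixed integers and reals *)
Developer`PackedArrayQ[packed]   (* True: rectangular array of machine reals *)

(* Compare the two sizes *)
{ByteCount[mixed], ByteCount[packed]}
```

This is probably also why adding a matrix of 0. elements helps: the sum 
consists entirely of machine reals, so the result can be packed.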

A word of caution about using ByteCount to check memory use:  if we 
have a list y={x,x,x,x}, its byte count will be about four times as 
large as x's (plus a few bytes), but in fact x may be stored only a 
*single* time by Mathematica, and y might just reference the same piece 
of memory four times.
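
A short sketch of this caveat, using a made-up symbolic list `x` (the 
undefined symbol `a` is only there for illustration).  The ByteCount 
documentation notes that it does not take account of any sharing of 
subexpressions, so the second result should be roughly four times the 
first regardless of how y is actually stored:

```mathematica
(* Hypothetical list of 1000 symbolic elements a[1], ..., a[1000] *)
x = Array[a, 1000];

(* y refers to x four times *)
y = {x, x, x, x};

ByteCount[x]
ByteCount[y]   (* each copy of x is counted separately *)
```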

