Re: ByteCount of imported machine-precision data matrix three times

*To*: mathgroup at smc.vnet.net
*Subject*: [mg92205] Re: ByteCount of imported machine-precision data matrix three times
*From*: Jean-Marc Gulliet <jeanmarc.gulliet at gmail.com>
*Date*: Tue, 23 Sep 2008 07:30:38 -0400 (EDT)
*Organization*: The Open University, Milton Keynes, UK
*References*: <gb7odj$nij$1@smc.vnet.net>

Gareth Russell wrote:

> I am encountering some strange memory-related behavior when importing
> numerical data from a file. If anyone is interested, a (small) example
> file is here:
>
> http://web.njit.edu/~russell/Mathematica.html
>
> It's a simple 2D array of numbers. The issue is that when imported,
> ByteCount[] indicates that the resultant expression takes up more than
> three times as much memory as an equivalent machine-precision matrix
> generated within Mathematica. All diagnostics that I can think of
> indicate that the imported expression is equivalent in precision. And
> indeed, ByteCount applied to individual elements of each matrix returns
> 16 as an answer. It's only the overall ByteCount which is hugely
> different.

*snip*

You will get the most compact form only when your data consist of numeric values of the same type (say, all machine integers or all floating-point numbers), which is not the case for your imported dataset (testing the first element only is not enough; see below). If it were the case, Mathematica would use a packed array representation [1].

Something along the lines of

    data = Developer`ToPackedArray[
       N[Drop[Import["http://web.njit.edu/~russell/Downloads/12e.dat"], 6]]];

will do what you want.

Here is a step-by-step illustration of what is going on before and after packed arrays are used.

In[1]:= data = Drop[Import["http://web.njit.edu/~russell/Downloads/12e.dat"], 6];

In[2]:= Dimensions[data]

Out[2]= {20, 14}

In[3]:= ByteCount[data]

Out[3]= 7384

In[4]:= Developer`PackedArrayQ[data]

Out[4]= False

In[5]:= MatrixQ[data, MachineNumberQ]

Out[5]= False

In[6]:= data = Developer`ToPackedArray[N[data]];

In[7]:= ByteCount[data]

Out[7]= 2360

In[8]:= Developer`PackedArrayQ[data]

Out[8]= True

In[9]:= MatrixQ[data, MachineNumberQ]

Out[9]= True

Regards,
- Jean-Marc

[1] /Performance of Linear Algebra Computation/
http://reference.wolfram.com/mathematica/tutorial/LinearAlgebraPerformance.html