MathGroup Archive 2012


Structure of "identical" data not equal in size

  • To: mathgroup at smc.vnet.net
  • Subject: [mg126915] Structure of "identical" data not equal in size
  • From: Paul E McHale <paulmchale at me.com>
  • Date: Sun, 17 Jun 2012 03:59:02 -0400 (EDT)
  • Delivered-to: l-mathgroup@mail-archive0.wolfram.com

(* First, generate the data internally, then write it to CSV *)
m=Table[{i,Sin[i],Cos[i]},{i,1,1000,1.0}];
SetDirectory[NotebookDirectory[]]
Export["Test2.csv",m, "CSV"]

(*Reading it back in *)
mIn = Import["Test2.csv", "CSV"]

(* compare data *)
mIn==m

>> True


(* see memory usage *)
ByteCount[m]
>> 24168

ByteCount[mIn]
>> 144040

(* what is the minimum size using reals *)
1000 * 3 * 8
>> 24000

Internally generated data: 24168 bytes
Data read back from file: 144040 bytes

(* size ratio: imported vs. internally generated *)
ByteCount[mIn]/ByteCount[m] * 1.0
>> 5.96

--------------

Why does the "same" data take up 6x the memory after being written to disk and read back in? This is a serious problem: at work we share large data sets through files, and of the two languages we use (Mathematica and C#), Mathematica is currently the only one that can't read them.

How can the two lists compare as equal when the internally generated one takes so much less memory?

Any input is welcome. For now, the only way I can work with the data is to cut it down in an external editor first.
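
One possible diagnostic, sketched under the assumption that the gap comes from packed versus unpacked array storage (nothing above confirms this): Developer`PackedArrayQ reports whether a list is stored as a packed array, and Developer`ToPackedArray attempts to pack one. The name mPacked is introduced here for illustration only.

```mathematica
(* Assumption: the size difference is due to packed vs. unpacked storage *)
Developer`PackedArrayQ[m]     (* is the generated table packed? *)
Developer`PackedArrayQ[mIn]   (* is the imported table packed? *)

(* N converts any integers to machine reals so that all entries share
   one type, which packing requires; mPacked is a hypothetical name *)
mPacked = Developer`ToPackedArray[N[mIn]];
ByteCount[mPacked]
```

If the first check returns True and the second False, and ByteCount[mPacked] comes out close to ByteCount[m], that would support the assumption.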


Thanks,
Paul







