Structure of "identical" data not equal in size
- To: mathgroup at smc.vnet.net
- Subject: [mg126915] Structure of "identical" data not equal in size
- From: Paul E McHale <paulmchale at me.com>
- Date: Sun, 17 Jun 2012 03:59:02 -0400 (EDT)
- Delivered-to: l-mathgroup@mail-archive0.wolfram.com
(* First, the internal data and then writing it to CSV *)
m=Table[{i,Sin[i],Cos[i]},{i,1,1000,1.0}];
SetDirectory[NotebookDirectory[]]
Export["Test2.csv",m, "CSV"]
(*Reading it back in *)
mIn = Import["Test2.csv", "CSV"]
(* compare data *)
mIn==m
>> True
(* see memory usage *)
ByteCount[m]
>> 24168
ByteCount[mIn]
>> 144040
(* what is the minimum size using reals *)
1000 * 3 * 8
>> 24000
Actual size: 24168 bytes
Read in file data: 144040 bytes
(* size ratio *)
ByteCount[mIn]/ByteCount[m] * 1.0
>> 5.96
--------------
Why does the "same" data take up 6x the memory after being written to disk and read back in? This is a serious problem: we have large data sets at work that are shared through files, and of the two languages we use (Mathematica and C#), Mathematica is currently the only one that can't read them.
How can the two pass for equal when the internally generated version takes so much less memory?
Any input is welcome. For now, my only option is to cut the data down in an external editor before I can try to work with it.
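One check that might narrow this down, assuming the gap comes from how the list is stored (packed versus unpacked arrays) rather than from the values themselves. This is only a sketch: the temporary file location and the names file and mPacked are made up for illustration, and the N[] call assumes every entry should be a machine real.

```mathematica
(* Sketch: test whether packed-array storage explains the size gap *)
m = Table[{i, Sin[i], Cos[i]}, {i, 1, 1000, 1.0}];

(* hypothetical file location, chosen so the example does not need a saved notebook *)
file = FileNameJoin[{$TemporaryDirectory, "Test2.csv"}];
Export[file, m, "CSV"];
mIn = Import[file, "CSV"];

(* is each list stored as a packed array? *)
Developer`PackedArrayQ[m]
Developer`PackedArrayQ[mIn]

(* N forces every entry to a machine real; ToPackedArray then tries to repack *)
mPacked = Developer`ToPackedArray[N[mIn]];
Developer`PackedArrayQ[mPacked]
ByteCount[mPacked]
```

If the two PackedArrayQ results differ, the extra memory would be per-element overhead in the imported list rather than extra data, which would also be consistent with mIn == m returning True.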
Thanks,
Paul