Re: structure array equivalent in Mathematica
- To: mathgroup at smc.vnet.net
- Subject: [mg67239] Re: structure array equivalent in Mathematica
- From: albert <awnl at arcor.de>
- Date: Wed, 14 Jun 2006 06:28:46 -0400 (EDT)
- References: <NDBBJGNHKLMPLILOIPPOAEHIFAAA.djmp@earthlink.net> <e6li4e$ni8$1@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
Hi Kevin,

> Yes, this is definitely along the lines I was hoping for. Of course,
> Part is the primary means of extracting elements of an array.
> However, I need a means of assigning names to elements of those
> lists. I deal with reasonably large datasets where there may be many
> elements in a list and trying to remember that elements of list 78
> are quality flags is not very effective.
>
> I guess the difference here is that the original data is still in one
> nested list with the pure functions there to extract the appropriate
> components whereas with a structure array the data itself is already
> organized in that fashion.
>
> I will see where I can go with this approach.

If you are handling large datasets, you should definitely store your data in unstructured nested lists and use other means (basically, defining "access functions") so that you do not need to remember the indices of particular kinds of data. Below I have included an answer I formulated yesterday but forgot to post. It runs along the lines of the other answers, but maybe there is something useful in it.

hth

albert

The following is from yesterday :-)

> Like many people I imagine, I'm transitioning to Mathematica from a
> background in another system.
> One of the common data types is the structure array.  Let's say I have
> an observational data set that includes pressure, temperature, and
> water vapor as a function of altitude.  So, in pseudo-code I might
> define a structure as
>
> observation = {pressure: float(100), temperature: float(100),
> water_vapor: float(100)}

No declaration is needed in Mathematica: you can put anything into a list at any place.
For your example I start off with some random data; of course, you will get your data from somewhere:

    pressure1 = Table[Random[], {100}]
    temperature1 = Table[Random[], {100}]
    watervapor1 = Table[Random[], {100}]

The usual way to store these values is to just put them in a list:

    observation1 = {pressure1, temperature1, watervapor1}

> Furthermore, I could aggregate these observations into a larger list, e.g.
> obs_day = {observation, observation, observation}
> to be accessed as
> obs_day[1].pressure for the first element (assuming 1-index).

Having done the above for three different observations, you can collect them into a single list:

    observationdays = {observation1, observation2, observation3}

Of course, you will usually construct this list in a different way, by reading a file or getting data from a database. Once the data is in this form, you can access it like this:

    observationdays[[1, 1, 55]]

which is shorthand for

    Part[observationdays, 1, 1, 55]

and gives the 55th pressure of day one.

> I could then access the elements of this observation as
>
> observation.pressure
> observation.temperature, etc.
>
> Now, the list in Mathematica is quite powerful and I think can be
> set-up in a similar fashion.
>
> So my question is how is the structure array commonly implemented in
> Mathematica or its equivalent?

The above does not give you an obvious way to see which part of the list has which meaning, since there are no names for the list entries. Mathematica has nothing like the "structs" you are used to, but there are plenty of ways to make the data appear more structured, and you have to choose whichever is appropriate for your problem.

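As a small illustration of the positional indexing described above, here is a sketch on a tiny hand-made data set. The names obs1, obs2 and days and all the numbers are made up for this example; they are not from the original question.

    (* Two hypothetical "observations", each {pressures, temperatures, water vapor} *)
    obs1 = {{1000., 900., 800.}, {15., 10., 5.}, {0.9, 0.5, 0.2}};
    obs2 = {{1010., 910., 810.}, {16., 11., 6.}, {0.8, 0.4, 0.1}};
    days = {obs1, obs2};

    days[[2, 1, 3]]      (* third pressure of day two: 810. *)
    days[[All, 1, 1]]    (* first pressure of every day: {1000., 1010.} *)
    days[[All, 2]]       (* the temperature lists of all days *)

Using All in Part picks out one kind of data across every day without any looping, which is one reason the plain nested list is convenient.
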
Here are some possibilities. There are many more, and something else may suit your purposes better, but that I can't say:

1) Define "names" for the pressure/temperature/watervapor parts of the data, like:

    pressure = 1
    temperature = 2
    watervapor = 3

   Then you can use:

    observationdays[[1, pressure, 55]]

   which makes your code easier to read but leaves the data as it was.

2) Define functions for accessing the data (again, the data stays in just a big list of lists):

    Clear[pressure, temperature, watervapor]
    pressure[x_List] := x[[1]]
    temperature[x_List] := x[[2]]
    watervapor[x_List] := x[[3]]

   and use them like:

    pressure[observationdays[[1]]]

   or:

    pressure@observationdays[[1]]

3) Another widely used approach to organizing data in Mathematica is the use of rules, like:

    Clear[pressure, temperature, watervapor]
    observation1 = {pressure -> Table[Random[], {100}],
      temperature -> Table[Random[], {100}],
      watervapor -> Table[Random[], {100}]}

   which is probably closest to the "structs" you are trying to imitate. You can then access the pressures by applying these rules to pressure:

    pressure /. observation1

   Of course, you can define various versions of access functions for this construct, too.

4) Use downvalues instead of lists, like:

    Clear[observationday]
    observationday[1] = observation1
    observationday[2] = observation2
    ...

   Then you can access the data with single brackets:

    observationday[1]

   Combine this with whatever seems appropriate from 1 to 3. This is often more useful than constructing a big list with many calls to Append or AppendTo.

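To show how possibilities 3 and 4 might be combined, here is a minimal sketch. The names obsday and get and the toy numbers are hypothetical, chosen only for this illustration:

    Clear[obsday, get, pressure, temperature, watervapor]

    (* Possibility 4: one downvalue per day; possibility 3: rules inside each day *)
    obsday[1] = {pressure -> {1000., 900.}, temperature -> {15., 10.},
      watervapor -> {0.9, 0.5}};
    obsday[2] = {pressure -> {1010., 910.}, temperature -> {16., 11.},
      watervapor -> {0.8, 0.4}};

    (* Hypothetical access function: look up the named part of a given day *)
    get[day_Integer, what_Symbol] := what /. obsday[day]

    get[2, temperature]       (* {16., 11.} *)
    get[1, pressure][[2]]     (* 900. *)

This reads much like obs_day[1].pressure in the pseudo-code, at the price of storing rules rather than a plain numeric array.
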
Since I suppose you work with rather large datasets, you should note that Mathematica stores and accesses arrays consisting only of numbers of the same type (Reals here) much more efficiently than any construct that contains rules (or anything else); look up Developer`ToPackedArray if you are interested. So for large datasets I would recommend using the simple data structure as in examples 1 and 2 and making sure the lists are converted to packed arrays (which Mathematica usually manages on its own). Then write access functions that fit your needs as well as possible, as explained above.

When writing bigger programs, it might be helpful to wrap a special header around the monstrous list to make clear how the data is to be interpreted, e.g.:

    odata = observationdata[{observation1, observation2, observation3}]

You can then use that header to check the arguments in your access functions:

    observationday[data_observationdata, day_, what_] :=
      data[[1, day, what /. {pressure -> 1, temperature -> 2, watervapor -> 3}]]
    observationday[data_observationdata, day_, what_, index_] :=
      data[[1, day, what /. {pressure -> 1, temperature -> 2, watervapor -> 3}, index]]

Then access the data in odata with:

    observationday[odata, 1, pressure][[55]]
    observationday[odata, 1, pressure, 55]

You can put in more information about the data and/or define special formats etc. for observationdata if needed, even combining the big list with the rule-based approach mentioned above:

    odata = observationdata[{observation1, ...}, Name -> "name of dataset",
      StartDate -> {2006, 03, 01}, EndDate -> {2006, 04, 01}]
    Format[observationdata[data_, info___]] :=
      StringJoin["observationdata[<", Name /. {info}, ">]"]

From there you can probably see how to improve further, according to your needs. Depending on the structure of your data, it might also be worth looking into the documentation for SparseArray.
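
If you want to check the point about packed arrays yourself, a sketch along these lines should do. The variable names plain and packed and the numbers are made up for the example; the two Developer` functions are the standard ones.

    Clear[pressure]

    (* A rectangular list of machine Reals can be packed *)
    plain = {{1000., 900.}, {15., 10.}};
    packed = Developer`ToPackedArray[plain];
    Developer`PackedArrayQ[packed]                        (* True *)

    (* A list that contains rules can never be a packed array *)
    Developer`PackedArrayQ[{pressure -> {1000., 900.}}]   (* False *)

Note that only the outer rule list fails to be packed; the numeric list on the right-hand side of each rule may still be packed on its own.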