MathGroup Archive: November 2009 [00200]

Re: Conventional way of doing "struct"-like things?

To: mathgroup at smc.vnet.net
Subject: [mg104627] Re: Conventional way of doing "struct"-like things?
From: John Jowett <john.m.jowett at gmail.com>
Date: Thu, 5 Nov 2009 03:50:11 -0500 (EST)
References: <hcl43p$c1v$1@smc.vnet.net>

Erik,
         I ran into this kind of problem many years ago with a
particular kind of table format that occurs in my work. With the help
of a student, I developed a package to work with such tables
(including the ability to interface to CSV tables).  My colleagues and
I are still using them.  The basic idea was that of "self-describing"
data sets.  For the CSV format, this means that the first row of data
is a set of strings labelling the columns (your ZIPCODE, etc.).  In
fact, our tables also include other, non-column items before this
header line, which likewise describe themselves, e.g.,

"DATE","23/5/2009"
"Q1",63.347823

Then the idea was to write functions to manipulate such table objects
in all the ways you might want, e.g.,

data=mfsFromCSV["datafile.csv"];

This returns an object with head mfs. To find out what data contains,

mfsColumnNames[data]

would return
{"ZIPCODE", "LATITUDE", "LONGITUDE", "CITY", "STATE", "COUNTY",
"CLASS"}

To extract a list of rows containing data from just two columns:

mfsRow[data,{"LATITUDE","LONGITUDE"}]

The point is just that you never need to think about the index or
order of the columns, because the data describes itself.  XML is a
later, much more general approach, of course.
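
A minimal sketch of the same idea, for illustration only: it is not the
mfs package, and the names tbl, tblFromRows, tblColumnNames and tblRows
are hypothetical. It assumes the first row of the imported data holds
the column names; the two data rows are made-up samples.

```mathematica
(* Hypothetical sketch, not the mfs package.
   A table is stored as tbl[names, rows]; the first raw row is assumed
   to be the list of column names. *)
tblFromRows[raw_List] := tbl[First[raw], Rest[raw]]

tblColumnNames[tbl[names_List, _]] := names

(* Select columns by name, in the order requested, by translating
   each name into its column position. *)
tblRows[tbl[names_List, rows_List], cols : {__String}] :=
  rows[[All, cols /. Thread[names -> Range[Length[names]]]]]

(* Invented sample data in the shape Import[..., "CSV"] might give. *)
raw = {{"ZIPCODE", "LATITUDE", "LONGITUDE"},
       {"95125", 37.3, -121.89},
       {"10001", 40.75, -73.99}};

t = tblFromRows[raw];
pts = tblRows[t, {"LONGITUDE", "LATITUDE"}];
```

pts can be passed straight to ListPlot, and the calling code never
mentions a column number. A name that is not in the table is left
unreplaced, so Part reports an error instead of silently returning the
wrong column.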

There are many other functions, and the mfs data objects also remember
which file they came from, etc.

If you are interested in the package, let me know.

John Jowett




On Nov 2, 12:03 am, Erik Max Francis <m... at alcyone.com> wrote:
> I find myself (as I'm sure everyone does) dealing with lists of
> structured data, e.g., a list of lists, each of which contains n
> elements with each entry representing a unique field.  What is the most
> conventional way to identify each of these elements when iterating over
> the whole list?  Obviously I can use [[...]]/Part, but I'm wondering if
> there's something considered more elegant.
>
> Take a concrete example where I have a CSV containing ZIP code data,
> with seven fields:  the zip code, latitude, longitude, city, state,
> county, and class.  What's the usual way of addressing this?  I can
> think of two obvious ones; just define symbolic names for the indices:
>
> data = Import[..., "CSV"];
>
> {ZIPCODE, LATITUDE, LONGITUDE, CITY, STATE, COUNTY, CLASS} = Range[7];
>
> ListPlot[{#[[LONGITUDE]], #[[LATITUDE]]} & /@ data,
>   PlotStyle -> PointSize[0.001]]
>
> or define functions (with or without the symbolic index names):
>
> latitude[x_] := x[[2]];
> longitude[x_] := x[[3]];
>
> ListPlot[{longitude[#], latitude[#]} & /@ data,
>   PlotStyle -> PointSize[0.001]]
>
> What's considered more in Mathematica's style, or is there something else
> more commonly used I'm not thinking of?
>
> On that general subject, are there generally respected Mathematica
> style guides floating around somewhere?  I'm not necessarily looking for
> rigid codified rules, just for "accepted" (more or less) conventional
> ways of getting things done.  I'm relatively new to Mathematica but by
> no means new to programming and am fluent in many other languages (C,
> C++, Python, etc.), so I'm just looking for ideas about "the way things
> are done" that go well with Mathematica's functional nature but are
> still self-documenting as much as possible.
>
> Thanks.
>
> --
> Erik Max Francis && m... at alcyone.com && http://www.alcyone.com/max/
>   San Jose, CA, USA && 37 18 N 121 57 W && AIM/Y!M/Skype erikmaxfrancis
>    The chicken was the egg's idea of getting more eggs.
>     -- Samuel Butler
