MathGroup Archive 2007

Re: Designing a Flexible Mathematica Program for Data Analysis

  • To: mathgroup at smc.vnet.net
  • Subject: [mg74708] Re: Designing a Flexible Mathematica Program for Data Analysis
  • From: "Szabolcs" <szhorvat at gmail.com>
  • Date: Sun, 1 Apr 2007 04:19:14 -0400 (EDT)
  • References: <eul01m$87b$1@smc.vnet.net>

On Mar 31, 8:45 am, "5000brians" <5000bri... at gmail.com> wrote:
> I'm using Mathematica to perform data analysis for a project I've been
> working on for a few years. I have written quite a bit of Mathematica
> code for automating the task of analyzing the data.
>
> As time goes by, the type of data I collect changes slightly. I'm
> looking for a way for the various data set types I have to happily
> coexist within a single Mathematica session so I can compare and
> contrast all the data with a minimum of fuss.
>
> Let's say I want to create a plot of dataset 1 and dataset 2. In both
> cases, I want the same thing, a plot of parameter a versus parameter
> b. And let's say I have a bunch of code already written and it works
> just fine for dataset 1. But, dataset 2 is newer, and the data format
> is slightly different. The parameters a and b mean the same thing in
> both cases, it's just that they are represented differently in dataset
> 2 than in dataset 1.
>
> And let's say I have the following functions:
> readDataset: opens the text file and puts dataset into memory
> generateAandB: based on the data in the dataset, creates a list of
> parameters a and b for a dataset
> plotAandB: creates a plot of a vs b.
>
> In this case, I probably need two versions of readDataset and
> generateAandB, one for each of my dataset types. If I am using a
> List[], I am probably ok with one plotAandB function.
>
> But how can I automate this solution? What if I want to read in 30
> datasets with 10 different dataset types?
>
> I can't do plotAandB[generateAandB[readDataset[#]]]& /@
> listOfDatasets, because I need 10 different generateAandBs and 10
> different readDatasets.
>
> How can I make my code "datatype aware"?
>
> I know this is a bit long winded - sorry about that.
>
> Thanks for any help,
> Brian

I am not sure I understand what your problem is here. Could you give a
specific example of how you read and plot the data at the moment?

If you just want a single function name that you can map over the list
of datasets, you could carry a label along with the data and, instead
of having functions generateAandB1, generateAandB2, ..., use
generateAandB[{dataTypeOne, actualData_}] := (do something with actualData)
generateAandB[{dataTypeTwo, actualData_}] := (do something else with actualData)

(Or use the head of the expression as a label, if this is more
suitable for your application: dataTypeOne[actualData])
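
To make the labelling idea concrete, here is a minimal sketch. The two
formats are made up for illustration (I am assuming type one stores rows
as {a, b} and type two stores them as {b, a}); your real definitions
would do whatever conversion each format needs.

```mathematica
ClearAll[generateAandB, dataTypeOne, dataTypeTwo];

(* Label carried as the first element of a list.
   Assumed formats: type one rows are {a, b}, type two rows are {b, a}. *)
generateAandB[{dataTypeOne, actualData_}] := actualData
generateAandB[{dataTypeTwo, actualData_}] := Reverse /@ actualData

(* The same dispatch with the label used as the head of the expression *)
generateAandB[dataTypeOne[actualData_]] := actualData
generateAandB[dataTypeTwo[actualData_]] := Reverse /@ actualData

(* Toy data, not from any real file *)
labelled = {
   {dataTypeOne, {{1, 10}, {2, 20}}},
   {dataTypeTwo, {{30, 3}, {40, 4}}}
   };

(* One function name, mapped over datasets of both types *)
result = generateAandB /@ labelled
(* {{{1, 10}, {2, 20}}, {{3, 30}, {4, 40}}} *)
```

Mathematica picks the definition whose pattern matches the label, so the
calling code never needs to know which type it is handling.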

Mathematica's symbolic expressions are very flexible. You can use them
to represent many things.

Do you have file names inside listOfDatasets? If you can determine the
data format from the file name, you could use

readDataset[filename_?dataFormatOneQ] := {dataTypeOne, (read the data)}
etc.

Here dataFormatOneQ is a function that returns True if the file
contains data in the "first" format. If the file names are not enough
to determine the format, first map a function over listOfDatasets that
opens each file, inspects its contents, and attaches a format label to
the file name.
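
A possible shape for the file-name test, purely as a sketch: I am
assuming here that the format can be recognised from a suffix such as
".v1.txt" or ".v2.txt" (a hypothetical naming scheme), and that both
formats are plain tables that Import can read. Adjust the predicates to
whatever actually distinguishes your files.

```mathematica
ClearAll[dataFormatOneQ, dataFormatTwoQ, readDataset];

(* Assumption: the format is encoded in the end of the file name.
   "*" in a string pattern matches any sequence of characters. *)
dataFormatOneQ[filename_String] := StringMatchQ[filename, "*.v1.txt"]
dataFormatTwoQ[filename_String] := StringMatchQ[filename, "*.v2.txt"]

(* Each reader returns the data together with its type label.
   Import[..., "Table"] is only a stand-in for the real reading code. *)
readDataset[filename_?dataFormatOneQ] := {dataTypeOne, Import[filename, "Table"]}
readDataset[filename_?dataFormatTwoQ] := {dataTypeTwo, Import[filename, "Table"]}
```

A file name that satisfies neither predicate leaves readDataset[...]
unevaluated, which makes unrecognised files easy to spot in the output.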

With this approach, you still have separate implementations for all
your data formats, but now you have a single function name that you
can map over the list of file names.
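
Put together, the one-liner from your message then works unchanged. The
sketch below uses stand-in readers that return fixed toy data instead of
opening files, and made-up names ("old-run1", "new-run7") whose prefix is
assumed to indicate the format, so treat it as an outline rather than
working analysis code.

```mathematica
ClearAll[readDataset, generateAandB, plotAandB];

(* Stand-in readers: the real ones would open the file.
   Assumption: the name prefix tells us the format. *)
readDataset[name_String /; StringMatchQ[name, "old*"]] :=
  {dataTypeOne, {{1, 2}, {2, 4}}}
readDataset[name_String /; StringMatchQ[name, "new*"]] :=
  {dataTypeTwo, {{6, 3}, {8, 4}}}

(* One definition per data type, all under a single name *)
generateAandB[{dataTypeOne, data_}] := data
generateAandB[{dataTypeTwo, data_}] := Reverse /@ data

(* A single plotting function is enough once a and b are in a common form *)
plotAandB[ab_List] := ListPlot[ab]

listOfDatasets = {"old-run1", "new-run7"};
abLists = generateAandB[readDataset[#]] & /@ listOfDatasets;
plots = plotAandB /@ abLists;
```

Adding an eleventh data type then means adding one readDataset
definition and one generateAandB definition; the mapping code stays the
same.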

Does this help to solve the problem?

Szabolcs


