Re: Designing a Flexible Mathematica Program for Data Analysis

*To*: mathgroup at smc.vnet.net
*Subject*: [mg74708] Re: Designing a Flexible Mathematica Program for Data Analysis
*From*: "Szabolcs" <szhorvat at gmail.com>
*Date*: Sun, 1 Apr 2007 04:19:14 -0400 (EDT)
*References*: <eul01m$87b$1@smc.vnet.net>

On Mar 31, 8:45 am, "5000brians" <5000bri... at gmail.com> wrote:
> I'm using Mathematica to perform data analysis for a project I've been
> working on for a few years. I have written quite a bit of Mathematica
> code for automating the task of analyzing the data.
>
> As time goes by, the type of data I collect changes slightly. I'm
> looking for a way for the various data set types I have to happily
> coexist within a single Mathematica session so I can compare and
> contrast all the data with a minimum of fuss.
>
> Let's say I want to create a plot of dataset 1 and dataset 2. In both
> cases, I want the same thing, a plot of parameter a versus parameter
> b. And let's say I have a bunch of code already written and it works
> just fine for dataset 1. But, dataset 2 is newer, and the data format
> is slightly different. The parameters a and b mean the same thing in
> both cases, it's just that they are represented differently in dataset
> 2 than in dataset 1.
>
> And let's say I have the following functions:
> readDataset: opens the text file and puts dataset into memory
> generateAandB: based on the data in the dataset, creates a list of
> parameters a and b for a dataset
> plotAandB: creates a plot of a vs b.
>
> In this case, I probably need two versions of readDataset and
> generateAandB, one for each of my dataset types. If I am using a
> List[], I am probably ok with one plotAandB function.
>
> But how can I automate this solution? What if I want to read in 30
> datasets with 10 different dataset types?
>
> I can't do plotAandB[generateAandB[readDataset[#]]]& /@
> listOfDatasets, because I need 10 different generateAandBs and 10
> different readDatasets.
>
> How can I make my code "datatype aware"?
>
> I know this is a bit long winded - sorry about that.
>
> Thanks for any help,
> Brian

I am not sure I understand what your problem is here. Could you give a specific example of how you read and plot the data at the moment?

If you just want a single function name that you can map over the list of datasets, you could carry a label together with the data. Instead of having functions generateAandB1, generateAandB2, ..., use

    generateAandB[{dataTypeOne, actualData_}] := (do something with actualData)
    generateAandB[{dataTypeTwo, actualData_}] := (do something else with actualData)

(Or use the head of the expression as a label, if this is more suitable for your application: dataTypeOne[actualData])

Mathematica's symbolic expressions are very flexible. You can use them to represent many things.

Do you have file names inside listOfDatasets? If you can determine the data format from the file name, you could use

    readDataset[filename_?dataFormatOneQ] := {dataTypeOne, (read the data)}

etc. Here the function dataFormatOneQ returns True if the file contains data in the "first" format. If the file names are not enough to determine the format, map a function over listOfDatasets that looks inside each file and labels its file name with the format it finds.

With this approach, you still have separate implementations for all your data formats, but now you have a single function that you can map over the list of file names.

Does this help to solve the problem?

Szabolcs
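
P.S. To make the labelling idea above concrete, here is a minimal sketch. Everything specific in it is an assumption made for illustration, not something from your setup: the ".v1.txt" / ".v2.txt" file name convention, the idea that format one stores rows as {a, b} while format two stores them as {b, a}, and the two small example files written to $TemporaryDirectory so that the sketch runs on its own.

```mathematica
(* Hypothetical format tests: the format is guessed from the file name ending. *)
dataFormatOneQ[file_String] := StringMatchQ[file, "*.v1.txt"]
dataFormatTwoQ[file_String] := StringMatchQ[file, "*.v2.txt"]

(* One reader per format; each returns {label, data}. *)
readDataset[file_?dataFormatOneQ] := {dataTypeOne, Import[file, "Table"]}
readDataset[file_?dataFormatTwoQ] := {dataTypeTwo, Import[file, "Table"]}

(* One converter per label; both return a list of {a, b} pairs. *)
generateAandB[{dataTypeOne, data_}] := data           (* assumed: rows are {a, b} *)
generateAandB[{dataTypeTwo, data_}] := Reverse /@ data (* assumed: rows are {b, a} *)

(* A single plotting function is enough once the data is in a common form. *)
plotAandB[pairs_List] := ListPlot[pairs, AxesLabel -> {"a", "b"}]

(* Write two small example files, one in each assumed format. *)
file1 = FileNameJoin[{$TemporaryDirectory, "run1.v1.txt"}];
file2 = FileNameJoin[{$TemporaryDirectory, "run2.v2.txt"}];
Export[file1, {{1, 10}, {2, 20}}, "Table"];
Export[file2, {{10, 1}, {20, 2}}, "Table"];

(* The same function names now work for every entry in the list. *)
listOfDatasets = {file1, file2};
pairs = generateAandB[readDataset[#]] & /@ listOfDatasets;
plots = plotAandB /@ pairs;
```

With these example files, both entries of pairs should come out as the same list of {a, b} pairs, even though the files store the columns in a different order. Adding an eleventh format then means adding one more definition for each of dataFormat...Q, readDataset and generateAandB, without touching the mapping code.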