MathGroup Archive 2013

Re: Importing a file and extracting data

  • To: mathgroup at
  • Subject: [mg131162] Re: Importing a file and extracting data
  • From: David Bailey <dave at>
  • Date: Sat, 15 Jun 2013 04:20:48 -0400 (EDT)
  • References: <kpeln9$htb$>

On 14/06/2013 09:54, howardfink at wrote:
> I have a series of files of this form:
> June 7, 2013
> Tc+Naphthalene  vs Temperature (oC)
> Run 1
> Tc_Naph_84_2C			3.740 ns
> Tc_Naph_87_1C			3.731 ns
> Tc_Naph_89_9C			3.720 ns
> Tc_Naph_92_9C			3.704 ns
> Tc_Naph_94_7C			3.687 ns
> Tc_Naph_97_6C			3.694 ns
> Run 2
> Tc_Naph_83_2C			3.758 ns
> Tc_Naph_83_4C			3.750 ns
> Tc_Naph_86_4C			3.728 ns
> Tc_Naph_88_1C			3.725 ns
> Tc_Naph_90_2C			3.716 ns
> Tc_Naph_93_1C			3.704 ns
> Tc_Naph_94_7C			3.673 ns
> Tc_Naph_97_7C			3.684 ns
> Tc_Naph_97_9C			3.665 ns
> I used an Import command to read in the file, but now I am just sitting and
> staring, without a clue how to get the 84_2 converted to the number 84.2,
> etc. and ending up with two lists: Run 1 and Run 2, consisting of pairs of
> temperature and time.  The temperature will eventually be converted to
> 1/absolute temperature.
> I've read lots and lots of help, thumbed through dozens of pages of a
> Mathematica 5 manual, and don't know where to start.  I'm trying to help a
> 90-year-old chemistry professor, who is currently using a calculator, but
> there will be dozens of runs of this experiment.
Import is designed to read text or binary data formatted in a standard 
form - e.g. CSV. Clearly your files have an ad-hoc format, so you can't 
expect Mathematica (or anything else) to read them without some effort!

The job is complicated by the fact that you (or the prof) used "_" 
rather than a decimal point, and that the number is joined on to other 
textual data. I am going to assume that the units (C and ns) are the 
same throughout, and can be discarded, and that all the temperatures 
have a decimal part, even if it is 0.

First define a couple of functions:

dataConvert[{{run_}, samples_}] := {run, Map[process, samples]};

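process[line_] := Module[{tmp},
   (* Put a quote before the name, rewrite _84_2C as a closing quote plus
      ,84.2, and drop the ns unit, so that the line becomes the text of a
      list which ToExpression can turn into {name, temperature, time}. *)
   tmp = StringReplace[
     "\"" <> line, {"_" ~~ a : (DigitCharacter ..) ~~ "_" ~~
        b : (DigitCharacter ..) ~~ "C" :>
       "\"," <> a <> "." <> b <> ",", " ns" :> ""}];
   ToExpression["{" <> tmp <> "}"]];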
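
For comparison, here is a minimal sketch of another way to parse one line,
pulling the three fields out directly with StringCases instead of building
a string for ToExpression. The name parseLine is hypothetical, and the
sketch assumes every sample line has exactly the layout shown in the
question (name, "_", digits, "_", digits, "C", whitespace, a number, then
"ns"):

parseLine[line_String] := First[StringCases[line,
    name__ ~~ "_" ~~ a : (DigitCharacter ..) ~~ "_" ~~
      b : (DigitCharacter ..) ~~ "C" ~~ Whitespace ~~
      t : NumberString ~~ Whitespace ~~ "ns" :>
     (* join the two digit groups with a decimal point *)
     {name, ToExpression[a <> "." <> b], ToExpression[t]}]];

parseLine["Tc_Naph_84_2C\t\t\t3.740 ns"] should then give
{"Tc_Naph", 84.2, 3.74}. A line that does not fit the pattern produces an
empty list of matches, so First complains instead of quietly returning
bad data.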

It is best to avoid Import if the data is not in a recognised format, 
and to read it in as strings, discarding the empty lines, and extracting 
the data in the first two lines:

In[4]:= data = ReadList["c:\\maths\\data.dat", String];

In[7]:= data = DeleteCases[data, ""];

In[5]:= fileDate = data[[1]]

Out[5]= "June 7, 2013      "

In[9]:= fileTitle = data[[2]]

Out[9]= "Tc+Naphthalene  vs Temperature (oC)"

Break up the rest by detecting the 'Run' lines:

In[29]:= tmp =
  Partition[SplitBy[data[[3 ;;]], StringMatchQ[#, "Run" ~~ ___] &], 2]

Out[29]= {{{"Run 1"}, {"Tc_Naph_84_2C			3.740 ns",
    "Tc_Naph_87_1C			3.731 ns", "Tc_Naph_89_9C			3.720 ns",
    "Tc_Naph_92_9C			3.704 ns", "Tc_Naph_94_7C			3.687 ns",
    "Tc_Naph_97_6C			3.694 ns"}}, {{"Run 2"}, {"Tc_Naph_83_2C			3.758 \
ns", "Tc_Naph_83_4C			3.750 ns", "Tc_Naph_86_4C			3.728 ns",
    "Tc_Naph_88_1C			3.725 ns", "Tc_Naph_90_2C			3.716 ns",
    "Tc_Naph_93_1C			3.704 ns", "Tc_Naph_94_7C			3.673 ns",
    "Tc_Naph_97_7C			3.684 ns", "Tc_Naph_97_9C			3.665 ns"}}}
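
To see what SplitBy and Partition each contribute, here is a small sketch
on a made-up list (toy and groups are just illustrative names, not part of
the real data):

toy = {"Run 1", "a", "b", "Run 2", "c"};

(* SplitBy starts a new sublist whenever the result of the test changes *)
groups = SplitBy[toy, StringMatchQ[#, "Run" ~~ ___] &]
(* {{"Run 1"}, {"a", "b"}, {"Run 2"}, {"c"}} *)

(* Partition then pairs each header sublist with the samples after it *)
Partition[groups, 2]
(* {{{"Run 1"}, {"a", "b"}}, {{"Run 2"}, {"c"}}} *)

This relies on every 'Run' line being followed by at least one sample
line; two 'Run' lines in a row would land in the same sublist and upset
the pairing.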

Now apply the previous functions to produce a nested list structure of 
strings and real numbers:

In[38]:= Map[dataConvert, tmp]

Out[38]= {{"Run 1", {{"Tc_Naph", 84.2, 3.74}, {"Tc_Naph", 87.1,
     3.731}, {"Tc_Naph", 89.9, 3.72}, {"Tc_Naph", 92.9,
     3.704}, {"Tc_Naph", 94.7, 3.687}, {"Tc_Naph", 97.6,
     3.694}}}, {"Run 2", {{"Tc_Naph", 83.2, 3.758}, {"Tc_Naph", 83.4,
     3.75}, {"Tc_Naph", 86.4, 3.728}, {"Tc_Naph", 88.1,
     3.725}, {"Tc_Naph", 90.2, 3.716}, {"Tc_Naph", 93.1,
     3.704}, {"Tc_Naph", 94.7, 3.673}, {"Tc_Naph", 97.7,
     3.684}, {"Tc_Naph", 97.9, 3.665}}}}
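
From here, getting the pairs asked for in the question is a short step.
A minimal sketch, assuming the result above has been saved under the
hypothetical name runs (only the first two samples of each run are
repeated here) and that the temperatures are in degrees Celsius, so that
adding 273.15 gives the absolute temperature in kelvin:

runs = {{"Run 1", {{"Tc_Naph", 84.2, 3.74}, {"Tc_Naph", 87.1, 3.731}}},
   {"Run 2", {{"Tc_Naph", 83.2, 3.758}, {"Tc_Naph", 83.4, 3.75}}}};

(* turn each {name, tempC, time} into {1/T, time} with T in kelvin *)
toPair[{_, tempC_, time_}] := {1/(tempC + 273.15), time};

(* one list of pairs per run, in the order the runs appear *)
{run1, run2} = Map[toPair, runs[[All, 2]], {2}];

run1 and run2 are then lists of {1/T, time} pairs that can be passed to
ListPlot or Fit.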

Clearly it is best in future to record data in an easier format - 
whatever language you use to process it!

You may want to look up StringReplace and StringExpression to understand 
the above, and to help you with other problems of this type.

David Bailey
