Re: parsing a string
- To: mathgroup at smc.vnet.net
- Subject: [mg119849] Re: parsing a string
- From: Bill Rowe <readnews at sbcglobal.net>
- Date: Sat, 25 Jun 2011 05:29:04 -0400 (EDT)
On 6/24/11 at 7:47 AM, tsariysk at craft-tech.com (Ted Sariyski) wrote:

>Hi, I import a text file with Import[filename,"Table"]. The file has
>a header followed by data. The header contains predefined keywords
>like TITLE, ZONE, VARIABLES, etc. A VARIABLES line, shown below, is
>a list of 'varname,varunits' pairs, which I need to extract as
>pairs.

>{{VARIABLES,=,'um','I_p,(W/sr/um)','I_a,(W/sr/um)','I_ae,(W/sr/um)',
>'m','I_p,(W/sr/m)','I_a,(W/sr/m)','I_ae,(W/sr/m)'}}

>I tried StringSplit[varList,"'"] but got Out[]:
>{{um,,,I_p},{(W/sr/um),,,I_a},{(W/sr/um),,,I_ae},...}}, which is
>wrong.

It looks like the line starting with {{VARIABLES that you posted is the result of the Import function, not the actual text in the file you want to parse. So I copied and pasted from your post and used Import to create the same list I think you got when doing Import, i.e.,

In[6]:= vars =
 Import[StringToStream[
   "VARIABLES,=,'um','I_p,(W/sr/um)','I_a,(W/sr/um)','I_ae,(W/sr/um)',\
'm','I_p,(W/sr/m)','I_a,(W/sr/m)','I_ae,(W/sr/m)'"], "CSV"]

Out[6]= {{"VARIABLES", "=", "'um'", "'I_p", "(W/sr/um)'", "'I_a",
  "(W/sr/um)'", "'I_ae", "(W/sr/um)'", "'m'", "'I_p", "(W/sr/m)'",
  "'I_a", "(W/sr/m)'", "'I_ae", "(W/sr/m)'"}}

Assuming I have correctly understood your post, StringSplit isn't the right tool, since what you have is a list of strings rather than a single string. This will work:

In[7]:= StringReplace[vars[[1, 3 ;;]], "'" -> ""]

Out[7]= {um,I_p,(W/sr/um),I_a,(W/sr/um),I_ae,(W/sr/um),m,I_p,(W/sr/m),I_a,(W/sr/m),I_ae,(W/sr/m)}

But if what you posted was a single string, then this does the trick:

In[9]:= StringSplit[
  StringReplace[
   "VARIABLES,=,'um','I_p,(W/sr/um)','I_a,(W/sr/um)','I_ae,(W/sr/um)',\
'm','I_p,(W/sr/m)','I_a,(W/sr/m)','I_ae,(W/sr/m)'", "'" -> ""],
  ","][[3 ;;]]

Out[9]= {um,I_p,(W/sr/um),I_a,(W/sr/um),I_ae,(W/sr/um),m,I_p,(W/sr/m),I_a,(W/sr/m),I_ae,(W/sr/m)}

But I do wonder whether either of these is truly correct.
Except for the first um, it looks like each entry is a variable followed by the units for that variable, i.e., I_p appears to be a variable in units of watts/steradian/micrometer. If this interpretation is correct, the first um is most likely units of micrometers, and the actual variable name is missing.

Quite frankly, if I were reading in data from a text file where each tag is followed by all of its data on the same line, I would not use Import. Instead I would use FindList and then parse the strings returned by FindList. This will execute faster since, unlike Import, FindList makes no attempt to interpret or parse what is being read in.

I would also recommend using StringReplace to delete the underscore character. The underscore has a built-in meaning in Mathematica (it denotes a pattern blank) and will cause problems if you use ToExpression to change the strings to symbols.
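
A minimal sketch of the FindList approach described above, under some assumptions: the file name and the TITLE line are made up for illustration (only the VARIABLES line comes from the original post), and each quoted field is assumed to contain at most one comma. It reads the tagged line as a raw string, pulls out the quoted fields, splits each into a name and its units, and deletes the underscores.

```mathematica
(* Hypothetical sample file: only the VARIABLES line is from the post;
   the file name and the TITLE line are invented for this example. *)
file = FileNameJoin[{$TemporaryDirectory, "vars_example.txt"}];
Export[file,
  "TITLE = example\n" <>
  "VARIABLES,=,'um','I_p,(W/sr/um)','I_a,(W/sr/um)','I_ae,(W/sr/um)'," <>
  "'m','I_p,(W/sr/m)','I_a,(W/sr/m)','I_ae,(W/sr/m)'\n", "Text"];

(* FindList returns the lines containing the tag as plain strings *)
line = First[FindList[file, "VARIABLES"]];

(* A field is whatever sits between a pair of apostrophes *)
fields = StringCases[line, "'" ~~ Shortest[f__] ~~ "'" :> f];

(* Split each field at the comma: {name, units}, or {x} if there is no comma *)
pairs = StringSplit[#, ","] & /@ fields;

(* Delete underscores from every string before any use of ToExpression *)
clean = Map[StringReplace[#, "_" -> ""] &, pairs, {2}]
```

The entries that come back with a single element (um and m) are the ones with no name/units pair, which is the ambiguity noted above. Also note that once the underscores are deleted, the same names appear in both the um group and the m group, so they would need to be told apart some other way before being turned into symbols.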
- Follow-Ups:
- Re: Improt vs Get
- From: Peter Breitfeld <phbrf@t-online.de>
- Re: Improt vs Get
- From: "Scot T. Martin" <smartin@seas.harvard.edu>
- Re: Improt vs Get
- From: DrMajorBob <btreat1@austin.rr.com>
- BinaryRead question
- From: Ted Sariyski <tsariysk@craft-tech.com>
- Re: Improt vs Get