XML data structure parsing in Mathematica 6 using patterns
- To: mathgroup at smc.vnet.net
- Subject: [mg81502] XML data structure parsing in Mathematica 6 using patterns
- From: Daniel Flatin <dflatin at rcn.com>
- Date: Wed, 26 Sep 2007 06:38:18 -0400 (EDT)
This is my third attempt at posting this message. Apologies if somehow the first two got through. I saw no sign of them, however.

I have an XML data structure where I want to extract the start and end times, for example:

timeBlock = XMLElement["time-layout",
  {"time-coordinate" -> "local", "summarization" -> "none"},
  {XMLElement["layout-key", {}, {"k-p12h-n14-1"}],
   XMLElement["start-valid-time", {}, {"2007-09-21T20:00:00-04:00"}],
   XMLElement["end-valid-time", {}, {"2007-09-22T08:00:00-04:00"}],
   XMLElement["start-valid-time", {}, {"2007-09-22T08:00:00-04:00"}],
   XMLElement["end-valid-time", {}, {"2007-09-22T20:00:00-04:00"}],
   XMLElement["start-valid-time", {}, {"2007-09-22T20:00:00-04:00"}],
   XMLElement["end-valid-time", {}, {"2007-09-23T08:00:00-04:00"}],
   XMLElement["start-valid-time", {}, {"2007-09-23T08:00:00-04:00"}],
   XMLElement["end-valid-time", {}, {"2007-09-23T20:00:00-04:00"}],
   XMLElement["start-valid-time", {}, {"2007-09-23T20:00:00-04:00"}],
   XMLElement["end-valid-time", {}, {"2007-09-24T08:00:00-04:00"}],
   XMLElement["start-valid-time", {}, {"2007-09-24T08:00:00-04:00"}],
   XMLElement["end-valid-time", {}, {"2007-09-24T20:00:00-04:00"}]}]

I can do this by finding all start times and all end times and then combining them, as in:

getTimeSequence[timeBlock_] :=
  Module[{startBlock, endBlock, startT, endT},
    startBlock = Cases[timeBlock,
      XMLElement["start-valid-time", _, {startT_}] :> startT, Infinity];
    endBlock = Cases[timeBlock,
      XMLElement["end-valid-time", _, {endT_}] :> endT, Infinity];
    Transpose[{startBlock, endBlock}]
  ]

Philosophically, I think I should be able to capture the start and end times with a single pattern, but I can't make it work. For example I have tried:

getTimeStartStopPairs[timeBlock_] :=
  Module[{startBlock, endBlock, startT, endT},
    Cases[
      timeBlock,
      PatternSequence[
        XMLElement["start-valid-time", _, {startT_}],
        XMLElement["end-valid-time", _, {endT_}]] :> {startT, endT},
      Infinity
    ]
  ]

Does anyone have any suggestions?
I would like to learn how to do this with just one pattern, and I feel like I am misinterpreting how PatternSequence works.

Thanks,
Dan
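
One possible direction, offered as a sketch rather than a confirmed answer: Cases tests each subexpression on its own, and PatternSequence stands for a run of several arguments, so no single subexpression can match a two-element sequence. The pair can instead be matched where it actually sits, inside the child list of the enclosing element, and ReplaceList can then return every way the pattern fits. The names getTimePairsSketch and sample below are hypothetical, the sample is abbreviated from the data in the message, and the sketch assumes the "time-layout" element is the expression passed in (it does not search nested levels).

```mathematica
(* Abbreviated sample: the first two start/end pairs from the message. *)
sample = XMLElement["time-layout",
   {"time-coordinate" -> "local", "summarization" -> "none"},
   {XMLElement["layout-key", {}, {"k-p12h-n14-1"}],
    XMLElement["start-valid-time", {}, {"2007-09-21T20:00:00-04:00"}],
    XMLElement["end-valid-time", {}, {"2007-09-22T08:00:00-04:00"}],
    XMLElement["start-valid-time", {}, {"2007-09-22T08:00:00-04:00"}],
    XMLElement["end-valid-time", {}, {"2007-09-22T20:00:00-04:00"}]}];

(* Hypothetical helper: one pattern that matches a start element
   immediately followed by an end element anywhere in the child list.
   ReplaceList returns one result per possible match of the whole
   expression, so each adjacent start/end pair yields one {start, end}. *)
getTimePairsSketch[timeBlock_] :=
  ReplaceList[timeBlock,
   XMLElement[_, _, {___,
      XMLElement["start-valid-time", _, {startT_}],
      XMLElement["end-valid-time", _, {endT_}],
      ___}] :> {startT, endT}]

getTimePairsSketch[sample]
(* {{"2007-09-21T20:00:00-04:00", "2007-09-22T08:00:00-04:00"},
    {"2007-09-22T08:00:00-04:00", "2007-09-22T20:00:00-04:00"}} *)
```

Inside the list, the two consecutive XMLElement patterns already form the sequence, so wrapping them in PatternSequence should be optional here; it would matter mainly if the pair needed a name or a repetition count.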