XML data structure parsing in Mathematica 6 using patterns
- To: mathgroup at smc.vnet.net
- Subject: [mg81502] XML data structure parsing in Mathematica 6 using patterns
- From: Daniel Flatin <dflatin at rcn.com>
- Date: Wed, 26 Sep 2007 06:38:18 -0400 (EDT)
This is my third attempt at posting this message. Apologies if somehow
the first two got through. I saw no sign of them, however.
I have an XML data structure from which I want to extract the start and
end times, for example:
timeBlock = XMLElement["time-layout",
  {"time-coordinate" -> "local", "summarization" -> "none"},
  {XMLElement["layout-key", {}, {"k-p12h-n14-1"}],
   XMLElement["start-valid-time", {}, {"2007-09-21T20:00:00-04:00"}],
   XMLElement["end-valid-time", {}, {"2007-09-22T08:00:00-04:00"}],
   XMLElement["start-valid-time", {}, {"2007-09-22T08:00:00-04:00"}],
   XMLElement["end-valid-time", {}, {"2007-09-22T20:00:00-04:00"}],
   XMLElement["start-valid-time", {}, {"2007-09-22T20:00:00-04:00"}],
   XMLElement["end-valid-time", {}, {"2007-09-23T08:00:00-04:00"}],
   XMLElement["start-valid-time", {}, {"2007-09-23T08:00:00-04:00"}],
   XMLElement["end-valid-time", {}, {"2007-09-23T20:00:00-04:00"}],
   XMLElement["start-valid-time", {}, {"2007-09-23T20:00:00-04:00"}],
   XMLElement["end-valid-time", {}, {"2007-09-24T08:00:00-04:00"}],
   XMLElement["start-valid-time", {}, {"2007-09-24T08:00:00-04:00"}],
   XMLElement["end-valid-time", {}, {"2007-09-24T20:00:00-04:00"}]}]
I can do this by finding all start times and all end times separately
and then pairing them up, as in:
getTimeSequence[timeBlock_] :=
 Module[{startBlock, endBlock, startT, endT},
  startBlock = Cases[timeBlock,
    XMLElement["start-valid-time", _, {startT_}] :> startT,
    Infinity];
  endBlock = Cases[timeBlock,
    XMLElement["end-valid-time", _, {endT_}] :> endT,
    Infinity];
  Transpose[{startBlock, endBlock}]
 ]
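As a quick check of the two-pass approach, the sketch below runs the same Cases and Transpose steps on a shortened block. The names smallBlock, starts, and ends are made up for illustration and are not part of the original data; the attributes of the outer element are left empty for brevity.

```mathematica
(* smallBlock is a hypothetical, shortened stand-in for timeBlock *)
smallBlock = XMLElement["time-layout", {}, {
    XMLElement["layout-key", {}, {"k-p12h-n14-1"}],
    XMLElement["start-valid-time", {}, {"2007-09-21T20:00:00-04:00"}],
    XMLElement["end-valid-time", {}, {"2007-09-22T08:00:00-04:00"}],
    XMLElement["start-valid-time", {}, {"2007-09-22T08:00:00-04:00"}],
    XMLElement["end-valid-time", {}, {"2007-09-22T20:00:00-04:00"}]}];

(* First pass: collect every start time; second pass: every end time *)
starts = Cases[smallBlock,
   XMLElement["start-valid-time", _, {t_}] :> t, Infinity];
ends = Cases[smallBlock,
   XMLElement["end-valid-time", _, {t_}] :> t, Infinity];

(* Pair the i-th start with the i-th end *)
Transpose[{starts, ends}]
(* {{"2007-09-21T20:00:00-04:00", "2007-09-22T08:00:00-04:00"},
    {"2007-09-22T08:00:00-04:00", "2007-09-22T20:00:00-04:00"}} *)
```

Note that the pairing relies only on position: it assumes the block contains equally many start and end elements, in matching order.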
Philosophically, I think I should be able to capture each start time
together with its end time using a single pattern, but I can't make it
work. For example, I have tried:
getTimeStartStopPairs[timeBlock_] :=
 Module[{startBlock, endBlock, startT, endT},
  Cases[
   timeBlock,
   PatternSequence[XMLElement["start-valid-time", _, {startT_}],
     XMLElement["end-valid-time", _, {endT_}]] :> {startT, endT},
   Infinity
  ]
 ]
Does anyone have any suggestions? I would like to learn how to do this
with just one pattern and I feel like I am misinterpreting how
PatternSequence works.
Thanks,
Dan
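
One possible direction for the question above, offered as a sketch under stated assumptions rather than a confirmed answer: Cases tests each subexpression on its own, so a two-element PatternSequence has no single element it can match. A sequence pattern needs to sit inside the argument list of an enclosing pattern, for example a list pattern padded with ___ on both sides. The sketch below applies such a pattern with ReplaceList to the list of child elements. The names smallBlock, children, and pairRule are made up for illustration.

```mathematica
(* smallBlock is a hypothetical, shortened stand-in for timeBlock *)
smallBlock = XMLElement["time-layout", {}, {
    XMLElement["layout-key", {}, {"k-p12h-n14-1"}],
    XMLElement["start-valid-time", {}, {"2007-09-21T20:00:00-04:00"}],
    XMLElement["end-valid-time", {}, {"2007-09-22T08:00:00-04:00"}],
    XMLElement["start-valid-time", {}, {"2007-09-22T08:00:00-04:00"}],
    XMLElement["end-valid-time", {}, {"2007-09-22T20:00:00-04:00"}]}];

(* The child elements are the third argument of the outer XMLElement *)
children = smallBlock[[3]];

(* pairRule (hypothetical name): a start element immediately followed by
   an end element, anywhere in the list. Inside a list pattern,
   PatternSequence[p1, p2] is equivalent to writing p1, p2 directly. *)
pairRule = {___,
    PatternSequence[
     XMLElement["start-valid-time", _, {s_}],
     XMLElement["end-valid-time", _, {e_}]],
    ___} :> {s, e};

(* ReplaceList returns one result for every way the pattern can match *)
ReplaceList[children, pairRule]
(* {{"2007-09-21T20:00:00-04:00", "2007-09-22T08:00:00-04:00"},
    {"2007-09-22T08:00:00-04:00", "2007-09-22T20:00:00-04:00"}} *)
```

Because the pattern requires a start element directly followed by an end element, an end element followed by the next start element is not reported as a pair. This sketch assumes the start/end elements are direct children of a single list; a deeper or differently nested structure would need the child list located first.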