XML parsing with patterns
- To: mathgroup at smc.vnet.net
- Subject: [mg81446] XML parsing with patterns
- From: Daniel Flatin <dflatin at rcn.com>
- Date: Sun, 23 Sep 2007 21:15:36 -0400 (EDT)
I have an XML data structure from which I want to extract the start and
end times. For example:
timeBlock = XMLElement["time-layout",
  {"time-coordinate" -> "local", "summarization" -> "none"},
  {XMLElement["layout-key", {}, {"k-p12h-n14-1"}],
   XMLElement["start-valid-time", {}, {"2007-09-21T20:00:00-04:00"}],
   XMLElement["end-valid-time", {}, {"2007-09-22T08:00:00-04:00"}],
   XMLElement["start-valid-time", {}, {"2007-09-22T08:00:00-04:00"}],
   XMLElement["end-valid-time", {}, {"2007-09-22T20:00:00-04:00"}],
   XMLElement["start-valid-time", {}, {"2007-09-22T20:00:00-04:00"}],
   XMLElement["end-valid-time", {}, {"2007-09-23T08:00:00-04:00"}],
   XMLElement["start-valid-time", {}, {"2007-09-23T08:00:00-04:00"}],
   XMLElement["end-valid-time", {}, {"2007-09-23T20:00:00-04:00"}],
   XMLElement["start-valid-time", {}, {"2007-09-23T20:00:00-04:00"}],
   XMLElement["end-valid-time", {}, {"2007-09-24T08:00:00-04:00"}],
   XMLElement["start-valid-time", {}, {"2007-09-24T08:00:00-04:00"}],
   XMLElement["end-valid-time", {}, {"2007-09-24T20:00:00-04:00"}]}]
I can do this by finding all start times and all end times and then
combining them, as in
getTimeSequence[timeBlock_] :=
  Module[{startBlock, endBlock, startT, endT},
    startBlock = Cases[timeBlock,
      XMLElement["start-valid-time", _, {startT_}] :> startT,
      Infinity];
    endBlock = Cases[timeBlock,
      XMLElement["end-valid-time", _, {endT_}] :> endT,
      Infinity];
    Transpose[{startBlock, endBlock}]
  ]
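As a quick check of this two-pass version (only a sketch, assuming
timeBlock and getTimeSequence are defined exactly as above; the name
pairs is just for illustration), I get one {start, end} pair per
interval:

    pairs = getTimeSequence[timeBlock];
    Length[pairs]   (* 6 *)
    First[pairs]    (* {"2007-09-21T20:00:00-04:00", "2007-09-22T08:00:00-04:00"} *)

Note that this relies on the start and end elements occurring in equal
numbers and in matching order, since Transpose simply pairs the two
lists up by position.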
Philosophically, I think I should be able to capture the start and end
times with a single pattern, but I can't make it work. For example, I
have tried:
getTimeStartStopPairs[timeBlock_] :=
  Module[{startBlock, endBlock, startT, endT},
    Cases[
      timeBlock,
      PatternSequence[XMLElement["start-valid-time", _, {startT_}],
        XMLElement["end-valid-time", _, {endT_}]] :> {startT, endT},
      Infinity
    ]
  ]
Does anyone have any suggestions? I feel like I am misinterpreting how
PatternSequence works.
Thanks,
Dan