MathGroup Archive 2007

Re: XML data structure parsing in Mathematica 6 using patterns

  • To: mathgroup at smc.vnet.net
  • Subject: [mg81559] Re: [mg81502] XML data structure parsing in Mathematica 6 using patterns
  • From: DrMajorBob <drmajorbob at bigfoot.com>
  • Date: Wed, 26 Sep 2007 21:53:51 -0400 (EDT)
  • References: <21153656.1190833488110.JavaMail.root@m35>
  • Reply-to: drmajorbob at bigfoot.com

I can't WAIT to hear someone explain in what way the documentation on this  
is totally clear!

Until then, I can only observe that Help gives no examples in which (as in
your attempts) PatternSequence is the left-hand side of a rule, or a
pattern (in itself) to be matched. There are only examples where
PatternSequence is enclosed in List or f. (Those are the only supported
heads, if I'm judging from Help alone, but I suppose f means "any head".)

Generalizing to your problem, however... try this:

p = {a___,
     PatternSequence[XMLElement["start-valid-time", _, {startT_}],
      XMLElement["end-valid-time", _, {endT_}]],
     b___} :> {a, {startT, endT}, b};
timeBlock //. p

XMLElement["time-layout", {"time-coordinate" -> "local",
   "summarization" -> "none"}, {XMLElement[
    "layout-key", {}, {"k-p12h-n14-1"}], {"2007-09-21T20:00:00-04:00",
    "2007-09-22T08:00:00-04:00"}, {"2007-09-22T08:00:00-04:00",
    "2007-09-22T20:00:00-04:00"}, {"2007-09-22T20:00:00-04:00",
    "2007-09-23T08:00:00-04:00"}, {"2007-09-23T08:00:00-04:00",
    "2007-09-23T20:00:00-04:00"}, {"2007-09-23T20:00:00-04:00",
    "2007-09-24T08:00:00-04:00"}, {"2007-09-24T08:00:00-04:00",
    "2007-09-24T20:00:00-04:00"}}]

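A side note, offered as an assumption rather than something Help says in
these words: inside a List the PatternSequence wrapper should be optional,
because patterns written one after another in an argument list already
match consecutive elements. A minimal sketch of the same rule without the
wrapper (p2 is only an illustrative name):

p2 = {a___,
     (* two consecutive elements, written directly in the list pattern *)
     XMLElement["start-valid-time", _, {startT_}],
     XMLElement["end-valid-time", _, {endT_}],
     b___} :> {a, {startT, endT}, b};
timeBlock //. p2

If that holds, PatternSequence earns its keep mainly when the sequence has
to be named, repeated, or used as one alternative among several.
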
You wanted ONLY the time-pairs, which may require something like

p = {___, a : ({_String, _String} ...),
     PatternSequence[XMLElement["start-valid-time", _, {startT_}],
      XMLElement["end-valid-time", _, {endT_}]],
     b___} :> {a, {startT, endT}, b};
timeBlock[[3]] //. p

{{"2007-09-21T20:00:00-04:00",
   "2007-09-22T08:00:00-04:00"}, {"2007-09-22T08:00:00-04:00",
   "2007-09-22T20:00:00-04:00"}, {"2007-09-22T20:00:00-04:00",
   "2007-09-23T08:00:00-04:00"}, {"2007-09-23T08:00:00-04:00",
   "2007-09-23T20:00:00-04:00"}, {"2007-09-23T20:00:00-04:00",
   "2007-09-24T08:00:00-04:00"}, {"2007-09-24T08:00:00-04:00",
   "2007-09-24T20:00:00-04:00"}}

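Another possibility, sketched here as a suggestion and not as the method
used above: ReplaceList returns the results of every way a rule can match,
so a single list pattern can collect all adjacent start/end pairs in one
pass, with no repeated replacement and no PatternSequence.

(* every place where a start element is immediately followed by an end
   element yields one {startT, endT} pair *)
ReplaceList[timeBlock[[3]],
 {___,
   XMLElement["start-valid-time", _, {startT_}],
   XMLElement["end-valid-time", _, {endT_}],
   ___} :> {startT, endT}]

This assumes, as in the data shown, that each start element is directly
followed by its matching end element within timeBlock[[3]].
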
All this seems to mean that PatternSequence[p1,p2] is not a pattern on its
own, since it stands for a sequence of arguments and so can't match any
single expression. But anyhead[PatternSequence[p1,p2]] IS a pattern.
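
A quick way to test that reading is to compare a bare PatternSequence with
the same sequence wrapped in a head. The inputs {1, 2} and f[1, 2] are
made up for illustration; if the observation holds, the first test should
fail while the other two succeed.

(* PatternSequence on its own, tested against one expression *)
MatchQ[{1, 2}, PatternSequence[_, _]]

(* the same sequence inside a head: List, then an arbitrary f *)
MatchQ[{1, 2}, {PatternSequence[_, _]}]
MatchQ[f[1, 2], f[PatternSequence[_, _]]]

That would also explain why the Cases attempt finds nothing: Cases tests
subexpressions one at a time, and no single subexpression is a sequence
of two elements.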

Bobby

On Wed, 26 Sep 2007 05:38:18 -0500, Daniel Flatin <dflatin at rcn.com> wrote:

> This is my third attempt at posting this message. Apologies if somehow
> the first two got through. I saw no sign of them, however.
>
> I have an XML data structure where I want to extract the start and end
> times, for example:
>
> timeBlock = XMLElement["time-layout", {"time-coordinate" -> "local",
>   "summarization" -> "none"}, {XMLElement[
>    "layout-key", {}, {"k-p12h-n14-1"}],
>   XMLElement["start-valid-time", {}, {"2007-09-21T20:00:00-04:00"}],
>   XMLElement["end-valid-time", {}, {"2007-09-22T08:00:00-04:00"}],
>   XMLElement["start-valid-time", {}, {"2007-09-22T08:00:00-04:00"}],
>   XMLElement["end-valid-time", {}, {"2007-09-22T20:00:00-04:00"}],
>   XMLElement["start-valid-time", {}, {"2007-09-22T20:00:00-04:00"}],
>   XMLElement["end-valid-time", {}, {"2007-09-23T08:00:00-04:00"}],
>   XMLElement["start-valid-time", {}, {"2007-09-23T08:00:00-04:00"}],
>   XMLElement["end-valid-time", {}, {"2007-09-23T20:00:00-04:00"}],
>   XMLElement["start-valid-time", {}, {"2007-09-23T20:00:00-04:00"}],
>   XMLElement["end-valid-time", {}, {"2007-09-24T08:00:00-04:00"}],
>   XMLElement["start-valid-time", {}, {"2007-09-24T08:00:00-04:00"}],
>   XMLElement["end-valid-time", {}, {"2007-09-24T20:00:00-04:00"}]}]
>
> I can do this by finding all start times and all end times and then
> combining them, as in
>
> getTimeSequence[timeBlock_] :=
>  Module[{startBlock, endBlock, startT, endT},
>   startBlock = Cases[timeBlock,
>     XMLElement["start-valid-time", _, {startT_}] :> startT,
>     Infinity];
>   endBlock = Cases[timeBlock,
>     XMLElement["end-valid-time", _, {endT_}] :> endT,
>     Infinity];
>   Transpose[{startBlock, endBlock}]
>   ]
>
> Philosophically, I think I should be able to capture the start and end
> times with a single pattern, but I can't make it work. For example I
> have tried:
>
> getTimeStartStopPairs[timeBlock_] :=
>  Module[{startBlock, endBlock, startT, endT},
>   Cases[
>    timeBlock,
>    PatternSequence[XMLElement["start-valid-time", _, {startT_}],
>      XMLElement["end-valid-time", _, {endT_}]] :> {startT, endT},
>    Infinity
>    ]
>   ]
>
> Does anyone have any suggestions? I would like to learn how to do this
> with just one pattern and I feel like I am misinterpreting how
> PatternSequence works.
>
> Thanks,
> Dan

-- 

DrMajorBob at bigfoot.com

