MathGroup Archive 2002

Re: Pattern matching

  • To: mathgroup at smc.vnet.net
  • Subject: [mg33951] Re: Pattern matching
  • From: Jens-Peer Kuska <kuska at informatik.uni-leipzig.de>
  • Date: Wed, 24 Apr 2002 01:22:07 -0400 (EDT)
  • Organization: Universitaet Leipzig
  • References: <aa3fq4$7tv$1@smc.vnet.net>
  • Reply-to: kuska at informatik.uni-leipzig.de
  • Sender: owner-wri-mathgroup at wolfram.com

Hi,

> listData={"18K0F3C--" , "2K40GXX--" , "400HGXX--" , "5M00G1F--" , "960KG1D--"}
> listTemplates={"???H?????" , "???K?????"}
> result={"400HGXX--","960KG1D--"}
> 
> In the templates, ? is a wild-card that represents a single character.
> The data strings contain only alpha-numeric characters and hyphens - no
> other characters.
> There are no special requirements for the result:  duplication and random
> order are acceptable.
> 
> I searched the MathGroup archive and found a very useful function that does
> exactly what I want, but it works only on individual strings, not lists of
> strings (msg00051):
> 
> QMMatchQ[s_String, p_String] :=
>   MatchQ[Characters[s], Characters[p] /. "?" -> _]
> 
> I tried to use it in the following way, but the result is a list of the
> matching templates, not the matching strings :
> 
> QMMatchQ[s_String, p_String] :=
>   MatchQ[Characters[s], Characters[p] /. "?" -> _]
> SetOptions[Intersection, SameTest -> (QMMatchQ[#1,#2]& )];
> result=Intersection[listData,listTemplates]
> {"???H?????","???K?????"}
> 

That's not what I see on my SGI (Mathematica 4.1); there I get

{"400HGXX--", "960KG1D--"}

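But in principle an intersection can't be asymmetric. QMMatchQ treats its
two arguments differently (data string first, template second), so the
result may depend on the order in which Intersection happens to compare
the elements.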
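
The asymmetry is easy to see by swapping the arguments. A minimal check,
assuming the QMMatchQ definition quoted above:

```mathematica
QMMatchQ[s_String, p_String] :=
  MatchQ[Characters[s], Characters[p] /. "?" -> _]

(* data string first, template second: each "?" becomes a blank *)
QMMatchQ["400HGXX--", "???H?????"]   (* True *)

(* arguments swapped: the literal "?" characters cannot match "4", "0", ... *)
QMMatchQ["???H?????", "400HGXX--"]   (* False *)
```

A test used as SameTest would normally give the same answer in both
directions; this one does not.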

> It ought to be a small step from there to the result that I need, but I
> can't find a simple solution.
> 
> One alternative approach would be a Do loop:
> 
> b={};
> Do[b=Append[b,Select[listData,QMMatchQ[#,listTemplates[[n]]]&]],{n,1,Length[listTemplates]}]

Here is a solution without an explicit loop:


Join @@ (Select[listData, 
          Function[{elem}, QMMatchQ[elem, #]]] & /@ listTemplates)


> 
> This works but seems to be very slow for large lists.  In the real case,
> listData can be very large - up to 250,000 elements - and the Do loop
> approach doesn't seem to be optimum.

I'm not sure that it is the Do[] loop that slows down the pattern matching :-)
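
If the repeated conversion of the templates is a concern, one option is to
turn the templates into patterns once and make a single pass over the data.
This is only a sketch and has not been timed on a list of 250,000 elements;
the name templatePattern is made up for illustration. Note that a string
matching several templates appears once in the result, not once per template.

```mathematica
listData = {"18K0F3C--", "2K40GXX--", "400HGXX--", "5M00G1F--", "960KG1D--"};
listTemplates = {"???H?????", "???K?????"};

(* convert every template to a list pattern once, combined with Alternatives *)
templatePattern =
  Alternatives @@ ((Characters[#] /. "?" -> _) & /@ listTemplates);

(* single pass: keep each data string whose characters match any template *)
Select[listData, MatchQ[Characters[#], templatePattern] &]
(* {"400HGXX--", "960KG1D--"} *)
```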

Regards
  Jens
