Re: Pattern matching
- To: mathgroup at smc.vnet.net
- Subject: [mg33965] Re: [mg33912] Pattern matching
- From: BobHanlon at aol.com
- Date: Wed, 24 Apr 2002 01:23:02 -0400 (EDT)
- Sender: owner-wri-mathgroup at wolfram.com
In a message dated 4/23/02 9:39:43 AM, leary at paradise.net.nz writes:

>Can you help me please - there must be a simple solution to this problem,
>but I can't find it.
>
>From a list of character strings and a list of templates, I need to
>produce a list of all strings that match any of the templates. For example:
>
>listData={"18K0F3C--" , "2K40GXX--" , "400HGXX--" , "5M00G1F--" , "960KG1D--"}
>listTemplates={"???H?????" , "???K?????"}
>result={"400HGXX--","960KG1D--"}
>
>In the templates, ? is a wild-card that represents a single character.
>The data strings contain only alpha-numeric characters and hyphens - no
>other characters.
>There are no special requirements for the result: duplication and random
>order are acceptable.
>
>I searched the MathGroup archive and found a very useful function that does
>exactly what I want, but it works only on individual strings, not lists of
>strings (msg00051):
>
>QMMatchQ[s_String, p_String] := MatchQ[Characters[s], Characters[p] /. "?" -> _ ]
>
>I tried to use it in the following way, but the result is a list of the
>matching templates, not the matching strings:
>
>QMMatchQ[s_String, p_String] := MatchQ[Characters[s], Characters[p] /. "?" -> _ ]
>SetOptions[Intersection, SameTest -> (QMMatchQ[#1,#2]& )];
>result=Intersection[listData,listTemplates]
>{"???H?????","???K?????"}
>
>It ought to be a small step from there to the result that I need, but I
>can't find a simple solution.
>
>One alternative approach would be a Do loop:
>
>b={};
>Do[b=Append[b,Select[listData,QMMatchQ[#,listTemplates[[n]]]&]],{n,1,Length[listTemplates]}]
>
>This works but seems to be very slow for large lists. In the real case,
>listData can be very large - up to 250,000 elements - and the Do loop
>approach doesn't seem to be optimum.

listData={"18K0F3C--","2K40GXX--",
    "400HGXX--","5M00G1F--","960KG1D--"};

listTemplates={"???H?????","???K?????"};

Clear[QMMatchQ];

(* True if s matches at least one template in the list;
   each "?" in a template stands for any single character *)
QMMatchQ[s_String,{p__String}]:=
    Or @@ (MatchQ[Characters[s],
          Characters[#]/."?"->_]& /@ {p});

(* keep the strings that match any template *)
Select[listData, QMMatchQ[#,listTemplates]&]

{"400HGXX--", "960KG1D--"}

or

Cases[listData, _?(QMMatchQ[#,listTemplates]&)]

{"400HGXX--", "960KG1D--"}

Bob Hanlon
Chantilly, VA USA