Re: A question about pattern matching

*To*: mathgroup at smc.vnet.net*Subject*: [mg21871] Re: [mg21837] A question about pattern matching*From*: Bojan Bistrovic <bojanb at physics.odu.edu>*Date*: Wed, 2 Feb 2000 22:54:28 -0500 (EST)*Sender*: owner-wri-mathgroup at wolfram.com

> Here's a question about pattern matching: > > When one has to work with unknown number of arguments, it's necessary to > use double or triple Blanks; now, here's an example in pattern matching: > > In[1]:= ReplaceList[ f[a,b,c,MyRule1->True,MyRule2->True,MyRule3->True], > f[aaa___, bbb___Rule] -> {{aaa},{bbb}}] > > Out[1]= {{{a,b,c},{MyRule1 -> True,MyRule2 -> True,MyRule3 -> True}}, > {{a,b,c, MyRule1 -> True},{MyRule2 -> True,MyRule3 -> True}}, > {{a,b,c, MyRule1 -> True,MyRule2 -> True},{MyRule3 -> True}}, > {{a,b,c, MyRule1 -> True,MyRule2 -> True, MyRule3 -> True},{}}} > > Now, when I started using them some time ago, it looked so logical that > in the example above the only match should be > > {{{a,b,c},{MyRule1 -> True,MyRule2 -> True,MyRule3 -> True}}} > > that I never bothered to check the manual. After a few weeks of infinite > loops I realized the error, so now I have to use something like > > In[2]:= ReplaceList[f[a,b,c,MyRule1->True,MyRule2->True,MyRule3->True], > f[aaa___ /; FreeQ[{aaa},_Rule,1], bbb___Rule] -> {{aaa},{bbb}}] > > Out[2]= {{{a,b,c},{MyRule1 -> True,MyRule2 -> True,MyRule3 -> True}}} > > to achieve the same effect. I realize that the present behavior of > Mathematica isn't completely nonlogical, and that one might want this > behavior sometimes, but it seems to me that it would be more logical to > have the behavior I just presented. The question is: what do you think > would be the more logical behavior: the one just described or the one > that's currently implemented? I propose a vote; post your answer or email it > directly to me and after a while, I'll post the results of the vote to > the group. [Please send your responses to the author - moderator.] > > Bojan > First of all, let me start by thanking everybody who took the trouble to answer my previous posting on this subject. I see that I haven't made my thoughts clear enough on this question, so I'll try to explain better. First, I'm using ReplaceList just for demonstrating purposes. The real question is how are patterns being matched. A few people have answered this post by saying that it's perfectly logical and well documented that if you have a set {a1,a2,a3,b1->b2,b3->b4, b5->b6} there's no requirement that AAA___ shouldn't match objects with the head Rule. I agree, it's perfectly legal and it's documented; it's just that from my viewpoint, if you're using a construct with two patterns AAA___,BBB___Rule, you WANT all the rules to be assigned to pattern BBB and all the other objects to AAA; if you wanted ALL the objects being assigned to AAA, you would simply use a single pattern AAA___ which would simply match everything. I have thought about it for a long, long time, and I simply couldn't imagine an example where you'd use two patterns like AAA___,BBB___Rule and WANT the rules to be matched by AAA as well as BBB that couldn't be handled by a single pattern AAA___. There's no reason to restrict this to rules (altho that's probably the most common usage of a triple-blank pattern at the end of a function); if you have a function with a (unknown) variable number of arguments and you want to match a group of arguments with the same head somewhere in the expression, the reason is that you want some operation performed on them and not on anything else, so you're not likely to want some other more general pattern to match those objects as well. I realize that this could be just a product of my twisted mind :-), but like I said, I can't think of an example where you'd want the present behavior regardless of it's perfectly valid logic. The way I see it, there should be a general (global) option that would control this behavior even if I'm wrong and there are situations where you want the present behavior. For speed if for nothing else; here's an example: if I use something like aaa___,bbb___Rule it produces (Length[bbb]+1 matches; now if you want that bbb___Rule matches ALL the rules and aaa___ NONE, just one survives regardless of the method you use for discarding the unwanted matches: In[1]:= Timing[ReplaceList[ f@@Union[Table[Random[],{kk,1000}], Table[Rule[Random[],Random[]],{kk,1000}]], f[aaa___,bbb___Rule] -> {{aaa},{bbb}}];] Out[1]= {1.8 Second,Null} In[2]:= Timing[ReplaceList[ f@@Union[Table[Random[],{kk,1000}], Table[Rule[Random[],Random[]],{kk,1000}]], f[aaa___/;FreeQ[{aaa},_Rule,1],bbb___Rule] -> {{aaa},{bbb}}];] Out[2]= {1.83 Second,Null} In[3]:= Timing[ReplaceList[ f@@Union[Table[Random[],{kk,1000}], Table[Rule[Random[],Random[]],{kk,1000}]], f[aaa___/;MemberQ[{aaa},_Rule],bbb___Rule] -> {{aaa},{bbb}}];] Out[3]= {3.03 Second,Null} In[4]:= Timing[ReplaceList[ f@@Union[Table[Random[],{kk,1000}], Table[Rule[Random[],Random[]],{kk,1000}]], f[aaa___?(Head[#] =!= Rule &),bbb___Rule] -> {{aaa},{bbb}}];] Out[4]= {16.63 Second,Null} In[5]:= Timing[ReplaceList[ f@@Union[Table[Random[],{kk,1000}], Table[Rule[Random[],Random[]],{kk,1000}]], f[aaa : (___?((Head[#] =!= Rule) &)) ,bbb___Rule] -> {{aaa},{bbb}}];] Out[5]= {16.64 Second,Null} In[6]:= Timing[ReplaceList[ f@@Union[Table[Random[],{kk,1000}], Table[Rule[Random[],Random[]],{kk,1000}]], f[aaa___ /; !Or @@ OptionQ /@ {aaa}, bbb___ /; OptionQ[{bbb}]] :> {{aaa}, Flatten[{bbb}]}];] Out[6]= {9.8 Second,Null} The use of ReplaceList is totally irrelevant; it's just for this example. Judging from the time needed to produce the result, it appears that all matches are still produced and then tested by FreeQ or something else. Obviously, FreeQ produces the fastest results; other ways of discarding the unwanted matches are even slower. In this example, there are 1001 matches and 1000 are discarded to get a single one. It looks obvious (altho it might not be) that if the pattern-matching mechanism stopped after finding the one match needed (which is by the way the first one) instead of finding other 1000 unwanted ones, a LOT of time would be saved. I don't know how it works internally, does Mathematica first finds the match we want or does it have some algorithm that finds them all and then sorts them. So, the time needed to find the wanted match might not be equal to the 1/1001 of total time, but I'm pretty sure it's not the LAST match found which is the only case where no time would be saved (for matching part). Even if it is, still, the time needed to write those unwanted matches somewhere in memory instead of just discarding them is still linear in number of matches. These examples are made under Linux with 64M of memory; for REALLY large expressions, if the operating systems runs out of memory (because it had to write down unwanted solutions) and starts memory paging, it makes the system completely inoperable for all practical purposes. When I try changing the number of rules from 1000 ro 2000 in the above examples, I have to kill the MathKernel because it has grown over 150M of memory an the system is using 99.99% of the time for memory swapping. As I said earlier, yes, it IS perfectly logical and perfectly legal the way it works now, but it wastes the time and memory, and I still haven't found an example where you HAVE to have the present behavior that couldn't be handled by either a single pattern aaa___ or by calling some function like Partition or something else. It appears that not many people have encountered this problem in performance drop, but it IS a problem, and a solvable one. Bye, Bojan -- ------------------------------------------------------------- Bojan Bistrovic, bojanb at physics.odu.edu Old Dominion University, Physics Department, Norfolk, VA -------------------------------------------------------------

**Re: Re: Problem with evaluation of delayed rules**

**Protein Folding**

**Re: Re: Problem with evaluation of delayed rules**

**Re: A question about pattern matching**