Re: Re: Need a nice way to do the "left count"
- To: mathgroup at smc.vnet.net
- Subject: [mg40311] Re: [mg40303] Re: Need a nice way to do the "left count"
- From: Dr Bob <majort at cox-internet.com>
- Date: Mon, 31 Mar 2003 04:01:12 -0500 (EST)
- References: <C3A16B6A-6334-11D7-B455-00039311C1CC@mimuw.edu.pl>
- Reply-to: majort at cox-internet.com
- Sender: owner-wri-mathgroup at wolfram.com
Andrzej is quite right, Map is more natural here. He forgot to remove 'Rest', however. It should be: drbob3[s_List] := Block[{count, n = 0}, count[any_] := 0; Map[++n - (++count[#]) &, s]] or drbob4[s_List] := Block[{count}, count[any_] := 0; Range@Length@s - Map[++count[#] &, s]] Much to my surprise, drbob and drbob4 have almost identical timings for 100 values in a list: n = 100000; test = Array[Round[99Random[]] &, n]; Timing[drbob[test];] Timing[drbob2[test];] Timing[drbob3[test];] Timing[drbob4[test];] {1.093999999999994*Second, Null} {1.718999999999994*Second, Null} {1.7659999999999911*Second, Null} {1.0930000000000177*Second, Null} But drbob4 beats drbob (slightly) for 21 values: n = 100000; test = Array[Round[20Random[]] &, n]; Timing[drbob[test];] Timing[drbob2[test];] Timing[drbob3[test];] Timing[drbob4[test];] {0.7970000000000255*Second, Null} {1.4530000000000314*Second, Null} {1.3900000000000432*Second, Null} {0.7810000000000059*Second, Null} and for binary lists: n = 100000; test = Array[Random[Integer] &, n]; Timing[drbob[test];] Timing[drbob2[test];] Timing[drbob3[test];] Timing[drbob4[test];] Timing[andrzej2[test];] {0.8440000000000225*Second, Null} {1.421999999999997*Second, Null} {1.421999999999997*Second, Null} {0.7649999999999864*Second, Null} {0.5310000000000059*Second, Null} Bobby On Mon, 31 Mar 2003 13:53:50 +0900, Andrzej Kozlowski <akoz at mimuw.edu.pl> wrote: > I agree that Bob's algorithm is the simplest and most natural of all > except that his use of FoldList seems to me to be intended to mystify > and confuse ;-) > > In fact > > drbob3[s_List] := Block[{count, n = 0}, > count[any_] := 0; > Rest@Map[++n - (++count[#]) &, s] > ] > > does the same thing as drbob2 and is, at least on my machine, slightly > faster. > > As for the algorithm, it is essentially exactly what you would do if you > had to solve this problem by hand. You simply count how many times an > element in the n-th place has occurred and subtract it from n. The only > weakness of this algorithm is that when you get a repetition like > {b,a,a...} we know that the number of non a's before the second a is the > same as the number of non a's before the first a, so there is no need to > count but the algorithm wastes a tiny amount of time on recounting this > number again. > > My algorithm for two distinct values avoids this problem. It exploits the > fact that to solve the problem for just two distinct values you only need > to know: > > In[23]:= > Length/@Split[{a,b,b,a,a,a,b,a,b,a,a}] > > Out[23]= > {1,2,3,1,1,1,2} > > The numbers you see give you the number of repeats of the same value you > are going to see in the final list. You can also (here is the point where > the difference between two value case and the general case becomes > significant) work out what these values are going to be. The number you > are going to get in the k-th position, where k is even will be the sum of > the numbers in all the odd positions preceding k, and the value you get > in the k-t place when k is odd the sum of all the values in the even > positions preceding k. > Since my code depends on this fact it obviously can't be expected to work > with more than 2 values. > > Andrzej Kozlowski > Yokohama, Japan > http://www.mimuw.edu.pl/~akoz/ > http://platon.c.u-tokyo.ac.jp/andrzej/ > > > On Monday, March 31, 2003, at 10:14 am, Dr Bob wrote: > >> I actually think my code is the easiest to understand of them all, but I >> suppose every programmer thinks that! >> >> I had forgotten to time my other code, so here it is: >> >> drbob2[s_List] := Block[{count, n = 0}, >> count[any_] := 0; >> Rest@FoldList[++n - (++count[#2]) &, Null, s] >> ] >> >> (Note that the second argument of FoldList, in both my solutions, is not >> used.) >> >> I timed Selwyn's new solution, and the four best (for more than 2 >> values) >> are now: >> >> n = 10000; >> test = Array[Round[99Random[]] &, n]; >> Timing[fisher[test];] >> Timing[selwyn2[test];] >> Timing[drbob[test];] >> Timing[drbob2[test];] >> >> {2.171999999999997*Second, Null} >> {0.7180000000000035*Second, Null} >> {0.10999999999999943*Second, Null} >> {0.17199999999999704*Second, Null} >> >> Bobby >> >> On Sun, 30 Mar 2003 11:38:29 -0500, Selwyn Hollis >> <selwynh at earthlink.net> >> wrote: >> >>> For what it's worth, I get a 4X speed-up by not using PrependTo, and a >>> minor improvement by using Block instead of Module within Scan: >>> >>> countdiffs[s_List] := Module[{members, totals, g, j}, >>> members = Union[s]; >>> totals = Count[s, #] & /@ members; >>> j = Length[s]; >>> Scan[Block[{i = First@First@Position[members, #]}, >>> g[j] = j - totals[[i]]; >>> totals[[i]]--; j--] &, Reverse[s]]; >>> Table[g[i], {i, Length[s]}]] >>> >>> But it's still much slower than drBob's brilliantly incomprehensible >>> code >>> :^) >>> >>> --- >>> Selwyn >>> >>> >>> On Saturday, March 29, 2003, at 08:13 PM, Dr Bob wrote: >>> >>>> Here are timings with a binary-valued list: >>>> >>>> n = 10000; >>>> test = Array[Random[Integer] &, n]; >>>> Timing[gray[test];] >>>> Timing[hanlon[test];] >>>> Timing[andrzej1[test];] >>>> Timing[fisher[test];] >>>> Timing[selwyn[test];] >>>> Timing[drbob[test];] >>>> Timing[andrzej2[test];] >>>> >>>> {36.547*Second, Null} >>>> {12.406000000000006*Second, Null} >>>> {19.437000000000012*Second, Null} >>>> {0.29700000000002547*Second, Null} >>>> {3.7819999999999823*Second, Null} >>>> {0.07800000000003138*Second, Null} >>>> {0.06200000000001182*Second, Null} >>>> >>>> Neither 'gray' nor 'andrzej2' can handle more than two values in a >>>> list, >>>> and they give different answers in that case, while the others agree. >>>> (They all agree on binary lists.) >>>> >>>> n = 1000; >>>> test = Array[Round[3*Random[]] &, n]; >>>> andrzej1[test] == fisher[test] == selwyn[test] == hanlon[test] == >>>> drbob[test] >>>> gray[test] == drbob[test] >>>> andrzej2[test] == drbob[test] >>>> gray[test] == andrzej2[test] >>>> >>>> True >>>> False >>>> False >>>> False >>>> >>>> Here are timings on a list with 21 different values: >>>> >>>> n = 10000; >>>> test = Array[Round[20Random[]] &, n]; >>>> Union[test] >>>> Timing[hanlon[test];] >>>> Timing[andrzej1[test];] >>>> Timing[fisher[test];] >>>> Timing[selwyn[test];] >>>> Timing[drbob[test];] >>>> >>>> {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, >>>> 20} >>>> {13.078000000000001*Second, Null} >>>> {15.563000000000002*Second, Null} >>>> {0.5779999999999959*Second, Null} >>>> {2.9849999999999994*Second, Null} >>>> {0.07800000000000296*Second, Null} >>>> >>>> and here are timings with 100 different values: >>>> >>>> n = 10000; >>>> test = Array[Round[99Random[]] &, n]; >>>> Timing[hanlon[test];] >>>> Timing[andrzej1[test];] >>>> Timing[fisher[test];] >>>> Timing[selwyn[test];] >>>> Timing[drbob[test];] >>>> >>>> {12.969000000000001*Second, Null} >>>> {15.765*Second, Null} >>>> {2.2039999999999935*Second, Null} >>>> {3.6400000000000006*Second, Null} >>>> {0.09399999999999409*Second, Null} >>>> >>>> 'fisher' seems badly affected by increasing the number of values. >>>> >>>> Bobby >>>> >>>> On Sun, 30 Mar 2003 08:00:57 +0900, Andrzej Kozlowski >>>> <akoz at mimuw.edu.pl> wrote: >>>> >>>>> If any one is interested in doing performance tests I suggest using >>>>> my >>>>> other function, which I also posted to the Mathgroup. The two >>>>> functions >>>>> were meant to demonstrate the difference between elegance (which I >>>>> identify with shortness of code) and efficiency. Here is the >>>>> efficient >>>>> one (the algorithm works only for lists containing just two distinct >>>>> symbols): >>>>> >>>>> f2[l_List] := Module[{mult = >>>>> Length /@ Split[l], list1, >>>>> list2, values}, list1 = Rest[FoldList[Plus, 0, Table[mult[[i]], {i, >>>>> 1, >>>>> Length[mult], >>>>> 2}]]]; list2 = >>>>> FoldList[Plus, >>>>> 0, Table[mult[[i]], {i, 2, 2( >>>>> Floor[(Length[mult] + 1)/2]) - 1, 2}]]; >>>>> values = Take[Flatten[Transpose[{list2, >>>>> list1}]], Length[mult]]; Flatten[Table[Table[values[[i]], { >>>>> mult[[i]]}], {i, 1, Length[mult]}]]] >>>>> >>>>> >>>>> >>>>> On Sunday, March 30, 2003, at 07:12 am, Steve Gray wrote: >>>>> >>>>>> Hello all, >>>>>> I am again gratified for all the prompt, more-than-competent >>>>>> solutions to my problem. I'm enclosing a .nb file with all replies >>>>>> received so far so you can compare them. I did not do timing because >>>>>> the lists I have are relatively short and I don't need to call this >>>>>> function very often. >>>>>> Thanks again. >>>>>> >>>>>> Steve Gray >>>>>> >>>>>> ----- Original Message ----- >>>>>> From: BobHanlon at aol.com To: mathgroup at smc.vnet.net >> To: mathgroup at smc.vnet.net >>>>>> To: stevebg at adelphia.net >>>>>> Sent: Saturday, March 29, 2003 4:53 AM >>>>>> Subject: [mg40311] [mg40303] Re: [mg40279] Need a nice way to do this >>>>>> >>>>>> In a message dated 3/29/03 6:11:39 AM, stevebg at adelphia.net writes: >>>>>> >>>>>> Given a list consisting of only two distinct values, such as >>>>>> s={a,b,b,a,a,a,b,a,b,a,a}, I want to derive a list of equal length >>>>>> g={0,1,1,2,2,2,4,3,5,4,4}. The rule is: for each position >>>>>> 1<=p<=Length[s], look at list s and set g[[p]] to the number of >>>>>> elements in s to the left of p which are not equal to s[[p]]. >>>>>> In a more general version, which I do not need now, s would >>>>>> not be restricted to only two distinct values. >>>>>> >>>>>> Thank you for any ideas, including other applications where >>>>>> this particular calculation is used. The current application is an >>>>>> unusual conjecture in geometry. >>>>>> >>>>>> >>>>>> <Left-Count.nb> >>>>> Andrzej Kozlowski >>>>> Yokohama, Japan >>>>> http://www.mimuw.edu.pl/~akoz/ >>>>> http://platon.c.u-tokyo.ac.jp/andrzej/ >>>>> >>>>> >>>> >>>> >>>> >>>> -- majort at cox-internet.com >>>> Bobby R. Treat<Left-Count.nb> >>> >>> >> >> >> > > -- majort at cox-internet.com Bobby R. Treat