Services & Resources / Wolfram Forums
-----
 /
MathGroup Archive
2003
*January
*February
*March
*April
*May
*June
*July
*August
*September
*October
*November
*December
*Archive Index
*Ask about this page
*Print this page
*Give us feedback
*Sign up for the Wolfram Insider

MathGroup Archive 2003

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Re: Need a nice way to do the "left count"

  • To: mathgroup at smc.vnet.net
  • Subject: [mg40309] Re: [mg40303] Re: Need a nice way to do the "left count"
  • From: Andrzej Kozlowski <akoz at mimuw.edu.pl>
  • Date: Mon, 31 Mar 2003 04:01:04 -0500 (EST)
  • Sender: owner-wri-mathgroup at wolfram.com

I agree that Bob's algorithm is the simplest and most natural of all 
except that his use of  FoldList seems to me to be intended to mystify 
and confuse ;-)

In fact

drbob3[s_List] := Block[{count, n = 0},
     count[any_] := 0;
     Rest@Map[++n - (++count[#]) &,  s]
     ]

does the same thing as drbob2 and is, at least on my machine, slightly 
faster.

As for the algorithm, it is essentially exactly what you would do if 
you had to solve this problem by hand. You simply count how many times 
an element in the n-th place has occurred and subtract it from n. The 
only weakness of this algorithm is that when you get a repetition like 
{b,a,a...} we know that the number of non a's before the second a is 
the same as the number of non a's before the first a, so there is no 
need to count but the algorithm wastes a tiny amount of time on 
recounting this number again.

My algorithm for two distinct values avoids this problem. It exploits 
the fact that to solve the problem for just two distinct values you 
only need to know:

In[23]:=
Length/@Split[{a,b,b,a,a,a,b,a,b,a,a}]

Out[23]=
{1,2,3,1,1,1,2}

The numbers you see give you the number of repeats of the same value 
you are going to see in the final list. You can also (here is the point 
where the difference between two value case and the general case 
becomes significant) work out what these values are going to be.  The 
number you are going to get in the k-th position, where k is even will 
be the sum of the numbers in all the odd positions preceding k, and the 
value you get in the k-t place when k is odd the sum of all the values 
in the even positions preceding k.
Since my code depends on this fact it obviously can't be expected to 
work with more than 2 values.

Andrzej Kozlowski
Yokohama, Japan
http://www.mimuw.edu.pl/~akoz/
http://platon.c.u-tokyo.ac.jp/andrzej/


On Monday, March 31, 2003, at 10:14  am, Dr Bob wrote:

> I actually think my code is the easiest to understand of them all, but 
> I
> suppose every programmer thinks that!
>
> I had forgotten to time my other code, so here it is:
>
> drbob2[s_List] := Block[{count, n = 0},
>     count[any_] := 0;
>     Rest@FoldList[++n - (++count[#2]) &, Null, s]
>     ]
>
> (Note that the second argument of FoldList, in both my solutions, is 
> not
> used.)
>
> I timed Selwyn's new solution, and the four best (for more than 2 
> values)
> are now:
>
> n = 10000;
> test = Array[Round[99Random[]] &, n];
> Timing[fisher[test];]
> Timing[selwyn2[test];]
> Timing[drbob[test];]
> Timing[drbob2[test];]
>
> {2.171999999999997*Second,   Null}
> {0.7180000000000035*Second,  Null}
> {0.10999999999999943*Second,  Null}
> {0.17199999999999704*Second,  Null}
>
> Bobby
>
> On Sun, 30 Mar 2003 11:38:29 -0500, Selwyn Hollis 
> <selwynh at earthlink.net>
> wrote:
>
>> For what it's worth, I get a 4X speed-up by not using PrependTo, and a
>> minor improvement by using Block instead of Module within Scan:
>>
>> countdiffs[s_List] := Module[{members, totals, g, j},
>> members = Union[s];
>> totals = Count[s, #] & /@ members;
>> j = Length[s];
>> Scan[Block[{i = First@First@Position[members, #]},
>> g[j] = j - totals[[i]];
>> totals[[i]]--; j--] &, Reverse[s]];
>> Table[g[i], {i, Length[s]}]]
>>
>> But it's still much slower than drBob's brilliantly incomprehensible 
>> code
>> :^)
>>
>> ---
>> Selwyn
>>
>>
>> On Saturday, March 29, 2003, at 08:13  PM, Dr Bob wrote:
>>
>>> Here are timings with a binary-valued list:
>>>
>>> n = 10000;
>>> test = Array[Random[Integer] &, n];
>>> Timing[gray[test];]
>>> Timing[hanlon[test];]
>>> Timing[andrzej1[test];]
>>> Timing[fisher[test];]
>>> Timing[selwyn[test];]
>>> Timing[drbob[test];]
>>> Timing[andrzej2[test];]
>>>
>>> {36.547*Second, Null}
>>> {12.406000000000006*Second,   Null}
>>> {19.437000000000012*Second,   Null}
>>> {0.29700000000002547*Second,  Null}
>>> {3.7819999999999823*Second,   Null}
>>> {0.07800000000003138*Second,  Null}
>>> {0.06200000000001182*Second,  Null}
>>>
>>> Neither 'gray' nor 'andrzej2' can handle more than two values in a 
>>> list,
>>> and they give different answers in that case, while the others agree.
>>> (They all agree on binary lists.)
>>>
>>> n = 1000;
>>> test = Array[Round[3*Random[]] &, n];
>>> andrzej1[test] == fisher[test] == selwyn[test] == hanlon[test] ==
>>> drbob[test]
>>> gray[test] == drbob[test]
>>> andrzej2[test] == drbob[test]
>>> gray[test] == andrzej2[test]
>>>
>>> True
>>> False
>>> False
>>> False
>>>
>>> Here are timings on a list with 21 different values:
>>>
>>> n = 10000;
>>> test = Array[Round[20Random[]] &, n];
>>> Union[test]
>>> Timing[hanlon[test];]
>>> Timing[andrzej1[test];]
>>> Timing[fisher[test];]
>>> Timing[selwyn[test];]
>>> Timing[drbob[test];]
>>>
>>> {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 
>>> 19,
>>> 20}
>>> {13.078000000000001*Second,   Null}
>>> {15.563000000000002*Second,   Null}
>>> {0.5779999999999959*Second,   Null}
>>> {2.9849999999999994*Second,   Null}
>>> {0.07800000000000296*Second,  Null}
>>>
>>> and here are timings with 100 different values:
>>>
>>> n = 10000;
>>> test = Array[Round[99Random[]] &, n];
>>> Timing[hanlon[test];]
>>> Timing[andrzej1[test];]
>>> Timing[fisher[test];]
>>> Timing[selwyn[test];]
>>> Timing[drbob[test];]
>>>
>>> {12.969000000000001*Second,   Null}
>>> {15.765*Second, Null}
>>> {2.2039999999999935*Second,   Null}
>>> {3.6400000000000006*Second,   Null}
>>> {0.09399999999999409*Second,  Null}
>>>
>>> 'fisher' seems badly affected by increasing the number of values.
>>>
>>> Bobby
>>>
>>> On Sun, 30 Mar 2003 08:00:57 +0900, Andrzej Kozlowski
>>> <akoz at mimuw.edu.pl> wrote:
>>>
>>>> If any one is interested in doing performance tests I suggest using 
>>>> my
>>>> other function, which I also posted to the Mathgroup. The two 
>>>> functions
>>>> were meant to demonstrate the difference between elegance (which I
>>>> identify with shortness of code)  and efficiency. Here is the 
>>>> efficient
>>>> one (the algorithm works only for lists containing just two distinct
>>>> symbols):
>>>>
>>>> f2[l_List] := Module[{mult =
>>>> Length /@ Split[l], list1,
>>>> list2, values}, list1 = Rest[FoldList[Plus, 0, Table[mult[[i]], {i, 
>>>> 1,
>>>> Length[mult],
>>>> 2}]]]; list2 =
>>>> FoldList[Plus,
>>>> 0, Table[mult[[i]], {i, 2, 2(
>>>> Floor[(Length[mult] + 1)/2]) - 1, 2}]];
>>>> values = Take[Flatten[Transpose[{list2,
>>>> list1}]], Length[mult]]; Flatten[Table[Table[values[[i]], {
>>>> mult[[i]]}], {i, 1, Length[mult]}]]]
>>>>
>>>>
>>>>
>>>> On Sunday, March 30, 2003, at 07:12  am, Steve Gray wrote:
>>>>
>>>>> Hello all,
>>>>>     I am again gratified for all the prompt, more-than-competent
>>>>> solutions to my problem. I'm enclosing a .nb file with all replies
>>>>> received so far so you can compare them. I did not do timing 
>>>>> because
>>>>> the lists I have are relatively short and I don't need to call this
>>>>> function very often.
>>>>>     Thanks again.
>>>>>  
>>>>> Steve Gray
>>>>>
>>>>> ----- Original Message -----
>>>>> From: BobHanlon at aol.com
To: mathgroup at smc.vnet.net
> To: mathgroup at smc.vnet.net
>>>>> To: stevebg at adelphia.net
>>>>> Sent: Saturday, March 29, 2003 4:53 AM
>>>>> Subject: [mg40309] [mg40303] Re: [mg40279] Need a nice way to do this
>>>>>
>>>>> In a message dated 3/29/03 6:11:39 AM, stevebg at adelphia.net writes:
>>>>>
>>>>>     Given a list consisting of only two distinct values, such as
>>>>> s={a,b,b,a,a,a,b,a,b,a,a}, I want to derive a list of equal length
>>>>> g={0,1,1,2,2,2,4,3,5,4,4}. The rule is: for each position
>>>>> 1<=p<=Length[s], look at list s and set g[[p]] to the number of
>>>>> elements in s to the left of p which are not equal to s[[p]].
>>>>>     In a more general version, which I do  not need now, s would
>>>>> not be restricted to only two distinct values.
>>>>>
>>>>>     Thank you for any ideas, including other applications where
>>>>> this particular calculation is used. The current application is an
>>>>> unusual conjecture in geometry.
>>>>>
>>>>>
>>>>> <Left-Count.nb>
>>>> Andrzej Kozlowski
>>>> Yokohama, Japan
>>>> http://www.mimuw.edu.pl/~akoz/
>>>> http://platon.c.u-tokyo.ac.jp/andrzej/
>>>>
>>>>
>>>
>>>
>>>
>>> -- majort at cox-internet.com
>>> Bobby R. Treat<Left-Count.nb>
>>
>>
>
>
>



  • Prev by Date: Re: Need a nice way to do this
  • Next by Date: Re: Need a nice way to do this
  • Previous by thread: Re: Need a nice way to do the "left count"
  • Next by thread: Re: Re: Need a nice way to do the "left count"