Services & Resources / Wolfram Forums
-----
 /
MathGroup Archive
2003
*January
*February
*March
*April
*May
*June
*July
*August
*September
*October
*November
*December
*Archive Index
*Ask about this page
*Print this page
*Give us feedback
*Sign up for the Wolfram Insider

MathGroup Archive 2003

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Re: Need a nice way to do the "left count"

  • To: mathgroup at smc.vnet.net
  • Subject: [mg40311] Re: [mg40303] Re: Need a nice way to do the "left count"
  • From: Dr Bob <majort at cox-internet.com>
  • Date: Mon, 31 Mar 2003 04:01:12 -0500 (EST)
  • References: <C3A16B6A-6334-11D7-B455-00039311C1CC@mimuw.edu.pl>
  • Reply-to: majort at cox-internet.com
  • Sender: owner-wri-mathgroup at wolfram.com

Andrzej is quite right, Map is more natural here.  He forgot to remove 
'Rest', however.

It should be:

drbob3[s_List] := Block[{count, n = 0},
    count[any_] := 0;
    Map[++n - (++count[#]) &, s]]

or

drbob4[s_List] := Block[{count},
    count[any_] := 0;
    Range@Length@s - Map[++count[#] &, s]]

Much to my surprise, drbob and drbob4 have almost identical timings for 100 
values in a list:

n = 100000;
test = Array[Round[99Random[]] &, n];
Timing[drbob[test];]
Timing[drbob2[test];]
Timing[drbob3[test];]
Timing[drbob4[test];]

{1.093999999999994*Second,   Null}
{1.718999999999994*Second,   Null}
{1.7659999999999911*Second,  Null}
{1.0930000000000177*Second,  Null}

But drbob4 beats drbob (slightly) for 21 values:

n = 100000;
test = Array[Round[20Random[]] &, n];
Timing[drbob[test];]
Timing[drbob2[test];]
Timing[drbob3[test];]
Timing[drbob4[test];]

{0.7970000000000255*Second,   Null}
{1.4530000000000314*Second,   Null}
{1.3900000000000432*Second,   Null}
{0.7810000000000059*Second,   Null}

and for binary lists:

n = 100000;
test = Array[Random[Integer] &, n];
Timing[drbob[test];]
Timing[drbob2[test];]
Timing[drbob3[test];]
Timing[drbob4[test];]
Timing[andrzej2[test];]

{0.8440000000000225*Second,   Null}
{1.421999999999997*Second,    Null}
{1.421999999999997*Second,    Null}
{0.7649999999999864*Second,   Null}
{0.5310000000000059*Second,   Null}

Bobby

On Mon, 31 Mar 2003 13:53:50 +0900, Andrzej Kozlowski <akoz at mimuw.edu.pl> 
wrote:

> I agree that Bob's algorithm is the simplest and most natural of all 
> except that his use of  FoldList seems to me to be intended to mystify 
> and confuse ;-)
>
> In fact
>
> drbob3[s_List] := Block[{count, n = 0},
> count[any_] := 0;
> Rest@Map[++n - (++count[#]) &,  s]
> ]
>
> does the same thing as drbob2 and is, at least on my machine, slightly 
> faster.
>
> As for the algorithm, it is essentially exactly what you would do if you 
> had to solve this problem by hand. You simply count how many times an 
> element in the n-th place has occurred and subtract it from n. The only 
> weakness of this algorithm is that when you get a repetition like 
> {b,a,a...} we know that the number of non a's before the second a is the 
> same as the number of non a's before the first a, so there is no need to 
> count but the algorithm wastes a tiny amount of time on recounting this 
> number again.
>
> My algorithm for two distinct values avoids this problem. It exploits the 
> fact that to solve the problem for just two distinct values you only need 
> to know:
>
> In[23]:=
> Length/@Split[{a,b,b,a,a,a,b,a,b,a,a}]
>
> Out[23]=
> {1,2,3,1,1,1,2}
>
> The numbers you see give you the number of repeats of the same value you 
> are going to see in the final list. You can also (here is the point where 
> the difference between two value case and the general case becomes 
> significant) work out what these values are going to be.  The number you 
> are going to get in the k-th position, where k is even will be the sum of 
> the numbers in all the odd positions preceding k, and the value you get 
> in the k-t place when k is odd the sum of all the values in the even 
> positions preceding k.
> Since my code depends on this fact it obviously can't be expected to work 
> with more than 2 values.
>
> Andrzej Kozlowski
> Yokohama, Japan
> http://www.mimuw.edu.pl/~akoz/
> http://platon.c.u-tokyo.ac.jp/andrzej/
>
>
> On Monday, March 31, 2003, at 10:14  am, Dr Bob wrote:
>
>> I actually think my code is the easiest to understand of them all, but I
>> suppose every programmer thinks that!
>>
>> I had forgotten to time my other code, so here it is:
>>
>> drbob2[s_List] := Block[{count, n = 0},
>> count[any_] := 0;
>> Rest@FoldList[++n - (++count[#2]) &, Null, s]
>> ]
>>
>> (Note that the second argument of FoldList, in both my solutions, is not
>> used.)
>>
>> I timed Selwyn's new solution, and the four best (for more than 2 
>> values)
>> are now:
>>
>> n = 10000;
>> test = Array[Round[99Random[]] &, n];
>> Timing[fisher[test];]
>> Timing[selwyn2[test];]
>> Timing[drbob[test];]
>> Timing[drbob2[test];]
>>
>> {2.171999999999997*Second,   Null}
>> {0.7180000000000035*Second,  Null}
>> {0.10999999999999943*Second,  Null}
>> {0.17199999999999704*Second,  Null}
>>
>> Bobby
>>
>> On Sun, 30 Mar 2003 11:38:29 -0500, Selwyn Hollis 
>> <selwynh at earthlink.net>
>> wrote:
>>
>>> For what it's worth, I get a 4X speed-up by not using PrependTo, and a
>>> minor improvement by using Block instead of Module within Scan:
>>>
>>> countdiffs[s_List] := Module[{members, totals, g, j},
>>> members = Union[s];
>>> totals = Count[s, #] & /@ members;
>>> j = Length[s];
>>> Scan[Block[{i = First@First@Position[members, #]},
>>> g[j] = j - totals[[i]];
>>> totals[[i]]--; j--] &, Reverse[s]];
>>> Table[g[i], {i, Length[s]}]]
>>>
>>> But it's still much slower than drBob's brilliantly incomprehensible 
>>> code
>>> :^)
>>>
>>> ---
>>> Selwyn
>>>
>>>
>>> On Saturday, March 29, 2003, at 08:13  PM, Dr Bob wrote:
>>>
>>>> Here are timings with a binary-valued list:
>>>>
>>>> n = 10000;
>>>> test = Array[Random[Integer] &, n];
>>>> Timing[gray[test];]
>>>> Timing[hanlon[test];]
>>>> Timing[andrzej1[test];]
>>>> Timing[fisher[test];]
>>>> Timing[selwyn[test];]
>>>> Timing[drbob[test];]
>>>> Timing[andrzej2[test];]
>>>>
>>>> {36.547*Second, Null}
>>>> {12.406000000000006*Second,   Null}
>>>> {19.437000000000012*Second,   Null}
>>>> {0.29700000000002547*Second,  Null}
>>>> {3.7819999999999823*Second,   Null}
>>>> {0.07800000000003138*Second,  Null}
>>>> {0.06200000000001182*Second,  Null}
>>>>
>>>> Neither 'gray' nor 'andrzej2' can handle more than two values in a 
>>>> list,
>>>> and they give different answers in that case, while the others agree.
>>>> (They all agree on binary lists.)
>>>>
>>>> n = 1000;
>>>> test = Array[Round[3*Random[]] &, n];
>>>> andrzej1[test] == fisher[test] == selwyn[test] == hanlon[test] ==
>>>> drbob[test]
>>>> gray[test] == drbob[test]
>>>> andrzej2[test] == drbob[test]
>>>> gray[test] == andrzej2[test]
>>>>
>>>> True
>>>> False
>>>> False
>>>> False
>>>>
>>>> Here are timings on a list with 21 different values:
>>>>
>>>> n = 10000;
>>>> test = Array[Round[20Random[]] &, n];
>>>> Union[test]
>>>> Timing[hanlon[test];]
>>>> Timing[andrzej1[test];]
>>>> Timing[fisher[test];]
>>>> Timing[selwyn[test];]
>>>> Timing[drbob[test];]
>>>>
>>>> {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
>>>> 20}
>>>> {13.078000000000001*Second,   Null}
>>>> {15.563000000000002*Second,   Null}
>>>> {0.5779999999999959*Second,   Null}
>>>> {2.9849999999999994*Second,   Null}
>>>> {0.07800000000000296*Second,  Null}
>>>>
>>>> and here are timings with 100 different values:
>>>>
>>>> n = 10000;
>>>> test = Array[Round[99Random[]] &, n];
>>>> Timing[hanlon[test];]
>>>> Timing[andrzej1[test];]
>>>> Timing[fisher[test];]
>>>> Timing[selwyn[test];]
>>>> Timing[drbob[test];]
>>>>
>>>> {12.969000000000001*Second,   Null}
>>>> {15.765*Second, Null}
>>>> {2.2039999999999935*Second,   Null}
>>>> {3.6400000000000006*Second,   Null}
>>>> {0.09399999999999409*Second,  Null}
>>>>
>>>> 'fisher' seems badly affected by increasing the number of values.
>>>>
>>>> Bobby
>>>>
>>>> On Sun, 30 Mar 2003 08:00:57 +0900, Andrzej Kozlowski
>>>> <akoz at mimuw.edu.pl> wrote:
>>>>
>>>>> If any one is interested in doing performance tests I suggest using 
>>>>> my
>>>>> other function, which I also posted to the Mathgroup. The two 
>>>>> functions
>>>>> were meant to demonstrate the difference between elegance (which I
>>>>> identify with shortness of code)  and efficiency. Here is the 
>>>>> efficient
>>>>> one (the algorithm works only for lists containing just two distinct
>>>>> symbols):
>>>>>
>>>>> f2[l_List] := Module[{mult =
>>>>> Length /@ Split[l], list1,
>>>>> list2, values}, list1 = Rest[FoldList[Plus, 0, Table[mult[[i]], {i, 
>>>>> 1,
>>>>> Length[mult],
>>>>> 2}]]]; list2 =
>>>>> FoldList[Plus,
>>>>> 0, Table[mult[[i]], {i, 2, 2(
>>>>> Floor[(Length[mult] + 1)/2]) - 1, 2}]];
>>>>> values = Take[Flatten[Transpose[{list2,
>>>>> list1}]], Length[mult]]; Flatten[Table[Table[values[[i]], {
>>>>> mult[[i]]}], {i, 1, Length[mult]}]]]
>>>>>
>>>>>
>>>>>
>>>>> On Sunday, March 30, 2003, at 07:12  am, Steve Gray wrote:
>>>>>
>>>>>> Hello all,
>>>>>>     I am again gratified for all the prompt, more-than-competent
>>>>>> solutions to my problem. I'm enclosing a .nb file with all replies
>>>>>> received so far so you can compare them. I did not do timing because
>>>>>> the lists I have are relatively short and I don't need to call this
>>>>>> function very often.
>>>>>>     Thanks again.
>>>>>>  
>>>>>> Steve Gray
>>>>>>
>>>>>> ----- Original Message -----
>>>>>> From: BobHanlon at aol.com
To: mathgroup at smc.vnet.net
>> To: mathgroup at smc.vnet.net
>>>>>> To: stevebg at adelphia.net
>>>>>> Sent: Saturday, March 29, 2003 4:53 AM
>>>>>> Subject: [mg40311] [mg40303] Re: [mg40279] Need a nice way to do this
>>>>>>
>>>>>> In a message dated 3/29/03 6:11:39 AM, stevebg at adelphia.net writes:
>>>>>>
>>>>>>     Given a list consisting of only two distinct values, such as
>>>>>> s={a,b,b,a,a,a,b,a,b,a,a}, I want to derive a list of equal length
>>>>>> g={0,1,1,2,2,2,4,3,5,4,4}. The rule is: for each position
>>>>>> 1<=p<=Length[s], look at list s and set g[[p]] to the number of
>>>>>> elements in s to the left of p which are not equal to s[[p]].
>>>>>>     In a more general version, which I do  not need now, s would
>>>>>> not be restricted to only two distinct values.
>>>>>>
>>>>>>     Thank you for any ideas, including other applications where
>>>>>> this particular calculation is used. The current application is an
>>>>>> unusual conjecture in geometry.
>>>>>>
>>>>>>
>>>>>> <Left-Count.nb>
>>>>> Andrzej Kozlowski
>>>>> Yokohama, Japan
>>>>> http://www.mimuw.edu.pl/~akoz/
>>>>> http://platon.c.u-tokyo.ac.jp/andrzej/
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> -- majort at cox-internet.com
>>>> Bobby R. Treat<Left-Count.nb>
>>>
>>>
>>
>>
>>
>
>



-- 
majort at cox-internet.com
Bobby R. Treat


  • Prev by Date: Re: Need a nice way to do this
  • Previous by thread: Re: Re: Need a nice way to do the "left count"
  • Next by thread: Help-directional fields