Re: Re: programming DeleteRepetitions

*To*: mathgroup at smc.vnet.net*Subject*: [mg23796] Re: [mg23733] Re: [mg23662] programming DeleteRepetitions*From*: "Allan Hayes" <hay at haystack.demon.co.uk>*Date*: Sat, 10 Jun 2000 02:59:49 -0400 (EDT)*References*: <B561C5D5.693C%andrzej@tuins.ac.jp>*Sender*: owner-wri-mathgroup at wolfram.com

Adrzej, Thanks for the reminder. I have found a slightly faster way of stopping the calculatiion when all possible values have been found. DRp1 below is the new code DRp2 is your code DRp3 is my code using Split DRp4 is Carl's original code test[r] gives the timings for 5 runs of these on a list of 10000 entries of random integers from 1 to r. test[2] {0.39, 0.38, 6.87, 1.2} test[10] {0.44, 0.49, 6.7, 1.87} test[100] {1.04, 1.7, 7.8, 2.86} test[500] {3.13, 6.38, 6.7, 3.62} test[1000] {6.04, 13.62, 6.81, 5.06} test[1200] {8.01, 19.17, 6.98, 5.99} test[1500] {8.95, 20.93, 7.03, 6.76} test[2000] {11.31, 23.01, 7.2, 8.68} test[5000] {25.71, 36.14, 8.51, 19.12} test[7500] {32.13, 39.66, 9.56, 22.02} test[10000] {32.14, 41.9, 9.07, 22.96} These times support your view that Carl's code would be faster than its elaborations when r large. However, for r = 10000 the Splitting method is much quicker. Here are the codes Remove["`*"]; DRp1[x_List] := Block[{i, j, c = 0, s = {}}, j[Length[Union[x]]] := Throw[s] ; i[n_] := (s = {s, i[n] = n}; j[++c]); Scan[i, x] ] // Catch // Flatten; DRp2[li_List] := Block[{i, counter = 0, l0 = {}, m = Length[Union[li]], Sequence}, i[n_] := (i[n] = Sequence[]; counter = counter + 1; n); Scan[If[counter < m, l0 = {l0, i[#]}, Return[Flatten[l0]]] &, li]; Flatten[l0]]; DRp3[x_List] := Last[Transpose[ Sort[Reverse /@ First /@ Split[Sort[ Transpose[{x, Range[Length[x]]}]], #1[[1]] == #2[[1]] &]]]]; DRp4[x_List] := Block[{i}, i[n_] := (i[n] = Sequence[]; n); i /@ x]; And the testing code funcs = ToExpression[Names["DR*"]] {DRp1, DRp2, DRp3, DRp4} test[r_] := With[{lst = Table[Random[Integer, {1, r}], {10000}]}, Timing[Do[#[lst], {5}]][[1]]/Second & /@ funcs] Regaards Allan --------------------- Allan Hayes Mathematica Training and Consulting Leicester UK www.haystack.demon.co.uk hay at haystack.demon.co.uk Voice: +44 (0)116 271 4198 Fax: +44 (0)870 164 0565 ----- Original Message ----- From: "Andrzej Kozlowski" <andrzej at tuins.ac.jp> To: mathgroup at smc.vnet.net <BobHanlon at aol.com> Subject: [mg23796] Re: [mg23733] Re: [mg23662] programming DeleteRepetitions > Allan, > > We discussed this issue in some detail almost exactly a year ago, when Carl > Woll came up with his OrderedUnion function. We then concluded that Carl's > function was the fastest when the number of different elements (i.e. > Length[Union[list]]) was relatively large compared with the total number of > elements (Length[list]), but was not so fast when there were only a few > distinct elements in a large list (e.g. when throwing a die a large number > of times). That is because in such a case Carl's function goes on looking > for repetitions long time after it found all the distinct elements. At that > time I wrote the following modification of Carl's function which uses Union > to find out how many distinct elements there are and then gets out once it > has found them all. > > > > DR0[li_List] := > Block[{i, counter = 0, l0 = {}, m = Length[Union[li]], Sequence}, > i[n_] := (i[n] = Sequence[]; counter = counter + 1; n); > Scan[If[counter < m, l0 = {l0, i[#]}, Return[Flatten[l0]]] &, li]; > Flatten[l0]] > > In cases when the number of distinct elements is small (e.g. throwing a die > 10000 times) in my tests it does better than all the functions you have > tested. (Actually results are different with Mathematica 3.0 and 4.0. They > are also rather different ona Mac (which I use) and under Windows. > > Actually, it may still be true that even in the case when the number of > distinct elements is small Carl's function is assymptotically the fastest. > This is because DR0 uses Union, which sorts the elements of the list from > which it is removing repetitions . I am not sure however whether Union first > sorts the elements and then removes repetitions or first removes repetitions > and then sorts them (the sescond approach would require something like > Carl's function). If the former is the case than Carl's function is > certainly assymptotically the fastest, even if the number of distinct > elements is small, but if the latter is true than it need not be so. > > > -- > Andrzej Kozlowski > Toyama International University, JAPAN > > For Mathematica related links and resources try: > <http://www.sstreams.com/Mathematica/> > > > > on 6/5/00 2:09 PM, Allan Hayes at hay at haystack.demon.co.uk wrote: > > > Bob, > > I have added to your codings and done more timing. > > > > The full list is {DR1, ...., DR9} > > > > DR1 is a new version using Splot > > DR2 is Carl Woll's code that I gave before (tidied up) > > DR3 is a faster rewrite of your DeleteRepetitions1 > > DR5 is your DeleteRepetitions1 > > The rest are your other examples reordered (your numbering is given against > > the code). > > > > In the timings: > > test[r, n] gives the timings for DR1, ...DRn on a list of 1000 random > > integers between 1 and r. > > > > DR6, your DeleteRepetitions6, is fastest up to r = 20. > > > > **TIMINGS** > > > > test[2, 9] > > {0.11, 0.06, 0., 0.22, 0.05, 0., 0.17, 0., 1.37} > > test[10, 8] > > {0.17, 0.05, 0.06, 0.22, 0.05, 0., 0.55, 0.} > > test[20, 8] > > {0.11, 0.05, 0.11, 0.22, 0.11, 0., 1.16, 0.16} > > test[30, 8] > > {0.1, 0.06, 0.11, 0.22, 0.11, 0.11, 1.49, 2.47} > > test[40, 8] > > {0.11, 0.05, 0.17, 0.38, 0.17, 0.16, 1.87, 4.28} > > test[60, 6] > > {0.11, 0.11, 0.22, 0.27, 0.22, 0.66} > > test[100, 6] > > {0.16, 0.06, 0.05, 0.49, 0.44, 0.61} > > test[200, 6] > > {0.11, 0.17, 0.11, 0.49, 0.94, 3.57} > > test[400, 6] > > {0.17, 0.27, 0.16, 0.94, 2.42, 4.99} > > test[600, 6] > > {0.16, 0.28, 0.22, 1.43, 3.79, 6.53} > > test[1000, 6] > > {0.16, 0.39, 0.33, 3.07, 5.93, 7.75} > > test[1000000000000, 6] (*almost certainly no repetitions*) > > {0.38, 0.55, 5.05, 5.28, 5.16, 5.33} > > > > > > ** CODE** > > > > Remove["`*"]; > > DR1[x_List] := > > Last[Transpose[ > > Sort[Reverse /@ > > First /@ > > Split[Sort[ > > Transpose[{x, > > Range[Length[x]]}]], #1[[1]] == #2[[1]] &]]]]; > > > > DF2[x_List] := Block[{i}, i[n_] := (i[n] = Sequence[]; n); > > i /@ x]; > > > > DF3[x_List] := x[[Sort[Flatten[First[Position[x, #]] & /@ Union[x]]]]]; > > > > (*2*)DF4[x_List] := > > Module[{uniq = {}}, > > If[Not[MemberQ[uniq, #]], (uniq = Append[uniq, #])] & /@ x; > > uniq]; > > (*1*)DF5[x_List] := > > Take[x, #] & /@ Sort[First /@ (Position[x, #] & /@ Union[x])] // > > Flatten; > > > > (*6*)DF6[x_List] := > > Module[{uniq = Union[x], n, portion}, n = Length[uniq]; > > While[(Union[portion = Take[x, n]]) != uniq, n++]; > > Take[portion, #] & /@ Sort[First /@ (Position[portion, #] & /@ uniq)] > > // > > Flatten]; > > > > (*2*)DF7[x_List] := > > Transpose[ > > Union[Transpose[Join[{Range[Length[x]]}, {x}]], > > SameTest -> (#1[[2]] == #2[[2]] &)]][[2]] > > > > (*5*)DF8[x_List] := > > Module[{uniq = Union[x], n, portion}, n = Length[uniq]; > > While[(Union[portion = Take[x, n]]) != uniq, n++]; > > portion //. {a___, b_, c___, b_, d___} -> {a, b, c, d}] > > > > (*4*)DF9[x_List] := x //. {a___, b_, c___, b_, d___} -> {a, b, c, d} > > > > **TEST CODE** > > > > funcs = ToExpression[Names["DF*"]] > > > > {DF1, DF2, DF3, DF4, DF5, DF6, DF7, DF8, DF9} > > > > SameQ @@ Through[funcs[Table[Random[Integer, {1, 10}], {100}]]] > > > > True > > > > test[r_, n_] := > > With[{lst = Table[Random[Integer, {1, r}], {1000}]}, > > Timing[#[lst]][[1]]/Second & /@ Take[funcs, n]] > > > > > > > > -- > > Allan > > --------------------- > > Allan Hayes > > Mathematica Training and Consulting > > Leicester UK > > www.haystack.demon.co.uk > > hay at haystack.demon.co.uk > > Voice: +44 (0)116 271 4198 > > Fax: +44 (0)870 164 0565 > > > > <BobHanlon at aol.com> wrote in message news:8gv88o$5n7 at smc.vnet.net... > >> > >> In a message dated 5/28/2000 11:37:53 PM, pnichols at wittenberg.edu writes: > >> > >>> Below I give a function which removes duplicates from a list (as Union > >>> does), but without sorting the result (as Union also does). More > >>> specifically, it extracts, in order, the first instance of each distinct > >>> element of the list. > >>> > >>> Is there any simpler way to do this? It's a simple idea, but I seem to > >>> need seven different list-manipulation functions, including 3 uses of > > Map! > >>> > >>> DeleteRepetitions[X_] := > >>> Take[X, #] & /@ > >>> Sort[First /@ > >>> (Position[X, #] & /@ > >>> Union[X])] // Flatten > >>> > >>> For example, > >>> > >>> In[2] := DeleteRepetitions[{3,1,2,3,3,2,4,1}] > >>> > >>> Out[2] = {3,1,2,4} > >>> > >>> In[3] := DeleteRepetitions[{b,a,b,a,c,a}] > >>> > >>> Out[3] = {b,a,c} > >>> > >>> I don't need to use this function on lists longer that 20 or so elements, > >>> so speed is not a critical concern. > >>> > >>> Also, my version happens to work on expressions with heads other than > > List > >>> (because Take, Position, Union, and Flatten all do so), but I don't > > really > >>> need that feature. > >>> > >>> How would you implement this function? > >>> > >> > >> DeleteRepetitions1[x_List] := > >> > >> Take[x, #] & /@ > >> > >> Sort[First /@ > >> (Position[x, #] & /@ > >> Union[x])] // > >> Flatten > >> > >> DeleteRepetitions2[x_List] := Module[{uniq = {}}, > >> If[Not[MemberQ[uniq, #]], (uniq = Append[uniq, #])] & /@ x; > >> uniq] > >> > >> DeleteRepetitions3[x_List] := > >> Transpose[ > >> Union[Transpose[Join[{Range[Length[x]]}, {x}]], > >> SameTest -> (#1[[2]] == #2[[2]] &)]][[2]] > >> > >> DeleteRepetitions4[x_List] := x //. {a___, b_, c___, b_, d___} -> {a, b, > > c, d} > >> > >> DeleteRepetitions5[x_List] := > >> Module[{uniq = Union[x], n, portion}, > >> n = Length[uniq]; While[(Union[portion = Take[x, n]]) != uniq, n++]; > >> portion //. {a___, b_, c___, b_, d___} -> {a, b, c, d}] > >> > >> DeleteRepetitions6[x_List] := > >> Module[{uniq = Union[x], n, portion}, > >> n = Length[uniq]; While[(Union[portion = Take[x, n]]) != uniq, n++]; > >> Take[portion, #] & /@ > >> Sort[First /@ > >> (Position[portion, #] & /@ > >> uniq)] // Flatten] > >> > >> testList = Table[Random[Integer, {1, 10}], {400}]; > >> > >> funcs = ToExpression[Names["DeleteRep*"]]; > >> > >> Demonstrating the equivalence of the functions > >> > >> Equal[Sequence @@ (#[testList] & /@ funcs)] > >> > >> True > >> > >> Comparing their timings > >> > >> Timing[#[testList]][[1]] & /@ funcs > >> > >> {0.01666666666665151*Second, 0.049999999999954525*Second, > >> 0.10000000000002274*Second, 2.2999999999999545*Second, > >> 0.016666666666765195*Second, 0.*Second} > >> > >> The fourth method is very slow for long lists. The fifth method speeds > > this > >> up provided that the list is highly redundant. The sixth method applies > > this > >> potential speed up to the first (original) method. > >> > >> Bob > >> > >> BobHanlon at aol.com > >> > > > > > > >

**Re: PLEASE HELP!! Need to change symbols in LinearLogListPlot**

**Re: Levenberg-Marquardt ?**

**Re: Re: programming DeleteRepetitions**

**Re: Re: programming DeleteRepetitions**