Re: Re: RE: Re: Re: easiest way to sort a list?

• To: mathgroup at smc.vnet.net
• Subject: [mg18589] Re: [mg18556] Re: [mg18525] RE: [mg18428] Re: [mg18344] Re: [mg18308] easiest way to sort a list?
• From: "Carl K.Woll" <carlw at fermi.phys.washington.edu>
• Date: Tue, 13 Jul 1999 01:01:24 -0400
• Organization: Department of Physics
• References: <199907100618.CAA02996@smc.vnet.net.>
• Sender: owner-wri-mathgroup at wolfram.com

```Andrzej,

Your results are very different from those I get with Mathematica V3.0 on a 100 MHz PC
with Windows 95. For your l2 list, I got

{{1.93 Second, Null}, {6.04 Second, Null},
{43.12 Second, Null}, {5.65 Second, Null}}

If I try using the following list

Table[Random[Integer, {1, 5000}], {10000}];

I get the following timings

{{7.47 Second, Null}, {14.4 Second, Null},
{177.24 Second, Null}, {6.98 Second, Null}}

So, for some reason OrderedUnion[4] is much slower on your machine than it is on mine.
That is, for the list l2, OrderedUnion[4] is 66 times slower than OrderedUnion[1] on your
machine, but it is only 3 times slower on my machine. This is odd, since OrderedUnion[4]
uses Sort and Union, and I thought Colin Rose suggested that Sort and Union are much
faster in Version 4.0 than in Version 3.0. Hence, I would have thought that
OrderedUnion[4] would have worked even better on your machine.

Carl Woll
Dept of Physics
U of Washington

Andrzej Kozlowski wrote:

> Here are some more results on my PowerBook G3 (233 mghz, Mathematica 4.0):
>
> In[1]:=
> OrderedUnion[1][li_] :=
>   Block[{i, Sequence},
>                   i[n_] := (i[n] = Sequence[]; n);
>                   i /@ li]
>
> OrderedUnion[2][li_] :=
>   Block[{i, counter = 0, l0 = {}, m = Length[Union[li]], Sequence},
>     i[n_] := (i[n] = Sequence[]; counter = counter + 1; n);
>    Scan[If[counter < m, l0 = {l0, i[#]}, Return[Flatten[l0]]] &, li];
>     Flatten[l0]]
>
> OrderedUnion[3][h_[args__]] :=
>   Fold[ If[ FreeQ[#1, #2, 1], Join[#1, h[#2]], #1] & , h[], {args}]
>
> OrderedUnion[4][li_] :=
>   Module[{tmp = Transpose[{li, Range[Length[li]]}]},
>     tmp = Union[tmp, SameTest -> (#1[[1]] === #2[[1]] &)];
>     Last /@ Sort[RotateLeft /@ tmp]]
>
> First a long list with a small number of distinnct elements:
>
> In[5]:=
> l1 = Table[Random[Integer, {1, 10}], {10000}];
>
> In[6]:=
> Table[OrderedUnion[i][l1]; // Timing, {i, 1, 4}]
> Out[6]=
> {{0.183333 Second, Null}, {0.0333333 Second, Null},
>
>   {0.75 Second, Null}, {0.533333 Second, Null}}
>
> Next a long list with quite many distinct elements:
>
> l2 = Table[Random[Integer, {1, 1000}], {10000}];
>
> In[8]:=
> Table[OrderedUnion[i][l2]; // Timing, {i, 1, 4}]
> Out[8]=
> {{0.3 Second, Null}, {1.05 Second, Null},
>
>   {8.48333 Second, Null}, {22.0333 Second, Null}}
>
> --
> Andrzej Kozlowski
> Toyama International University
> JAPAN
> http://sigma.tuins.ac.jp
> http://eri2.tuins.ac.jp
>
> ----------
> >From: "Ersek, Ted R" <ErsekTR at navair.navy.mil>
> To: mathgroup at smc.vnet.net
> >To: mathgroup at smc.vnet.net
> >Subject: [mg18589] [mg18556] [mg18525] RE: [mg18428] Re: [mg18344] Re: [mg18308] easiest way to
> sort a list?
> >Date: Fri, Jul 9, 1999, 11:32 AM
> >
>
> > I have a new approach to add to this long thread.
> >
> > In[1]:=
> > OrderedUnion5[h_[args__]]:=
> >   Fold[ If[ FreeQ[#1,#2,1], Join[#1,h[#2]], #1]& ,h[],{args}]
> >
> > In[2]:=
> > OrderedUnion5[{5,3,5,3,1,2,2,3,4,5,3}]
> >
> > Out[2]=
> > {5,3,1,2,4}
> >
> >
> > I would like to see how the timings of different methods compare (version
> > above included).  I would do it, but I am working under Windows 98, and I
> > understand Microsoft made it impossible for Timing to give accurate results.
> >
> > Regards,
> > Ted Ersek
> >
> > --------------------
> > Hi Hartmut (and newsgroup),
> >
> > Thanks for unraveling the linear nature of the Block[{i... function. Nice
> > explanation!
> >
> > Since this topic shows no sign of slowing down, I thought I would present a
> > couple refined versions of my two functions. First, an improvement of the
> > "linear" solution:
> >
> > OrderedUnion[li_]:=Block[{i,Sequence},
> >     i[n_]:=(i[n]=Sequence[];n);
> >     i/@li]
> >
> > By including Sequence in the Block variables, Mathematica doesn't waste time
> > eliminating Sequences until after the Block is exited. This seems to be a
> > 5-10% or so improvement in speed.
> >
> > Secondly, an improvement of my O(n log n) solution:
> >
> > OrderedUnion[3][li_]:=Module[{tmp=Transpose[{li,Range[Length[li]]}]},
> >   tmp=Union[tmp, SameTest->(#1[[1]]===#2[[1]]&)];
> >   Last/@Sort[RotateLeft/@tmp]]
> >
> > The key idea here is that Sort with the default ordering function is much
> > faster than Sort with a user supplied ordering function. So, RotateLeft
> > switches the order of the index and the element to allow Sort to be used
> > with its default ordering function. For lists with a large number of
> > distinct elements, this seems to improve the speed by a factor of 3.
> >
> > Carl Woll
> > Dept of Physics
> > U of Washington
> >
> > Wolf, Hartmut wrote:
> >
> >> Hello everone,
> >>
> >> I just sent a letter to Carl K.Woll, part of which I would like to
> >> publish to the community:
> >>
> >> Hello Carl,
> >>
> >> congratulations for your brilliant solution of "easiest way to sort a
> >> list?" At the first glance, ... I was already struck.
> >>
> >> According to my measurements, you proposed not only the fastest
> >> solution, but also the second fastest, the fastest O[n log n] version.
> >> ....
> >>
> >> The question that concerned me was, is your solution really O[n]? My
> >> preoccupation before was that O[n log n] could be the best achievable.
> >> Clearly your solution is O[n] for 'smaller' samples, but the question
> >> was, is it still O[n] for very large samples?  So I did some
> >> measurements, which I'd like to show you. They are in the two attached
> >> notebooks (...not for you, unless you request it from
> >> mailto:hwolf at debis.com )
> >>
> >> What is the interpretation of the graphs? In fact your solution is as
> >> far as I could calculate seemingly linear overall. There are interesting
> >> wiggles in it. The algorithm seems to become increasingly worse, when it
> >> makes up it's mind and "starts anew". Well I think this are the traces
> >> of Mathematica's Extendible Hashing algorithm for the symbol store. When
> >> hashing becomes more and more inefficient (due to collisions) then the
> >> store ist extended. (This conforms to my observation of memory usage
> >> with Windows NT Task Manager during the calculation, where I say periods
> >> of rapid increase of used memory and times of slower grow.) So it's this
> >> extension, which keeps your algorithm linear.
> >>
> >> ....
> >>
> >> With kind regards, your
> >>         Hartmut Wolf
> >

```

• Prev by Date: Re: Re: Simplifying constants...bug?
• Next by Date: POLE-ZERO PLOTS
• Previous by thread: Re: RE: Re: Re: easiest way to sort a list?
• Next by thread: Re: Re: RE: Re: Re: easiest way to sort a list?