Re: easiest way to sort a list?

• To: mathgroup at smc.vnet.net
• Subject: [mg18344] Re: [mg18308] easiest way to sort a list?
• From: "Carl K.Woll" <carlw at fermi.phys.washington.edu>
• Date: Wed, 30 Jun 1999 14:13:17 -0400
• Organization: Department of Physics
• References: <199906260224.WAA14461@smc.vnet.net.>
• Sender: owner-wri-mathgroup at wolfram.com

Hi all,

There have been a number of proposed solutions to the problem of removing
duplicates without changing the order of elements. I thought I would provide a
brief summary and comparison. In the following list of solutions, I have added
another version, which I mentioned in my post but didn't include.

Here are the proposed solutions (and authors):

Carl Woll:
OrderedUnion[1][li_]:=Block[{i},
i[n_]:=(i[n]=Sequence[];n);
i /@ li]

Robert Villegas (courtesy of Ted Ersek)
RestoreOrder[subset_, li_] :=
Last /@ Sort[ {Position[li, #, {1}, 1][[1,1]], #}& /@ subset ] ;
OrderedUnion[2][li_List] := RestoreOrder[Union[li], li]

Arnold Knopfmacher
OrderedUnion[3][li_]:=
Fold[Flatten[If[!MemberQ[#1, #2], Append[#1, #2], #1]] &, {}, li]

Carl Woll
OrderedUnion[4][li_]:=Module[{tmp},
tmp=Transpose[{li,Range[Length[li]]}];
First[
Transpose[
Sort[
Union[tmp,SameTest->(#1[[1]]===#2[[1]]&)],
#1[[2]]<#2[[2]]&]
]
]
]

Andrzej Kozlowski
OrderedUnion[5][l_]:=
Sort[Union[l],Position[l,#1,{1}][[1,1]]<Position[l,#2,{1}][[1,1]]&]

Several people
OrderedUnion[6][li_]:=li//.{a___,b_,c___,b_,d___}->{a,b,c,d}

Now for the results.

In[289]:=
li=Table[Random[Integer,100],{200}];
Do[t[i]=OrderedUnion[i][li]//Timing,{i,6}]
SameQ@@Last/@t/@Range[6]
First/@t/@Range[6]
Out[290]=
True
Out[291]=
{0.11 Second,0.16 Second,0.17 Second,0.27 Second,1.71 Second,63.32 Second}

We see that functions 5 and 6 are clearly not the most efficient. The first 4
functions are roughly equivalent.

If we increase the number of elements by an order of magnitude we get

In[266]:=
li=Table[Random[Integer,1000],{2000}];
Do[t[i]=OrderedUnion[i][li]//Timing,{i,4}]
SameQ@@Last/@t/@Range[4]
First/@t/@Range[4]
Out[267]=
True
Out[268]=
{0.98 Second,6.38 Second,11.86 Second,3.63 Second}

Now we see that function 1 is the most efficient. Increasing the number of
elements by another order of magnitude, we get

In[272]:=
li=Table[Random[Integer,10000],{20000}];
Do[t[i]=OrderedUnion[i][li]//Timing,{i,4}]
SameQ@@Last/@t/@Range[4]
First/@t/@Range[4]
Out[273]=
True
Out[274]=
{11.54 Second,581.93 Second,1097.52 Second,45.59 Second}

Now, functions 1 and 4 seem to be scaling as O(n log n), where as functions 2
and 3 are scaling as O(n^2). I stated before that function 1 scales as O(n),
but I believe that it really scales as O(n log n). The cost of creating the i
function and accessing its information probably scales as O(n log n). At any
rate, OrderedUnion[1] seems to be the most efficient method.

One comment on function 3, which uses Append. In general, repeatedly appending
to a list should be avoided, as in Mathematica it is a very slow operation.
Instead, the preferred method is to create a new list as in

list = {list, new}

and after all the elements have been added, then flatten the list. However,
this doesn't seem to help here.

Peter T. Wang wrote:

> Hi everyone,
>
> Here's my question:  Suppose you have a list of integers, not all distinct,
> say
>       {1, 5, 3, 5, 10, 10, 1},
>
> and you want to obtain the list
>
>       {1, 5, 3, 10}
>
> which is the set of distinct elements but with the order preserved; i.e.,
> the order in which the elements in the second list appear is the same as
> the order in which it first occurs in the first list.  What is a simple
> way to do this?  Evidently Union[] sorts the elements in canonical order,
> which is not desirable.  My current solution is so messy that I suspect
> there must be an easier way.
>
> -peter

• Prev by Date: Re: TeXForm with long expressions
• Next by Date: Re: RE: easiest way to sort a list?
• Previous by thread: Re: TeXForm with long expressions
• Next by thread: Re: Re: easiest way to sort a list?