Re: easiest way to sort a list?
- To: mathgroup at smc.vnet.net
- Subject: [mg18344] Re: [mg18308] easiest way to sort a list?
- From: "Carl K.Woll" <carlw at fermi.phys.washington.edu>
- Date: Wed, 30 Jun 1999 14:13:17 -0400
- Organization: Department of Physics
- References: <199906260224.WAA14461@smc.vnet.net.>
- Sender: owner-wri-mathgroup at wolfram.com
Hi all,
There have been a number of proposed solutions to the problem of removing
duplicates without changing the order of elements. I thought I would provide a
brief summary and comparison. In the following list of solutions, I have added
another version, which I mentioned in my post but didn't include.
Here are the proposed solutions (and authors):
Carl Woll:
OrderedUnion[1][li_]:=Block[{i},
i[n_]:=(i[n]=Sequence[];n);
i /@ li]
Robert Villegas (courtesy of Ted Ersek)
RestoreOrder[subset_, li_] :=
Last /@ Sort[ {Position[li, #, {1}, 1][[1,1]], #}& /@ subset ] ;
OrderedUnion[2][li_List] := RestoreOrder[Union[li], li]
Arnold Knopfmacher
OrderedUnion[3][li_]:=
Fold[Flatten[If[!MemberQ[#1, #2], Append[#1, #2], #1]] &, {}, li]
Carl Woll
OrderedUnion[4][li_]:=Module[{tmp},
tmp=Transpose[{li,Range[Length[li]]}];
First[
Transpose[
Sort[
Union[tmp,SameTest->(#1[[1]]===#2[[1]]&)],
#1[[2]]<#2[[2]]&]
]
]
]
Andrzej Kozlowski
OrderedUnion[5][l_]:=
Sort[Union[l],Position[l,#1,{1}][[1,1]]<Position[l,#2,{1}][[1,1]]&]
Several people
OrderedUnion[6][li_]:=li//.{a___,b_,c___,b_,d___}->{a,b,c,d}
Now for the results.
In[289]:=
li=Table[Random[Integer,100],{200}];
Do[t[i]=OrderedUnion[i][li]//Timing,{i,6}]
SameQ@@Last/@t/@Range[6]
First/@t/@Range[6]
Out[290]=
True
Out[291]=
{0.11 Second,0.16 Second,0.17 Second,0.27 Second,1.71 Second,63.32 Second}
We see that functions 5 and 6 are clearly not the most efficient. The first 4
functions are roughly equivalent.
If we increase the number of elements by an order of magnitude we get
In[266]:=
li=Table[Random[Integer,1000],{2000}];
Do[t[i]=OrderedUnion[i][li]//Timing,{i,4}]
SameQ@@Last/@t/@Range[4]
First/@t/@Range[4]
Out[267]=
True
Out[268]=
{0.98 Second,6.38 Second,11.86 Second,3.63 Second}
Now we see that function 1 is the most efficient. Increasing the number of
elements by another order of magnitude, we get
In[272]:=
li=Table[Random[Integer,10000],{20000}];
Do[t[i]=OrderedUnion[i][li]//Timing,{i,4}]
SameQ@@Last/@t/@Range[4]
First/@t/@Range[4]
Out[273]=
True
Out[274]=
{11.54 Second,581.93 Second,1097.52 Second,45.59 Second}
Now, functions 1 and 4 seem to be scaling as O(n log n), where as functions 2
and 3 are scaling as O(n^2). I stated before that function 1 scales as O(n),
but I believe that it really scales as O(n log n). The cost of creating the i
function and accessing its information probably scales as O(n log n). At any
rate, OrderedUnion[1] seems to be the most efficient method.
One comment on function 3, which uses Append. In general, repeatedly appending
to a list should be avoided, as in Mathematica it is a very slow operation.
Instead, the preferred method is to create a new list as in
list = {list, new}
and after all the elements have been added, then flatten the list. However,
this doesn't seem to help here.
Peter T. Wang wrote:
> Hi everyone,
>
> Here's my question: Suppose you have a list of integers, not all distinct,
> say
> {1, 5, 3, 5, 10, 10, 1},
>
> and you want to obtain the list
>
> {1, 5, 3, 10}
>
> which is the set of distinct elements but with the order preserved; i.e.,
> the order in which the elements in the second list appear is the same as
> the order in which it first occurs in the first list. What is a simple
> way to do this? Evidently Union[] sorts the elements in canonical order,
> which is not desirable. My current solution is so messy that I suspect
> there must be an easier way.
>
> -peter
- Follow-Ups:
- Re: Re: easiest way to sort a list?
- From: "Wolf, Hartmut" <hwolf@debis.com>
- Re: Re: easiest way to sort a list?