Re: easiest way to sort a list?
- To: mathgroup at smc.vnet.net
- Subject: [mg18344] Re: [mg18308] easiest way to sort a list?
- From: "Carl K.Woll" <carlw at fermi.phys.washington.edu>
- Date: Wed, 30 Jun 1999 14:13:17 -0400
- Organization: Department of Physics
- References: <199906260224.WAA14461@smc.vnet.net.>
- Sender: owner-wri-mathgroup at wolfram.com
Hi all, There have been a number of proposed solutions to the problem of removing duplicates without changing the order of elements. I thought I would provide a brief summary and comparison. In the following list of solutions, I have added another version, which I mentioned in my post but didn't include. Here are the proposed solutions (and authors): Carl Woll: OrderedUnion[1][li_]:=Block[{i}, i[n_]:=(i[n]=Sequence[];n); i /@ li] Robert Villegas (courtesy of Ted Ersek) RestoreOrder[subset_, li_] := Last /@ Sort[ {Position[li, #, {1}, 1][[1,1]], #}& /@ subset ] ; OrderedUnion[2][li_List] := RestoreOrder[Union[li], li] Arnold Knopfmacher OrderedUnion[3][li_]:= Fold[Flatten[If[!MemberQ[#1, #2], Append[#1, #2], #1]] &, {}, li] Carl Woll OrderedUnion[4][li_]:=Module[{tmp}, tmp=Transpose[{li,Range[Length[li]]}]; First[ Transpose[ Sort[ Union[tmp,SameTest->(#1[[1]]===#2[[1]]&)], #1[[2]]<#2[[2]]&] ] ] ] Andrzej Kozlowski OrderedUnion[5][l_]:= Sort[Union[l],Position[l,#1,{1}][[1,1]]<Position[l,#2,{1}][[1,1]]&] Several people OrderedUnion[6][li_]:=li//.{a___,b_,c___,b_,d___}->{a,b,c,d} Now for the results. In[289]:= li=Table[Random[Integer,100],{200}]; Do[t[i]=OrderedUnion[i][li]//Timing,{i,6}] SameQ@@Last/@t/@Range[6] First/@t/@Range[6] Out[290]= True Out[291]= {0.11 Second,0.16 Second,0.17 Second,0.27 Second,1.71 Second,63.32 Second} We see that functions 5 and 6 are clearly not the most efficient. The first 4 functions are roughly equivalent. If we increase the number of elements by an order of magnitude we get In[266]:= li=Table[Random[Integer,1000],{2000}]; Do[t[i]=OrderedUnion[i][li]//Timing,{i,4}] SameQ@@Last/@t/@Range[4] First/@t/@Range[4] Out[267]= True Out[268]= {0.98 Second,6.38 Second,11.86 Second,3.63 Second} Now we see that function 1 is the most efficient. Increasing the number of elements by another order of magnitude, we get In[272]:= li=Table[Random[Integer,10000],{20000}]; Do[t[i]=OrderedUnion[i][li]//Timing,{i,4}] SameQ@@Last/@t/@Range[4] First/@t/@Range[4] Out[273]= True Out[274]= {11.54 Second,581.93 Second,1097.52 Second,45.59 Second} Now, functions 1 and 4 seem to be scaling as O(n log n), where as functions 2 and 3 are scaling as O(n^2). I stated before that function 1 scales as O(n), but I believe that it really scales as O(n log n). The cost of creating the i function and accessing its information probably scales as O(n log n). At any rate, OrderedUnion[1] seems to be the most efficient method. One comment on function 3, which uses Append. In general, repeatedly appending to a list should be avoided, as in Mathematica it is a very slow operation. Instead, the preferred method is to create a new list as in list = {list, new} and after all the elements have been added, then flatten the list. However, this doesn't seem to help here. Peter T. Wang wrote: > Hi everyone, > > Here's my question: Suppose you have a list of integers, not all distinct, > say > {1, 5, 3, 5, 10, 10, 1}, > > and you want to obtain the list > > {1, 5, 3, 10} > > which is the set of distinct elements but with the order preserved; i.e., > the order in which the elements in the second list appear is the same as > the order in which it first occurs in the first list. What is a simple > way to do this? Evidently Union[] sorts the elements in canonical order, > which is not desirable. My current solution is so messy that I suspect > there must be an easier way. > > -peter
- Follow-Ups:
- Re: Re: easiest way to sort a list?
- From: "Wolf, Hartmut" <hwolf@debis.com>
- Re: Re: easiest way to sort a list?