       Re: easiest way to sort a list?

• To: mathgroup at smc.vnet.net
• Subject: [mg18344] Re: [mg18308] easiest way to sort a list?
• From: "Carl K.Woll" <carlw at fermi.phys.washington.edu>
• Date: Wed, 30 Jun 1999 14:13:17 -0400
• Organization: Department of Physics
• References: <199906260224.WAA14461@smc.vnet.net.>
• Sender: owner-wri-mathgroup at wolfram.com

```Hi all,

There have been a number of proposed solutions to the problem of removing
duplicates without changing the order of elements. I thought I would provide a
brief summary and comparison. In the following list of solutions, I have added
another version, which I mentioned in my post but didn't include.

Here are the proposed solutions (and authors):

Carl Woll:
OrderedUnion[li_]:=Block[{i},
i[n_]:=(i[n]=Sequence[];n);
i /@ li]

Robert Villegas (courtesy of Ted Ersek)
RestoreOrder[subset_, li_] :=
Last /@ Sort[ {Position[li, #, {1}, 1][[1,1]], #}& /@ subset ] ;
OrderedUnion[li_List] := RestoreOrder[Union[li], li]

Arnold Knopfmacher
OrderedUnion[li_]:=
Fold[Flatten[If[!MemberQ[#1, #2], Append[#1, #2], #1]] &, {}, li]

Carl Woll
OrderedUnion[li_]:=Module[{tmp},
tmp=Transpose[{li,Range[Length[li]]}];
First[
Transpose[
Sort[
Union[tmp,SameTest->(#1[]===#2[]&)],
#1[]<#2[]&]
]
]
]

Andrzej Kozlowski
OrderedUnion[l_]:=
Sort[Union[l],Position[l,#1,{1}][[1,1]]<Position[l,#2,{1}][[1,1]]&]

Several people
OrderedUnion[li_]:=li//.{a___,b_,c___,b_,d___}->{a,b,c,d}

Now for the results.

In:=
li=Table[Random[Integer,100],{200}];
Do[t[i]=OrderedUnion[i][li]//Timing,{i,6}]
SameQ@@Last/@t/@Range
First/@t/@Range
Out=
True
Out=
{0.11 Second,0.16 Second,0.17 Second,0.27 Second,1.71 Second,63.32 Second}

We see that functions 5 and 6 are clearly not the most efficient. The first 4
functions are roughly equivalent.

If we increase the number of elements by an order of magnitude we get

In:=
li=Table[Random[Integer,1000],{2000}];
Do[t[i]=OrderedUnion[i][li]//Timing,{i,4}]
SameQ@@Last/@t/@Range
First/@t/@Range
Out=
True
Out=
{0.98 Second,6.38 Second,11.86 Second,3.63 Second}

Now we see that function 1 is the most efficient. Increasing the number of
elements by another order of magnitude, we get

In:=
li=Table[Random[Integer,10000],{20000}];
Do[t[i]=OrderedUnion[i][li]//Timing,{i,4}]
SameQ@@Last/@t/@Range
First/@t/@Range
Out=
True
Out=
{11.54 Second,581.93 Second,1097.52 Second,45.59 Second}

Now, functions 1 and 4 seem to be scaling as O(n log n), where as functions 2
and 3 are scaling as O(n^2). I stated before that function 1 scales as O(n),
but I believe that it really scales as O(n log n). The cost of creating the i
function and accessing its information probably scales as O(n log n). At any
rate, OrderedUnion seems to be the most efficient method.

One comment on function 3, which uses Append. In general, repeatedly appending
to a list should be avoided, as in Mathematica it is a very slow operation.
Instead, the preferred method is to create a new list as in

list = {list, new}

and after all the elements have been added, then flatten the list. However,
this doesn't seem to help here.

Peter T. Wang wrote:

> Hi everyone,
>
> Here's my question:  Suppose you have a list of integers, not all distinct,
> say
>       {1, 5, 3, 5, 10, 10, 1},
>
> and you want to obtain the list
>
>       {1, 5, 3, 10}
>
> which is the set of distinct elements but with the order preserved; i.e.,
> the order in which the elements in the second list appear is the same as
> the order in which it first occurs in the first list.  What is a simple
> way to do this?  Evidently Union[] sorts the elements in canonical order,
> which is not desirable.  My current solution is so messy that I suspect
> there must be an easier way.
>
> -peter

```

• Prev by Date: Re: TeXForm with long expressions
• Next by Date: Re: RE: easiest way to sort a list?
• Previous by thread: Re: TeXForm with long expressions
• Next by thread: Re: Re: easiest way to sort a list?