MathGroup Archive 1999

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: easiest way to sort a list?

  • To: mathgroup at smc.vnet.net
  • Subject: [mg18361] Re: [mg18308] easiest way to sort a list?
  • From: "Wolf, Hartmut" <hwolf at debis.com>
  • Date: Wed, 30 Jun 1999 14:13:27 -0400
  • Organization: debis Systemhaus
  • References: <199906260224.WAA14461@smc.vnet.net.>
  • Sender: owner-wri-mathgroup at wolfram.com

Hello Peter

Peter T. Wang schrieb:
> 
> Hi everyone,
> 
> Here's my question:  Suppose you have a list of integers, not all distinct,
> say
>       {1, 5, 3, 5, 10, 10, 1},
> 
> and you want to obtain the list
> 
>       {1, 5, 3, 10}
> 
> which is the set of distinct elements but with the order preserved; i.e.,
> the order in which the elements in the second list appear is the same as
> the order in which it first occurs in the first list.  What is a simple
> way to do this?  Evidently Union[] sorts the elements in canonical order,
> which is not desirable.  My current solution is so messy that I suspect
> there must be an easier way.
> 

To my mind came different algorithms

  unify[l_]:=Extract[l,Sort[First/@( Position[l,#]&/@Union[l])]]

For each element get the list of all positions, where that element
occurs, and then extract it in order according to its first occurrence. 

  unify3[l_]:=
    With[{f=Function[If[MemberQ[#1,#2],#1,Append[#1,#2]]]},
Fold[f,{},l]]

The helper function f accumulates the current element to the output only
if it hasn't yet occurred.

  unify2[l_]:= 
    Transpose[{l,Range[Length[l]]}]//
      Union[#,SameTest->(First[#1]===First[#2]&)]&//
      Sort[#,Last[#1]<Last[#2]&]&//Transpose//First

Every element ist tagged by its position in the list, then unified
(without regard of the tag), which keeps its first occurrence. This then
is brought into original order by the tags, which are stripped of
thereafter.

  unify4[l_]:=
  Module[{f,u},With[{d=Length[l]+1},MapIndexed[(f[#1]=d-#2 )&,
Reverse[l]] ];
    u=Extract[#,{{1,1,1},{2}}]&/@DownValues[f];
    First/@ Sort[u,#1[[2,1]]<#2[[2,1]]&]]

This looks queer, but we simplly use Mathematica's symbol store to
associate an element to it's own position in the list. Travelling
backwards keeps the first occurance. Extracting from DownValues and
Sorting gives back original ordering.

All algorithms are good at short lists of several hundred elements, but
unify and unify3 become bad increasingly worse for longer lists. We may
gess for a time complexity of O[len^2] for unify and unify3, when len is
the length of the input. Best is unify2 which takes about half of the
time of unify4, which both might be O[len Log[len]]. Measurements for
lists of length 100 ... 8000, when we have about 1/5 of mutliple
occurrances seem to confirm that.

Interestingly

  unify3b[l_]:=
  With[{f=Function[If[MemberQ[#1,#2,Infinity],#1,{#1,#2}]]},
Fold[f,{},l]]//
    Flatten

when trying to avoid Append makes that function much worse, because of
the increasing effort to memberQ.

For timing I used lists like

  l8000=With[{m=8000},Table[Random[Integer,{1,0.75 m}],{ m}]];

For that one I got
  {84.131,48.149,5.769,9.684} Seconds for {unify, unify3, unify2,
unify4}

(unify3b took 106.773 seconds for that one)

regards, hw



  • Prev by Date: Re: O.D.E in Power Series
  • Next by Date: Re: writing mathematica data to a tab delimited file??
  • Previous by thread: Re: easiest way to sort a list?
  • Next by thread: Re: easiest way to sort a list?