MathGroup Archive 2003

RE: Is Sort stable?

  • To: mathgroup at smc.vnet.net
  • Subject: [mg39742] RE: [mg39725] Is Sort stable?
  • From: "Wolf, Hartmut" <Hartmut.Wolf at t-systems.com>
  • Date: Wed, 5 Mar 2003 00:04:53 -0500 (EST)
  • Sender: owner-wri-mathgroup at wolfram.com

>-----Original Message-----
>From: Roland Nilsson [mailto:rolle at ifm.liu.se]
>Sent: Tuesday, March 04, 2003 5:50 AM
>To: mathgroup at smc.vnet.net
>Subject: [mg39742] [mg39725] Is Sort stable?
>
>
>Hi,
>
>--Short form:
>Is the sorting algorithm implemented by Sort[] stable?
>
>--Long form:
>I'm doing a thingy where I need to take out subsets of data and look at
>them individually. I have a labeled data set, with labels being a vector
>of 1, 2, ... denoting class membership, and I've tried to pick out
>subsets using either e.g.
>
>Extract[data, Position[labels,1]]
>=> the data points in "class 1"
>
>or using Sort to sort data according to labels (so I get e.g.
>{1,1,1,1, ... 2,2,2, ... 3,3,3}), which is nice for plotting data sets.
>This works OK, but it seems like Sort[] is rearranging the data points
>within classes differently from Extract[]. Does anyone know if the Sort[]
>algorithm is stable? Could there be something else lurking here?
>
>   Regards,
>
>--
>---------------------------------------------
>Roland Nilsson
>IFM Computational Biology
>Linköping University, Sweden
>rolle at ifm.liu.se
>
>
>

Roland,

Let's compare several forms of Sort in Mathematica. First make up some test
data:
 
In[1]:=
s = Table[{Random[Integer, {1, 3}], Random[Integer, {1, 10}]}, {30}]
Out[1]=
{{2, 10}, {2, 2}, {1, 8}, {2, 4}, {2, 8}, {1, 4}, {2, 5}, {1, 1}, {1, 1}, 
 {2, 6}, {2, 3}, {3, 3}, {1, 5}, {3, 5}, {1, 9}, {3, 5}, {3, 2}, {1, 1},
 {3, 4}, {1, 8}, {2, 1}, {3, 4}, {2, 9}, {3, 10}, {1, 10}, {2, 5}, {3, 1},
 {1, 2}, {2, 4}, {2, 10}}

Let the first component be the label, and the second component be the data
(or payload).



In[2]:= ss1 = Sort[s]
Out[2]=
{{1, 1}, {1, 1}, {1, 1}, {1, 2}, {1, 4}, {1, 5}, {1, 8}, {1, 8}, {1, 9},
 {1, 10}, {2, 1}, {2, 2}, {2, 3}, {2, 4}, {2, 4}, {2, 5}, {2, 5}, {2, 6},
 {2, 8}, {2, 9}, {2, 10}, {2, 10}, {3, 1}, {3, 2}, {3, 3}, {3, 4}, {3, 4},
 {3, 5}, {3, 5}, {3, 10}}

Labels and payload are both sorted into canonical order. Since whole
elements are compared, only identical elements tie, so the question of
stability doesn't arise.



Other forms of sort are possible. To compare them more easily, let's first
make a list of position ranges for the sorted labels:

In[3]:= ix = FoldList[Plus, 0, Length /@ Split @ Sort[s[[All, 1]]]];
In[4]:= rr = Transpose[{Drop[ix + 1, -1], Rest[ix]}]
Out[4]= {{1, 10}, {11, 22}, {23, 30}}

In[5]:= classes[sort_] := Flatten[Take[sort, #, {-1}]] & /@ rr

This function extracts the payload of each class from a sorted list.
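
To see what the offsets and ranges look like, here is a minimal sketch on a
made-up label vector. The names lab, len and off are hypothetical and not
part of the session above:

```mathematica
(* hypothetical toy labels, used only for illustration *)
lab = {3, 1, 3, 2, 1, 3};

(* run lengths of the sorted labels: {1, 1, 2, 3, 3, 3} -> {2, 1, 3} *)
len = Length /@ Split[Sort[lab]];

(* cumulative offsets: {0, 2, 3, 6} *)
off = FoldList[Plus, 0, len];

(* first and last position of each class: {{1, 2}, {3, 3}, {4, 6}} *)
Transpose[{Drop[off + 1, -1], Rest[off]}]
```

Take[sort, {1, 2}, {-1}] then picks rows 1 to 2 and only the last column,
and Flatten strips the extra braces.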



This is not the best, but a safe way to get a stable sort:

In[6]:=
ssstable = Join @@ (Cases[s, {#, _}] &) /@ Union[s[[All, 1]]]
Out[6]=
{{1, 8}, {1, 4}, {1, 1}, {1, 1}, {1, 5}, {1, 9}, {1, 1}, {1, 8}, {1, 10},
 {1, 2}, {2, 10}, {2, 2}, {2, 4}, {2, 8}, {2, 5}, {2, 6}, {2, 3}, {2, 1},
 {2, 9}, {2, 5}, {2, 4}, {2, 10}, {3, 3}, {3, 5}, {3, 5}, {3, 2}, {3, 4},
 {3, 4}, {3, 10}, {3, 1}}

In[7]:= stableclasses = classes[ssstable]
Out[7]=
{{8, 4, 1, 1, 5, 9, 1, 8, 10, 2},
 {10, 2, 4, 8, 5, 6, 3, 1, 9, 5, 4, 10},
 {3, 5, 5, 2, 4, 4, 10, 1}}

These are the stable classes. ss1 had them sorted as well (which you didn't
like):


In[8]:= classes[ss1]
Out[8]=
{{1, 1, 1, 2, 4, 5, 8, 8, 9, 10},
 {1, 2, 3, 4, 4, 5, 5, 6, 8, 9, 10, 10},
 {1, 2, 3, 4, 4, 5, 5, 10}}



In Mathematica we may also sort according to the labels only (the first
component here):

In[9]:= ss2 = Sort[s, First[#1] < First[#2] &];
In[10]:= classes[ss2]
Out[10]=
{{2, 10, 8, 1, 9, 5, 1, 1, 4, 8},
 {10, 4, 5, 9, 1, 3, 6, 5, 8, 4, 2, 10},
 {1, 10, 4, 4, 2, 5, 5, 3}}

In[11]:= % === stableclasses
Out[11]= False

But that isn't stable either.



Let's check the case when using the ordering of the labels only:

In[12]:= ss3 = s[[Ordering[s[[All, 1]] ] ]];
In[13]:= classes[ss3]
Out[13]=
{{8, 1, 1, 10, 4, 8, 5, 2, 1, 9},
 {10, 1, 6, 9, 4, 4, 3, 5, 2, 10, 5, 8},
 {3, 5, 5, 10, 4, 1, 2, 4}}

In[14]:= % === stableclasses
Out[14]= False

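In[15]:= %% === classes[ss2]
Out[15]= False

So Ordering isn't stable here either, and it arranges ties differently from
Sort with an ordering function.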
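
Instead of comparing class lists by eye, any sorting variant could be tested
for stability by replacing the payload with the original positions and
checking that they still ascend within each label. This is only a sketch;
stableByLabelQ is a made-up name, not a built-in, and sortfn stands for
whatever sorting function is to be tested:

```mathematica
(* stableByLabelQ: hypothetical helper. sortfn is any function that sorts
   a list of {label, payload} pairs by label. *)
stableByLabelQ[sortfn_, data_List] :=
  Module[{tagged, sorted},
    (* replace each payload by the element's original position *)
    tagged = Transpose[{data[[All, 1]], Range[Length[data]]}];
    sorted = sortfn[tagged];
    (* within every run of equal labels the positions must still ascend *)
    And @@ (OrderedQ[#[[All, 2]]] & /@
        Split[sorted, First[#1] === First[#2] &])]

(* plain Sort compares the label first, then the position: gives True *)
stableByLabelQ[Sort, {{2, x}, {1, y}, {2, z}}]
```

To test the variant with an ordering function, pass
Sort[#, First[#1] < First[#2] &] & as sortfn.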



How do we get a stable sort that also performs well?

One idea is to tag each element with its original position and use plain
Sort:

In[16]:=
s4 = Transpose[{s[[All, 1]], Range[Length[s]], s[[All, -1]]}];

In[17]:= ss4 = Sort[s4][[All, {1, -1}]]
Out[17]= 
{{1, 8}, {1, 4}, {1, 1}, {1, 1}, {1, 5}, {1, 9}, {1, 1}, {1, 8}, {1, 10},
 {1, 2}, {2, 10}, {2, 2}, {2, 4}, {2, 8}, {2, 5}, {2, 6}, {2, 3}, {2, 1},
 {2, 9}, {2, 5}, {2, 4}, {2, 10}, {3, 3}, {3, 5}, {3, 5}, {3, 2}, {3, 4},
 {3, 4}, {3, 10}, {3, 1}}

In[18]:= classes[ss4]
Out[18]=
{{8, 4, 1, 1, 5, 9, 1, 8, 10, 2},
 {10, 2, 4, 8, 5, 6, 3, 1, 9, 5, 4, 10},
 {3, 5, 5, 2, 4, 4, 10, 1}}

In[19]:= % === stableclasses
Out[19]= True
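
The tagging idea can be wrapped into a reusable function. A minimal sketch,
assuming the keys sort sensibly in canonical order (as the integer labels
here do); stableSortBy is a name made up for this illustration, not a
built-in:

```mathematica
(* stableSortBy: hypothetical helper. Sorts list by key[element] and keeps
   elements with equal keys in their original relative order. *)
stableSortBy[{}, _] := {}
stableSortBy[list_List, key_] :=
  Sort[MapIndexed[{key[#1], First[#2], #1} &, list]][[All, -1]]

(* usage: sort by the label only *)
stableSortBy[{{2, "a"}, {1, "b"}, {2, "c"}, {1, "d"}}, First]
(* {{1, "b"}, {1, "d"}, {2, "a"}, {2, "c"}} *)
```

Because the positions are unique, Sort never gets as far as comparing the
elements themselves, so ties on the key are always broken by position.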


--
Hartmut Wolf


