[Date Index]
[Thread Index]
[Author Index]
RE: Is Sort stable?
*To*: mathgroup at smc.vnet.net
*Subject*: [mg39742] RE: [mg39725] Is Sort stable?
*From*: "Wolf, Hartmut" <Hartmut.Wolf at t-systems.com>
*Date*: Wed, 5 Mar 2003 00:04:53 -0500 (EST)
*Sender*: owner-wri-mathgroup at wolfram.com
>-----Original Message-----
>From: Roland Nilsson [mailto:rolle at ifm.liu.se]
To: mathgroup at smc.vnet.net
>Sent: Tuesday, March 04, 2003 5:50 AM
>To: mathgroup at smc.vnet.net
>Subject: [mg39742] [mg39725] Is Sort stable?
>
>
>Hi,
>
>--Short form:
>Is the sorting algorithm implemented by Sort[] stable?
>
>--Long form:
>I'm doing a thingy where I need to take out subsets of data
>and look at them
>individually. I have a labeled data set, with labels being a
>vector of 1,2
>... denoting class membership, and I've tried to pick out subset using
>either e.g.
>
>Extract[data, Position[labels,1]]
>=> the data points in "class 1"
>
>or using Sort to sort data according to labels (so I get e.g.
>{1,1,1,1, ...
>2,2,2, ... 3,3,3}, and which is nice for plotting data sets.
>This work ok,
>but it seems like Sort[] is rearranging the data points within classes
>different from Extract[]. Does anyone know i Is the Sort[]
>algorithm stable?
>Could there be something else lurking here?
>
> Regards,
>
>--
>---------------------------------------------
>Roland Nilsson
>IFM Computational Biology
>Linköping University, Sweden
>rolle at ifm.liu.se
>
>
>
Roland,
Let's compare several forms of Sort in Mathematica. First make up some test
data:
In[1]:=
s = Table[{Random[Integer, {1, 3}], Random[Integer, {1, 10}]}, {30}]
Out[1]=
{{2, 10}, {2, 2}, {1, 8}, {2, 4}, {2, 8}, {1, 4}, {2, 5}, {1, 1}, {1, 1},
{2, 6}, {2, 3}, {3, 3}, {1, 5}, {3, 5}, {1, 9}, {3, 5}, {3, 2}, {1, 1},
{3, 4}, {1, 8}, {2, 1}, {3, 4}, {2, 9}, {3, 10}, {1, 10}, {2, 5}, {3, 1},
{1, 2}, {2, 4}, {2, 10}}
Let the first component be the label, and the second component be the data
(or payload).
In[2]:= ss1 = Sort[s]
Out[2]=
{{1, 1}, {1, 1}, {1, 1}, {1, 2}, {1, 4}, {1, 5}, {1, 8}, {1, 8}, {1, 9},
{1, 10}, {2, 1}, {2, 2}, {2, 3}, {2, 4}, {2, 4}, {2, 5}, {2, 5}, {2, 6},
{2, 8}, {2, 9}, {2, 10}, {2, 10}, {3, 1}, {3, 2}, {3, 3}, {3, 4}, {3, 4},
{3, 5}, {3, 5}, {3, 10}}
Labels and payload are sorted into canonical order. The question of
stability doesn't arise.
Other forms of sort are possible, to better compare them, let's first make a
list of ranges for sorted labels:
In[3]:= ix = FoldList[Plus, 0, Length /@ Split @ Sort[s[[All, 1]]]];
In[4]:= rr = Transpose[{Drop[ix + 1, -1], Rest[ix]}]
Out[4]= {{1, 10}, {11, 22}, {23, 30}}
In[5]:= classes[sort_] := Flatten[Take[sort, #, {-1}]] & /@ rr
A function to get the classes.
This not the best, but a secure way to get a stable sort:
In[6]:=
ssstable = Join @@ (Cases[s, {#, _}] &) /@ Union[s[[All, 1]]]
Out[6]=
{{1, 8}, {1, 4}, {1, 1}, {1, 1}, {1, 5}, {1, 9}, {1, 1}, {1, 8}, {1, 10},
{1, 2}, {2, 10}, {2, 2}, {2, 4}, {2, 8}, {2, 5}, {2, 6}, {2, 3}, {2, 1},
{2, 9}, {2, 5}, {2, 4}, {2, 10}, {3, 3}, {3, 5}, {3, 5}, {3, 2}, {3, 4},
{3, 4}, {3, 10}, {3, 1}}
In[7]:= stableclasses = classes[ssstable]
Out[7]=
{{8, 4, 1, 1, 5, 9, 1, 8, 10, 2},
{10, 2, 4, 8, 5, 6, 3, 1, 9, 5, 4, 10},
{3, 5, 5, 2, 4, 4, 10, 1}}
These are the stable classes, ss1 had them sorted too (which you didn't
like).
In[8]:= classes[ss1]
Out[8]=
{{1, 1, 1, 2, 4, 5, 8, 8, 9, 10},
{1, 2, 3, 4, 4, 5, 5, 6, 8, 9, 10, 10},
{1, 2, 3, 4, 4, 5, 5, 10}}
In Mathematica we may sort only according to the labels (the first component
here)
In[9]:= ss2 = Sort[s, First[#1] < First[#2] &];
In[10]:= classes[ss2]
Out[10]=
{{2, 10, 8, 1, 9, 5, 1, 1, 4, 8},
{10, 4, 5, 9, 1, 3, 6, 5, 8, 4, 2, 10},
{1, 10, 4, 4, 2, 5, 5, 3}}
In[11]:= % === stableclasses
Out[11]= False
But that also isn't stable.
Let's check the case when using the ordering of the labels only
In[12]:= ss3 = s[[Ordering[s[[All, 1]] ] ]];
In[13]:= classes[ss3]
Out[13]=
{{8, 1, 1, 10, 4, 8, 5, 2, 1, 9},
{10, 1, 6, 9, 4, 4, 3, 5, 2, 10, 5, 8},
{3, 5, 5, 10, 4, 1, 2, 4}}
In[14]:= % === stableclasses
Out[14]= False
In[15]:= %% === classes[ss2]
Out[15]= False
How do we get a performant stable sort?
An idea is to tag the payload with their ordering and use simple Sort:
In[16]:=
s4 = Transpose[{s[[All, 1]], Range[Length[s]], s[[All, -1]]}];
In[17]:= ss4 = Sort[s4][[All, {1, -1}]]
Out[17]=
{{1, 8}, {1, 4}, {1, 1}, {1, 1}, {1, 5}, {1, 9}, {1, 1}, {1, 8}, {1, 10},
{1, 2}, {2, 10}, {2, 2}, {2, 4}, {2, 8}, {2, 5}, {2, 6}, {2, 3}, {2, 1},
{2, 9}, {2, 5}, {2, 4}, {2, 10}, {3, 3}, {3, 5}, {3, 5}, {3, 2}, {3, 4},
{3, 4}, {3, 10}, {3, 1}}
In[18]:= classes[ss4]
Out[18]=
{{8, 4, 1, 1, 5, 9, 1, 8, 10, 2},
{10, 2, 4, 8, 5, 6, 3, 1, 9, 5, 4, 10},
{3, 5, 5, 2, 4, 4, 10, 1}}
In[19]:= % === stableclasses
Out[19]= True
--
Hartmut Wolf
Prev by Date:
** webMathematica trouble - solution**
Next by Date:
**newbie graph question**
Previous by thread:
**Re: Is Sort stable?**
Next by thread:
**RE: Re: Is Sort stable?**
| |