MathGroup Archive 2000

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Re: how to rank a list of elements?

  • To: mathgroup at smc.vnet.net
  • Subject: [mg23255] Re: [mg23198] Re: how to rank a list of elements?
  • From: Hartmut Wolf <hwolf at debis.com>
  • Date: Sat, 29 Apr 2000 22:04:59 -0400 (EDT)
  • Organization: debis Systemhaus
  • References: <8djk08$7dc@smc.vnet.net> <8dnuge$hrl@smc.vnet.net> <200004240512.BAA15232@smc.vnet.net>
  • Sender: owner-wri-mathgroup at wolfram.com

My answer below:

Wen-Feng Hsiao schrieb:
> 
> On Thu, 20 Apr 2000 08:51:23 +0200
> Hartmut Wolf <hwolf at debis.com> wrote:
> 
> > Dear Wen-Feng,
> >
> > Since you didn't state your problem but only a solution, it's hard to
> > guess. (It would have been helpful to have stated the "real world
> > problem"; often the mapping of that to the mathematical model -- or
> > object model for programming -- is the most decisive step to a
> > solution.)
> >
> > Anyways, you have to answer yourself the question: why not have
> >
> >       {4, 2, 3, 1} or
> >       {3, 2, 3, 1} or even
> >       {{3, 4}, 2, {3, 4}, 1} or
> >       {{3, 4}, 2, see[1], 1}
> >
> > as a result?? (It also depends on what you intend to do with it.)
> My question is quite simple, just "to rank a list of elements (scores)".
> This is a common situation when the obtained scores are not reliable to
> be an interval scale, the usual procedure is to transform them into an
> ordinal scale before conducting statistical analyses. For example, in an
> IQ
> test, we obtain the following scores:
> {107, 110, 110, 120, 125, 125, 125, 136}
> , which can be ranked as
> 
> {1., 2.5, 2.5, 4., 6., 6., 6., 8.}, if we use 'mean' to deal with ties;
> or
> {1., 2., 2., 4., 5., 5., 5., 8.}, if we use 'min' to deal with ties;
> or
> {1., 3., 3., 4., 7., 7., 7., 8.}, if we use 'max' to deal with ties.
> 
> By intuition, same scores should have same ranks. However, your
> illustrations are, though sophisticated, not consistent with this
> intuition. Allan Hayes's code is more likely as what I want. From his
> code I know I can use 'Rule' to get what I want: (Please correct me if
> anything needs to be improved.)
> 
> << Statistics`
> (* ties = -1, low; ties = 0, mean; ties = 1, high *)
> ranking[datalst_List, ties_:0] := Module[{ratings, posset, substitutes},
>       methods = {Min, Mean, Max};
>       posset =
>         Position[Sort[datalst], #] & /@ Sort[datalst] /. {lst_} -> lst;
>       ratings = methods[[ties + 2]][#] & /@ posset // N;
>       substitutes = Apply[Rule, Transpose[{Sort[datalst], ratings}],
> {1}];
>       Print[substitutes];
>       Return[datalst /. substitutes];
>       ];
> 
> In[74]:=
> ranking[dataset]
> Out[74]=
> {1., 2.5, 2.5, 4., 6., 6., 6., 8.}
> 
> In[75]:=
> ranking[dataset, -1]
> Out[75]=
> {1., 2., 2., 4., 5., 5., 5., 8.}
> 
> In[76]:=
> ranking[dataset, 1]
> Out[76]=
> {1., 3., 3., 4., 7., 7., 7., 8.}
> 

Dear Wen-Feng,

let's step through your code:

In[1]:= << Statistics`DescriptiveStatistics`
In[2]:= score = { 110, 120, 110, 125, 125, 105, 136, 125};

I assume here you have 8 participants of your contest, so #5 scored 125.

In[16]:= iq = Sort[score, ! OrderedQ[{#1, #2}] &]
Out[16]= {136, 125, 125, 125, 120, 110, 110, 105}

The scores in order (I assume here highest is best).

In[17]:= Position[iq, #] & /@ iq
Out[17]=
{{{1}}, {{2}, {3}, {4}}, {{2}, {3}, {4}}, {{2}, {3}, {4}}, {{5}}, {{6},
{7}}, {{6}, {7}}, {{8}}}

In[18]:= posset = % /. {e_Integer} :> e
Out[18]=
{{1}, {2, 3, 4}, {2, 3, 4}, {2, 3, 4}, {5}, {6, 7}, {6, 7}, {8}}

The grouping of the ranks (some redundancy already is visible).

In[19]:= ratings = N[Mean[#]] & /@ posset
Out[19]=
{1., 3., 3., 3., 5., 6.5, 6.5, 8.}

The ranks after tie resolution in sequential order.

In[20]:=
substitutes = Apply[Rule, Transpose[{iq, ratings}], {1}]
Out[20]=
{136 -> 1., 125 -> 3., 125 -> 3., 125 -> 3., 120 -> 5., 110 -> 6.5, 
  110 -> 6.5, 105 -> 8.}

The association of the scores to the ranks (done on sorted scores and
sorted ranks).

In[21]:= ranks = score /. substitutes
Out[21]= {6.5, 5., 6.5, 3., 3., 8., 1., 3.}

This now gives the ranks at the position of the scores, so participant
#5 scored as 3. (third, together with 2 others).


You may observe, that you associated scores to ranks, when both were
sorted; such you had to transfer it back to the original order (of the
contestants) through substitution. This is not neccessary, you can
simplify:

In[22]:= iq = Sort[score, ! OrderedQ[{#1, #2}] &]
Out[22]= {136, 125, 125, 125, 120, 110, 110, 105}

In[23]:= Position[iq, #] & /@ score
Out[23]=
{{{6}, {7}}, {{5}}, {{6}, {7}}, {{2}, {3}, {4}}, {{2}, {3}, {4}}, {{8}},
{1}}, {{2}, {3}, {4}}}

Here you keep the original order (of the contestants). You only have to
apply tie resolution:

In[24]:= ranks = N[Mean[Flatten[#]]] &  /@ %
Out[24]= {6.5, 5., 6.5, 3., 3., 8., 1., 3.}


This seems to be "elegant", the work effectively is done by Position.
Yet there is a point of suspicion: the repeated scanning through iq to
determine the (raw) rankings, is it performant?

To test later, we wrap up this code into a function:

 
In[25]:= fwf[score_] := With[{iq = Sort[score, ! OrderedQ[{#1, #2}] &]},
		N[Mean[Flatten[#]]] &  /@ (Position[iq, #] &) /@ score]

In[26]:= fwf[score]
Out[26]= {6.5, 5., 6.5, 3., 3., 8., 1., 3.}


In my previous posting I showed you step by step a performant method of
sorting while _keeping_ the association; let's try it here:

 
In[3]:= n = Length[score];
In[4]:= scid = Transpose[{score, Range[n]}]
Out[4]=
{{110, 1}, {120, 2}, {110, 3}, {125, 4}, {125, 5}, {105, 6}, {136, 7},
{125, 
    8}}

Explicit association of score to contestant #;

In[5]:= Sort[scid, ! OrderedQ[{#1, #2}] &]
Out[5]=
{{136, 7}, {125, 8}, {125, 5}, {125, 4}, {120, 2}, {110, 3}, {110, 1},
{105, 
    6}}

ordered by score;

In[6]:= {iq, idpos} = Transpose[%]
Out[6]=
{{136, 125, 125, 125, 120, 110, 110, 105}, {7, 8, 5, 4, 2, 3, 1, 6}}

idpos are the contestants ordered by ranking (sauf tie resolution);

In[7]:= Transpose[{iq, Range[n]}]
Out[7]=
{{136, 1}, {125, 2}, {125, 3}, {125, 4}, {120, 5}, {110, 6}, {110, 7},
{105, 
    8}}

association of scores to (raw) ranks;

In[8]:= Split[%, #1[[1]] === #2[[1]] & ]
Out[8]=
{{{136, 1}}, {{125, 2}, {125, 3}, {125, 4}}, {{120, 5}}, {{110, 6},
{110, 
      7}}, {{105, 8}}}

In[9]:= Map[Last, %, {2}]
Out[9]= {{1}, {2, 3, 4}, {5}, {6, 7}, {8}} 

the ranks grouped with ties;

In[10]:= Table[Evaluate[N[Mean[#]]], {Length[#]}] & /@ %
Out[10]= {{1.}, {3., 3., 3.}, {5.}, {6.5, 6.5}, {8.}}

tie resolution applied;

In[11]:= Transpose[{idpos, Flatten[%]}]
Out[11]= {{7, 1.}, {8, 3.}, {5, 3.}, {4, 3.}, {2, 5.}, {3, 6.5}, {1,
6.5}, {6, 8.}}

associated with contestants (in rank order)

In[12]:= Sort[%]
Out[12]=
{{1, 6.5}, {2, 5.}, {3, 6.5}, {4, 3.}, {5, 3.}, {6, 8.}, {7, 1.}, {8,
3.}}

In[13]:= rank = Transpose[%][[2]]
Out[13]= {6.5, 5., 6.5, 3., 3., 8., 1., 3.}

sorted back into order of contestants, and ranks extracted.

This _appears_ to be more complicated than your (improved) procedure.
But this is only because programming is done at lower level. The gain
may be improved performance. Again we make a function:

In[14]:= 
fal[score_] := Module[
 {n = Length[score], iq, idpos, rnkspl, tiernk},
 {iq, idpos} = 
 Transpose[Sort[Transpose[{score, Range[n]}], ! OrderedQ[{#1, #2}] &]];
 rnkspl = Map[Last, 
        Split[Transpose[{iq, Range[n]}], #1[[1]] === #2[[1]] & ], {2}];
 tiernk = Table[Evaluate[N[Mean[#]]], {Length[#]}] & /@ rnkspl;
 Transpose[Sort[Transpose[{idpos, Flatten[tiernk]}]]][[2]]
]

In[15]:= fal[score]
Out[15]= {6.5, 5., 6.5, 3., 3., 8., 1., 3.}


Now we compare runtimes:

In[27]:= {fwftim, faltim} = 
  Module[{tim = {}}, With[{s = Table[Random[Integer, {0, 200}], {#}]},
    AppendTo[tim, Part[#, 1, 1] & /@ 
       List @@ Timing /@ Hold[fwf[s], fal[s]] ]] & /@ (250 Range[8]); 
    Transpose[tim]]

Out[27]= {{0.41, 1.101, 2.023, 3.205, 4.647, 6.239, 8.161, 10.275}, 
          {0.251, 0.481, 0.741, 0.951, 1.251, 1.463, 1.713, 1.972}}

You may visualize

In[28]:= << Graphics`MultipleListPlot`
In[29]:= MultipleListPlot[fwftim, faltim, PlotJoined -> True, 
    PlotRange -> {{0, Automatic}, Automatic}];

So if the contest comprizes your whole campus, you might like to switch
methods.

Kind regards,  Hartmut


  • Prev by Date: more on vector unions
  • Next by Date: Re: Q: PlotPoints has dramatic timing effect (and not trivialy)
  • Previous by thread: Re: how to rank a list of elements?
  • Next by thread: Re: how to rank a list of elements?