Re: Re: how to rank a list of elements?

*To*: mathgroup at smc.vnet.net*Subject*: [mg23255] Re: [mg23198] Re: how to rank a list of elements?*From*: Hartmut Wolf <hwolf at debis.com>*Date*: Sat, 29 Apr 2000 22:04:59 -0400 (EDT)*Organization*: debis Systemhaus*References*: <8djk08$7dc@smc.vnet.net> <8dnuge$hrl@smc.vnet.net> <200004240512.BAA15232@smc.vnet.net>*Sender*: owner-wri-mathgroup at wolfram.com

My answer below: Wen-Feng Hsiao schrieb: > > On Thu, 20 Apr 2000 08:51:23 +0200 > Hartmut Wolf <hwolf at debis.com> wrote: > > > Dear Wen-Feng, > > > > Since you didn't state your problem but only a solution, it's hard to > > guess. (It would have been helpful to have stated the "real world > > problem"; often the mapping of that to the mathematical model -- or > > object model for programming -- is the most decisive step to a > > solution.) > > > > Anyways, you have to answer yourself the question: why not have > > > > {4, 2, 3, 1} or > > {3, 2, 3, 1} or even > > {{3, 4}, 2, {3, 4}, 1} or > > {{3, 4}, 2, see[1], 1} > > > > as a result?? (It also depends on what you intend to do with it.) > My question is quite simple, just "to rank a list of elements (scores)". > This is a common situation when the obtained scores are not reliable to > be an interval scale, the usual procedure is to transform them into an > ordinal scale before conducting statistical analyses. For example, in an > IQ > test, we obtain the following scores: > {107, 110, 110, 120, 125, 125, 125, 136} > , which can be ranked as > > {1., 2.5, 2.5, 4., 6., 6., 6., 8.}, if we use 'mean' to deal with ties; > or > {1., 2., 2., 4., 5., 5., 5., 8.}, if we use 'min' to deal with ties; > or > {1., 3., 3., 4., 7., 7., 7., 8.}, if we use 'max' to deal with ties. > > By intuition, same scores should have same ranks. However, your > illustrations are, though sophisticated, not consistent with this > intuition. Allan Hayes's code is more likely as what I want. From his > code I know I can use 'Rule' to get what I want: (Please correct me if > anything needs to be improved.) > > << Statistics` > (* ties = -1, low; ties = 0, mean; ties = 1, high *) > ranking[datalst_List, ties_:0] := Module[{ratings, posset, substitutes}, > methods = {Min, Mean, Max}; > posset = > Position[Sort[datalst], #] & /@ Sort[datalst] /. {lst_} -> lst; > ratings = methods[[ties + 2]][#] & /@ posset // N; > substitutes = Apply[Rule, Transpose[{Sort[datalst], ratings}], > {1}]; > Print[substitutes]; > Return[datalst /. substitutes]; > ]; > > In[74]:= > ranking[dataset] > Out[74]= > {1., 2.5, 2.5, 4., 6., 6., 6., 8.} > > In[75]:= > ranking[dataset, -1] > Out[75]= > {1., 2., 2., 4., 5., 5., 5., 8.} > > In[76]:= > ranking[dataset, 1] > Out[76]= > {1., 3., 3., 4., 7., 7., 7., 8.} > Dear Wen-Feng, let's step through your code: In[1]:= << Statistics`DescriptiveStatistics` In[2]:= score = { 110, 120, 110, 125, 125, 105, 136, 125}; I assume here you have 8 participants of your contest, so #5 scored 125. In[16]:= iq = Sort[score, ! OrderedQ[{#1, #2}] &] Out[16]= {136, 125, 125, 125, 120, 110, 110, 105} The scores in order (I assume here highest is best). In[17]:= Position[iq, #] & /@ iq Out[17]= {{{1}}, {{2}, {3}, {4}}, {{2}, {3}, {4}}, {{2}, {3}, {4}}, {{5}}, {{6}, {7}}, {{6}, {7}}, {{8}}} In[18]:= posset = % /. {e_Integer} :> e Out[18]= {{1}, {2, 3, 4}, {2, 3, 4}, {2, 3, 4}, {5}, {6, 7}, {6, 7}, {8}} The grouping of the ranks (some redundancy already is visible). In[19]:= ratings = N[Mean[#]] & /@ posset Out[19]= {1., 3., 3., 3., 5., 6.5, 6.5, 8.} The ranks after tie resolution in sequential order. In[20]:= substitutes = Apply[Rule, Transpose[{iq, ratings}], {1}] Out[20]= {136 -> 1., 125 -> 3., 125 -> 3., 125 -> 3., 120 -> 5., 110 -> 6.5, 110 -> 6.5, 105 -> 8.} The association of the scores to the ranks (done on sorted scores and sorted ranks). In[21]:= ranks = score /. substitutes Out[21]= {6.5, 5., 6.5, 3., 3., 8., 1., 3.} This now gives the ranks at the position of the scores, so participant #5 scored as 3. (third, together with 2 others). You may observe, that you associated scores to ranks, when both were sorted; such you had to transfer it back to the original order (of the contestants) through substitution. This is not neccessary, you can simplify: In[22]:= iq = Sort[score, ! OrderedQ[{#1, #2}] &] Out[22]= {136, 125, 125, 125, 120, 110, 110, 105} In[23]:= Position[iq, #] & /@ score Out[23]= {{{6}, {7}}, {{5}}, {{6}, {7}}, {{2}, {3}, {4}}, {{2}, {3}, {4}}, {{8}}, {1}}, {{2}, {3}, {4}}} Here you keep the original order (of the contestants). You only have to apply tie resolution: In[24]:= ranks = N[Mean[Flatten[#]]] & /@ % Out[24]= {6.5, 5., 6.5, 3., 3., 8., 1., 3.} This seems to be "elegant", the work effectively is done by Position. Yet there is a point of suspicion: the repeated scanning through iq to determine the (raw) rankings, is it performant? To test later, we wrap up this code into a function: In[25]:= fwf[score_] := With[{iq = Sort[score, ! OrderedQ[{#1, #2}] &]}, N[Mean[Flatten[#]]] & /@ (Position[iq, #] &) /@ score] In[26]:= fwf[score] Out[26]= {6.5, 5., 6.5, 3., 3., 8., 1., 3.} In my previous posting I showed you step by step a performant method of sorting while _keeping_ the association; let's try it here: In[3]:= n = Length[score]; In[4]:= scid = Transpose[{score, Range[n]}] Out[4]= {{110, 1}, {120, 2}, {110, 3}, {125, 4}, {125, 5}, {105, 6}, {136, 7}, {125, 8}} Explicit association of score to contestant #; In[5]:= Sort[scid, ! OrderedQ[{#1, #2}] &] Out[5]= {{136, 7}, {125, 8}, {125, 5}, {125, 4}, {120, 2}, {110, 3}, {110, 1}, {105, 6}} ordered by score; In[6]:= {iq, idpos} = Transpose[%] Out[6]= {{136, 125, 125, 125, 120, 110, 110, 105}, {7, 8, 5, 4, 2, 3, 1, 6}} idpos are the contestants ordered by ranking (sauf tie resolution); In[7]:= Transpose[{iq, Range[n]}] Out[7]= {{136, 1}, {125, 2}, {125, 3}, {125, 4}, {120, 5}, {110, 6}, {110, 7}, {105, 8}} association of scores to (raw) ranks; In[8]:= Split[%, #1[[1]] === #2[[1]] & ] Out[8]= {{{136, 1}}, {{125, 2}, {125, 3}, {125, 4}}, {{120, 5}}, {{110, 6}, {110, 7}}, {{105, 8}}} In[9]:= Map[Last, %, {2}] Out[9]= {{1}, {2, 3, 4}, {5}, {6, 7}, {8}} the ranks grouped with ties; In[10]:= Table[Evaluate[N[Mean[#]]], {Length[#]}] & /@ % Out[10]= {{1.}, {3., 3., 3.}, {5.}, {6.5, 6.5}, {8.}} tie resolution applied; In[11]:= Transpose[{idpos, Flatten[%]}] Out[11]= {{7, 1.}, {8, 3.}, {5, 3.}, {4, 3.}, {2, 5.}, {3, 6.5}, {1, 6.5}, {6, 8.}} associated with contestants (in rank order) In[12]:= Sort[%] Out[12]= {{1, 6.5}, {2, 5.}, {3, 6.5}, {4, 3.}, {5, 3.}, {6, 8.}, {7, 1.}, {8, 3.}} In[13]:= rank = Transpose[%][[2]] Out[13]= {6.5, 5., 6.5, 3., 3., 8., 1., 3.} sorted back into order of contestants, and ranks extracted. This _appears_ to be more complicated than your (improved) procedure. But this is only because programming is done at lower level. The gain may be improved performance. Again we make a function: In[14]:= fal[score_] := Module[ {n = Length[score], iq, idpos, rnkspl, tiernk}, {iq, idpos} = Transpose[Sort[Transpose[{score, Range[n]}], ! OrderedQ[{#1, #2}] &]]; rnkspl = Map[Last, Split[Transpose[{iq, Range[n]}], #1[[1]] === #2[[1]] & ], {2}]; tiernk = Table[Evaluate[N[Mean[#]]], {Length[#]}] & /@ rnkspl; Transpose[Sort[Transpose[{idpos, Flatten[tiernk]}]]][[2]] ] In[15]:= fal[score] Out[15]= {6.5, 5., 6.5, 3., 3., 8., 1., 3.} Now we compare runtimes: In[27]:= {fwftim, faltim} = Module[{tim = {}}, With[{s = Table[Random[Integer, {0, 200}], {#}]}, AppendTo[tim, Part[#, 1, 1] & /@ List @@ Timing /@ Hold[fwf[s], fal[s]] ]] & /@ (250 Range[8]); Transpose[tim]] Out[27]= {{0.41, 1.101, 2.023, 3.205, 4.647, 6.239, 8.161, 10.275}, {0.251, 0.481, 0.741, 0.951, 1.251, 1.463, 1.713, 1.972}} You may visualize In[28]:= << Graphics`MultipleListPlot` In[29]:= MultipleListPlot[fwftim, faltim, PlotJoined -> True, PlotRange -> {{0, Automatic}, Automatic}]; So if the contest comprizes your whole campus, you might like to switch methods. Kind regards, Hartmut

**References**:**Re: how to rank a list of elements?***From:*d8442803@student.nsysu.edu.tw (Wen-Feng Hsiao)