results of pair creation contest
- To: mathgroup at smc.vnet.net
- Subject: [mg23374] results of pair creation contest
- From: Wijnand Schepens <Wijnand.Schepens at rug.ac.be>
- Date: Fri, 5 May 2000 02:07:07 -0400 (EDT)
- Organization: RUG
- Sender: owner-wri-mathgroup at wolfram.com
Hi groupies, This is an overview of 8 different methods to split a list into a list of pairs. It is largely the result of the discussions on this mathgroup. Thanks to everyone who invested time in this. Example of pair creation: {a,b,c} -> {{a,b},{a,c},{b,c}} Every pair must only appear once (i.e. {a,b} is considered to be equivalent to {b,a}. The order of the pairs is not very important, as long as the correct (n-1)/2 pairs are returned. ------------------------------------------------- 8 selected methods ------------------------------------------------- Trivial implementation using Append: In[53]:= divpairs1[lst_List] := Module[{r = {}}, Do[Do[ r = Append[r, lst[[{j, i}]] ], {j,1, i - 1}], {i, 2, Length[lst]}]; r] Using Table: In[54]:= divpairs2[lst_List] := Table[Sequence @@ Table[lst[[{j, i}]], {j,1,i - 1}], {i,2,Length[lst]}] In[55]:= divpairs3[lst_List] := (Sequence @@ #1) & /@ Table[lst[[{j,i}]],{i, 2, Length[lst]}, {j, 1, i - 1}] In[56]:= divpairs4[lst_List] := Join @@ Table[lst[[{j, i}]], {i, 2, Length[lst]}, {j, 1, i - 1}] Mark Harder's method: In[57]:= divpairs5[lst_List] := Flatten[Table[lst[[{j, i}]], {i, 2, Length[lst]}, {j, 1, i - 1}], 1] Using a mask (Carl Woll's method): In[58]:= mask[n_] := Module[{r=Range[n - 1]},Flatten[Range @@@ Transpose[{r(n + 1) - (n - 1), r n}]]] In[59]:= divpairs6[lst_List] := Module[{m, allpairs}, m = mask[Length[lst]]; allpairs = Flatten[Outer[List, lst, lst], 1]; allpairs[[m]] ] Hartmut Wolf's method: In[60]:= divpairs7[lst_List] := Module[{l = Length[lst]}, Join@@Array[If[#1 < #2, lst[[{#1, #2}]], Unevaluated[Sequence[] ] ] &,{l, l}] ] Allan Hayes' method, based on an idea of Wagner Truppel (?): In[61]:= divpairs8[lst_List] := Transpose[ (Apply[Join, NestList[#, #[lst], Length[lst] - 2] ]) & /@ {Drop[#, -1] &, Drop[#, 1] &} ] ------------------------------------------------- Examples ------------------------------------------------- All methods produce the same results (possibly in different order) when applied to a (non-nested) list, e.g. In[62]:= divpairs1[{a, b, c, d}] Out[62]= {{a, b}, {a, c}, {b, c}, {a, d}, {b, d}, {c, d}} Note: divpairs6 produces undesired results when applied to a nested list: This is what the other methods return (and what we want): In[70]:= divpairs1[{{a1, a2}, {b1, b2}, {c1, c2}}] Out[70]= {{{a1, a2}, {b1, b2}}, {{a1, a2}, {c1, c2}}, {{b1, b2}, {c1, c2}}} This is the result of divpairs6 (not what we want): In[75]:= divpairs6[{{a1, a2}, {b1, b2}, {c1, c2}}] Out[75]= {{{{a2, a1}, {a2, a2}}, {{a2, b1}, {a2, b2}}, {{a2, c1}, {a2, c2}}}, {{{b1, a1}, {b1, a2}}, {{b1, b1}, {b1, b2}}, {{b1, c1}, {b1, c2}}}, {{{c2, a1}, {c2, a2}}, {{c2, b1}, {c2, b2}}, {{c2, c1}, {c2, c2}}}} We'll consider divpairs6 in the efficiency-tests anyway ------------------------------------------------- Efficiency: ------------------------------------------------- I used my own procedure MultiTiming to time the duration of a calculation. It first does the calculation two times, neglecting the timing (which can be abnormally high because of first tries), then it does n more timings, and returns {min,mean,max} or {best case, mean case, worst case} of these n timings. SetAttributes[MultiTiming, HoldAll]; MultiTiming[f_, n_] := Module[{data}, Timing[f;]; Timing[f;]; data = Table[Timing[f;][[1, 1]], {n}]; {Min[data], Statistics`DescriptiveStatistics`Mean[data], Max[data]}] Now we try all 8 methods on a list of 50 random numbers: In[78]:= n = 50; t = Table[Random[], {n}]; In[94]:= {n, MultiTiming[divpairs1[t], 5], MultiTiming[divpairs2[t], 5], MultiTiming[divpairs3[t], 5] , MultiTiming[divpairs4[t], 5], MultiTiming[divpairs5[t], 5], MultiTiming[divpairs6[t], 5], MultiTiming[divpairs7[t], 5], MultiTiming[divpairs8[t], 5]} Out[94]= {50, {1.26, 1.274, 1.31}, {0.05, 0.056, 0.06}, {0.05, 0.056, 0.06}, {0., 0.044, 0.06}, {0., 0.044, 0.06}, {0., 0.032, 0.06}, {0.11, 0.154, 0.17}, {0., 0.022, 0.06}} Apparently the first method divpairs1 is very slow (as expected). We leave it out for the next timings. divpairs7 is also somewhat slower than the others. divpairs8 yields the best average. Now for n=100 (excluding divpairs1): In[95]:= n = 100; t = Table[Random[], {n}]; In[97]:= {n, MultiTiming[divpairs2[t], 5], MultiTiming[divpairs3[t], 5] , MultiTiming[divpairs4[t], 5], MultiTiming[divpairs5[t], 5], MultiTiming[divpairs6[t], 5], MultiTiming[divpairs7[t], 5], MultiTiming[divpairs8[t], 5]} Out[97]= {100, {0.16, 0.198, 0.22}, {0.16, 0.186, 0.22}, {0.16, 0.186, 0.22}, {0.16, 0.186, 0.22}, {0.11, 0.132, 0.17}, {0.6, 0.614, 0.66}, {0., 0.044, 0.06}} Results: divpairs2,3,4 and 5 are rougly equivalent. divpairs6 is a bit faster divpairs7 is slow divpairs8 is clearly the fastest For n=200 we get (excluding 1) {200, {0.71, 0.758, 0.83}, {0.76, 0.768, 0.77}, {0.71, 0.736, 0.77}, {0.71, 0.746, 0.77}, {0.55, 0.55, 0.55}, {2.42, 2.45, 2.47}, {0.05, 0.088, 0.16}} divpairs8 outbeats divpairs6 (factor 3 worst case, factor 11 best case), which in turn is better than the others. Remember also how divpairs6 didn't produce the desired result when applied to nested lists. For n=400 we get (excluding 1) {400, {2.25, 2.79, 4.17}, {3.19, 3.384, 3.46}, {2.96, 3.218, 3.3}, {3.02, 3.264, 3.35}, {2.25, 2.506, 2.92}, {10.38, 29.668, 51.57}, {0.33, 0.384, 0.6}} divpairs8 is on average more than 6 times faster than divpairs6. For n this large, we see that divpairs1 approaches divpairs6 (at least in best case)! That'll be all. Conclusion: Allan Hayes' method is great. Considering the fact that it is not obvious at all (I got lots of proposals, of which the 8 cases above are the best), I suggest this routine be made a part of the Mathematica system (add-ons...) Is this the definitive version? Wijnand Schepens, University of Ghent, Belgium