RE: Re: Friendly Challenge 2: sort by column
- To: mathgroup at smc.vnet.net
- Subject: [mg35063] RE: [mg35029] Re: [mg35005] Friendly Challenge 2: sort by column
- From: "DrBob" <majort at cox-internet.com>
- Date: Thu, 20 Jun 2002 23:55:38 -0400 (EDT)
- Reply-to: <drbob at bigfoot.com>
- Sender: owner-wri-mathgroup at wolfram.com
Paul, I wonder what you mean by "handle duplicates". One interpretation (the one you mean, I think) is to sort by the other columns when two or more rows agree in the column being used for sorting. Another interpretation (the one I prefer) is to preserve preexisting order in case of ties. That allows us to sort by one column, then by another, and get the result we'd expect. After some experimentation, I found that hw4 and hw5 (and hence Ordering itself) preserve order already, without modification. THE DOCUMENTATION SHOULD BE CLEAR ON THIS! hw1[a_, p_] := Transpose[ RotateRight[Transpose[Sort[Transpose[RotateLeft[Transpose[a], p - 1]]]], p - 1]] hw2[a_, p_] := RotateRight[Sort[RotateLeft[a, {0, p - 1}]], {0, p - 1}] hw3[a_, p_] := Sort[a, #1[[p]] < #2[[p]] &] hw4[a_, p_] := a[[Ordering[Transpose[a][[p]]]]] hw5[a_, p_] := a[[Ordering[a[[All, p]]]]] bt1[a_, p_] := a[[Ordering[Transpose[{a[[All, p]], Range[Length[a]]}]]]] The following timings include recomputing hw5 for comparison each time. a = Table[Random[Integer, {0, 23}], {37}, {50}]; Timing[And @@ Flatten[Table[(a~#~i) == (a~hw5~i), {i, 2, 50}, {j, 1, 70}]]] & /@ {hw1, hw2, hw3, hw4, hw5, bt1} {{0.484 Second, False}, {0.25 Second, False}, {2.594 Second, False}, {0.25 Second, True}, {0.172 Second, True}, {0.203 Second, True}} a = Table[Random[], {500}, {3000}]; Timing[And @@ Flatten[Table[(a~#~i) == (a~hw5~i), {i, 2, 299, 11}]]] & /@ {hw1, hw2, hw3, hw4, hw5, bt1} {{9.218 Second, True}, {4.735 Second, True}, {19.937 Second, True}, {4.141 Second, True}, {2.969 Second, True}, {2.937 Second, True}} My function (bt1) is faster than the others except possibly hw5, but it's more complicated. It makes order-preservation a priority, so its agreement with hw5 shows that Ordering preserves preexisting order. Therefore, I would think hw5 is a clear winner so far. But wait, here are similar tests with larger matrices: a = Table[Random[Integer, {0, 255}], {373}, {501}]; Timing[And @@ Flatten[Table[(a~#~i) == (a~hw5~i), {i, 2, 50}, {j, 1, 70}]]] & /@ {hw5, bt1} {{12.906 Second, True}, {12.61 Second, True}} and on another trial: {{12.094 Second, True}, {12.953 Second, True}} a = Table[Random[], {5000}, {3000}]; Timing[And @@ Flatten[Table[(a~#~i) == (a~hw5~i), {i, 2, 299, 11}]]] & /@ {hw5, bt1} {{30.156 Second, True}, {29.828 Second, True}} a = Table[Random[], {5000}, {3000}]; Timing[And @@ Flatten[Table[(a~#~i) == (a~hw5~i), {i, 1, 3000, 113}]]] & /@ {hw5, bt1} {{28.219 Second, True}, {29.063 Second, True}} Timings for hw5 and bt1 are so similar that I suspect Ordering preserves order the same way I did. Bobby -----Original Message----- From: "Mr. Wizard" [mailto:gleam at flashmail.com] To: mathgroup at smc.vnet.net Subject: [mg35063] [mg35029] Re: [mg35005] Friendly Challenge 2: sort by column > consequently I should add > > (5) > > a[[Ordering[a[[All, 3]]]]] Hartmut Wolf, thank you for participating in this challenge. Your #5 is one of the algorithms I'm using. It's the fastest I've found for sorting by one column only, disregarding the rest. After moderate number of tests, not nearly up to your standards, but enough to get a good start, I am prepared to observe that of your functions that handle duplicates by using Rotate, your function #2 is fastest. (Incidentally, this is the duplicate handling method I first chose, and still prefer.) I have a function that, on my machine, is usually faster than #2, sometimes considerably. I have a feeling you can best it if you try. Examples: hw2[a_,p_]:=RotateRight[Sort[RotateLeft[a,{0,p-1}]],{0,p-1}] sort3[a_,p_]:= ??? (my function) In[284]:= a = Table[Random[Integer, {0, 255}], {37}, {50}]; First@Timing[Do[a~#~i, {i, 2, 50}]~Do~{70}] & /@ {sort3, hw2} Out[285]= {1.54 Second, 1.86 Second} In[286]:= a = Table[Random[], {500}, {300}]; First@Timing[Do[a~#~i, {i, 2, 299, 11}]] & /@ {sort3, hw2} Out[287]= {1.16 Second, 2.2 Second} Regards, Paul