Re: Re: easiest way to sort a list?

*To*: mathgroup at smc.vnet.net*Subject*: [mg18428] Re: [mg18344] Re: [mg18308] easiest way to sort a list?*From*: "Carl K.Woll" <carlw at fermi.phys.washington.edu>*Date*: Wed, 7 Jul 1999 00:11:18 -0400*Organization*: Department of Physics*References*: <199906260224.WAA14461@smc.vnet.net.> <199906301813.OAA02710@smc.vnet.net.> <377B2F0C.88E35994@gsmail01.darmstadt.dsh.de>*Sender*: owner-wri-mathgroup at wolfram.com

Hi Hartmut (and newsgroup), Thanks for unraveling the linear nature of the Block[{i... function. Nice explanation! Since this topic shows no sign of slowing down, I thought I would present a couple refined versions of my two functions. First, an improvement of the "linear" solution: OrderedUnion[li_]:=Block[{i,Sequence}, i[n_]:=(i[n]=Sequence[];n); i/@li] By including Sequence in the Block variables, Mathematica doesn't waste time eliminating Sequences until after the Block is exited. This seems to be a 5-10% or so improvement in speed. Secondly, an improvement of my O(n log n) solution: OrderedUnion[3][li_]:=Module[{tmp=Transpose[{li,Range[Length[li]]}]}, tmp=Union[tmp, SameTest->(#1[[1]]===#2[[1]]&)]; Last/@Sort[RotateLeft/@tmp]] The key idea here is that Sort with the default ordering function is much faster than Sort with a user supplied ordering function. So, RotateLeft switches the order of the index and the element to allow Sort to be used with its default ordering function. For lists with a large number of distinct elements, this seems to improve the speed by a factor of 3. Carl Woll Dept of Physics U of Washington Wolf, Hartmut wrote: > Hello everone, > > I just sent a letter to Carl K.Woll, part of which I would like to > publish to the community: > > Hello Carl, > > congratulations for your brilliant solution of "easiest way to sort a > list?" At the first glance, ... I was already struck. > > According to my measurements, you proposed not only the fastest > solution, but also the second fastest, the fastest O[n log n] version. > .... > > The question that concerned me was, is your solution really O[n]? My > preoccupation before was that O[n log n] could be the best achievable. > Clearly your solution is O[n] for 'smaller' samples, but the question > was, is it still O[n] for very large samples? So I did some > measurements, which I'd like to show you. They are in the two attached > notebooks (...not for you, unless you request it from > mailto:hwolf at debis.com ) > > What is the interpretation of the graphs? In fact your solution is as > far as I could calculate seemingly linear overall. There are interesting > wiggles in it. The algorithm seems to become increasingly worse, when it > makes up it's mind and "starts anew". Well I think this are the traces > of Mathematica's Extendible Hashing algorithm for the symbol store. When > hashing becomes more and more inefficient (due to collisions) then the > store ist extended. (This conforms to my observation of memory usage > with Windows NT Task Manager during the calculation, where I say periods > of rapid increase of used memory and times of slower grow.) So it's this > extension, which keeps your algorithm linear. > > .... > > With kind regards, your > Hartmut Wolf

**References**:**Re: easiest way to sort a list?***From:*"Carl K.Woll" <carlw@fermi.phys.washington.edu>