Re: Re: easiest way to sort a list?
- To: mathgroup at smc.vnet.net
- Subject: [mg18401] Re: [mg18322] Re: [mg18308] easiest way to sort a list?
- From: Daniel Lichtblau <danl at wolfram.com>
- Date: Wed, 7 Jul 1999 00:11:02 -0400
- Organization: Wolfram Research, Inc.
- References: <7lccua$1gg@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
Andrzej Kozlowski wrote: > > This is very efficient indeed and beutifully illustrates the usefullnes of > the amazing Sequence function. The only slight weakness that I can see is > when dealing with very long lists with a small number of distinct elements. > Since we know that the final output must have the same number of elements as > given by the (very fast) built-in Union function, for such lists one may do > better by a code which stops checking elements of li once it has constructed > a list of length Length[Union[li]]. Here is a modified code that does this: > > In[2]:= > OrderedUnion1[li_]:=Block[{i,counter=0,l0={},m=Length[Union[li]]}, > i[n_]:=(i[n]=Sequence[];counter=counter+1;n); > Scan[If[counter<m,l0={l0,i[#]},Return[Flatten[l0]]]&,li];Flatten[l0]] > > On very long lists with a small number of distinct elements this is somewhat > faster than OrderedUnion: > > In[3]:= > list1=Table[Random[Integer,{1,9}],{10000}]; > > In[4]:= > OrderedUnion[list1];//Timing > Out[4]= > {0.166667 Second, Null} > > In[5]:= > OrderedUnion1[list1];//Timing > Out[5]= > {0.0833333 Second, Null} > > (Mac PowerBook G3 233 Mhz). > > However, in the case of lists with very few or no repetitions the balance is > naturally reversed: > > In[6]:= > list2=Range[1000]; > > In[7]:= > OrderedUnion[list2];//Timing > Out[7]= > {0.15 Second, Null} > > In[8]:= > OrderedUnion1[list2];//Timing > Out[8]= > {0.25 Second, Null} Use of Union or Sort will make the asymptotic complexity worse regardless of size of result, but the crossover may be so large when dealing with lists of machine integers or machine reals that it hardly matters, at least in version 4. I'd not be surprised if it is on the order of a few million. > What's worse, on my Mac if I try to apply OrderedUnion1 to > list2=Range[10000] I get a crash, caused presumably by the Kernel running > out of memory while building up a huge nested list. This problem was brought to my attention by Colin Rose. I need to look into it. > For those with a taste for weird Mathematica programs here is an unusual > (and unfortunately inefficient) example: > > In[10]:= > OrderedUnion2[l_]:=Module[{a},Reverse[GroebnerBasis[Map[a,l]]]/.a[k_]->k] > > e.g. > > In[11]:= > OrderedUnion2[{3,2,4,5,a,a,3,6,5,2,8}] > Out[11]= > {3, 2, 4, 5, a, 6, 8} This relies on the fact that GroebnerBasis orders its variables as they appear in the input. Not really safe to rely on this as it is not documented behavior. Specifically, default ordering is unspecified. > -- > Andrzej Kozlowski > Toyama International University > JAPAN > http://sigma.tuins.ac.jp > http://eri2.tuins.ac.jp > Daniel Lichtblau Wolfram Research PS I agree Carl Woll's method is really nice. I played around with the problem before I saw his, also had an O(n+m) method where n = size in and m = size out. His method gave timings consistently around 2/3 of mine on large problems.