MathGroup Archive 2011

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: reliably sort?

  • To: mathgroup at smc.vnet.net
  • Subject: [mg123745] Re: reliably sort?
  • From: Ulrich Arndt <ulrich.arndt at data2knowledge.de>
  • Date: Sat, 17 Dec 2011 02:44:27 -0500 (EST)
  • Delivered-to: l-mathgroup@mail-archive0.wolfram.com
  • References: <jca06e$a35$1@smc.vnet.net> <201112150955.EAA22951@smc.vnet.net> <201112161054.FAA06989@smc.vnet.net> <op.v6lg231jtgfoz2@bobbys-imac.local>

My statement might have been a bit misleading.
I didn't expected that mathematica is offering a pure mathematica set 
concept implementation and I saw also some ways how it can be build in 
mathematica.
But Union (Esc\[ThinSpace]un\[ThinSpace]Esc:) is indicating by its 
notation a set operation and there I was wondering why the function is 
overloaded with the additional sort.
The other thing which was puzzling me was the statement of "Mathematica 
obviously uses the most efficient way to perform the Union operation, 
which is to sort the entire set of elements..." which is performance 
wise not true for all sets as shown in the example. As I learned today 
DeleteDuplicates would have been a good candidate for the Union 
Implementation...

Ulrich

Am 16.12.2011 um 19:17 schrieb DrMajorBob:

> Union COULD be a set operation... if sets existed in Mathematica. The 
output may LOOK like a set to you, but that's a mirage.
>
>> But there are other implementations which can be used for this set
>> operation which do not use sorting
>
> Ah, but you DID sort b (in order to make the results equal), whereas 
Union sorted the list before eliminating duplicates.
>
> Yes, other implementations exist that don't involve sorting... but 
they're not built-in to Mathematica, and you haven't demonstrated one.
>
> Bobby
>
> On Fri, 16 Dec 2011 04:54:15 -0600, Ulrich Arndt 
<ulrich.arndt at data2knowledge.de> wrote:
>
>> Union and Sort seems not naturally connected to me. Union is a set
>> operation and according to the documentation it seems that the
>> implementation is using some sorting...
>> But there are other implementations which can be used for this set
>> operation which do not use sorting and seems to have some better
>> performance in some cases - will clearly depend on the underlying 
data!
>>
>> In[169]: dim = 1000000;
>> list = RandomChoice[Range[299], dim];
>> (a = Union[list]); // Timing
>> (b = Tally[list][[All, 1]]); // Timing
>> a == Sort[b]
>>
>> Out[171]= {0.222803, Null}
>>
>> Out[172]= {0.003932, Null}
>>
>> Out[173]= True
>>
>> Ulrich
>>
>> Am 15.12.2011 um 10:55 schrieb David Bailey:
>>
>>> On 14/12/2011 11:10, Michael Stern wrote:
>>>> I have noticed that if combine two lists of date/data objects using
>>>> Union[], the result is sorted by date.
>>>>
>>>> For example
>>>>
>>>> In[]:= mins={{{2001, 1, 31}, 0.993268}, {{2002, 3, 31}, 
1.01395},
>>>> {{2003, 6, 30},
>>>>   1.08647}, {{2005, 11, 30}, 1.14752}, {{2006, 9, 30},
>>>>   1.18938}, {{2007, 2, 28}, 1.19658}, {{2008, 1, 31},
>>>>   1.20432}, {{2011, 1, 31}, 1.37501}};
>>>> In[]:= maxs={{{2000, 10, 31}, 1.01816}, {{2001, 12, 31}, 
>> 1.02714},
>>>> {{2004, 2, 29},
>>>>    1.12702}, {{2005, 3, 31}, 1.16986}, {{2010, 10, 31}, 1.39026}};
>>>>
>>>> In[]:= Union[maxs, mins]
>>>>
>>>> Out[]= {{{2000, 10, 31}, 1.01816}, {{2001, 1, 31},
>>>>   0.993268}, {{2001, 12, 31}, 1.02714}, {{2002, 3, 31},
>>>>   1.01395}, {{2003, 6, 30}, 1.08647}, {{2004, 2, 29},
>>>>   1.12702}, {{2005, 3, 31}, 1.16986}, {{2005, 11, 30},
>>>>   1.14752}, {{2006, 9, 30}, 1.18938}, {{2007, 2, 28},
>>>>   1.19658}, {{2008, 1, 31}, 1.20432}, {{2010, 10, 31},
>>>>   1.39026}, {{2011, 1, 31}, 1.37501}}
>>>>
>>>> Is this reliable behavior? Is there any case in which Union would 
/not/
>>>> return results that have been properly sorted by date?
>>>>
>>>> Thank you,
>>>>
>>>> -Michael Stern
>>> Sorting is part of the operation, and you can rely on the result 
being
>>> sorted.
>>>
>>> Mathematica obviously uses the most efficient way to perform the 
Union
>>> operation, which is to sort the entire set of elements, and then 
perform
>>> a scan through the elements, discarding adjacent duplicates.
>>>
>>> David Bailey
>>> http://www.dbaileyconsultancy.co.uk
>>>
>>>
>>
>>
>
>
> --
> DrMajorBob at yahoo.com







  • Prev by Date: Re: reliably sort?
  • Next by Date: Re: NMinimize problem: fct minimized uses FindRoot
  • Previous by thread: Re: reliably sort?
  • Next by thread: R: Re: reliably sort?