Re: reliably sort?
- To: mathgroup at smc.vnet.net
- Subject: [mg123754] Re: reliably sort?
- From: DrMajorBob <btreat1 at austin.rr.com>
- Date: Sat, 17 Dec 2011 02:47:34 -0500 (EST)
- Delivered-to: l-mathgroup@mail-archive0.wolfram.com
- References: <jca06e$a35$1@smc.vnet.net> <201112150955.EAA22951@smc.vnet.net>
- Reply-to: drmajorbob at yahoo.com
> But Union (Esc\[ThinSpace]un\[ThinSpace]Esc:) is indicating by its
> notation a set operation
Not really. Counterexamples:
1~Plus~2
3
1~ArcTan~1
\[Pi]/4
It's just notation... not set notation.
> The other thing which was puzzling me was the statement of "Mathematica
> obviously uses the most efficient way to perform the Union operation,
> which is to sort the entire set of elements..."
Another poster said that (and I may have speculated the same), but I don't
think Mathematica documentation says that.
> DeleteDuplicates would have been a good candidate for the Union
> Implementation...
But Union sorts, and DeleteDuplicates does not. They're optimized for
separate tasks, and I doubt either one uses the other in general.
Bobby
On Fri, 16 Dec 2011 14:04:32 -0600, Ulrich Arndt
<ulrich.arndt at data2knowledge.de> wrote:
> My statement might have been a bit misleading.
> I didn't expected that mathematica is offering a pure mathematica set
> concept implementation and I saw also some ways how it can be build in
> mathematica.
> But Union (Esc\[ThinSpace]un\[ThinSpace]Esc:) is indicating by its
> notation a set operation and there I was wondering why the function is
> overloaded with the additional sort.
> The other thing which was puzzling me was the statement of "Mathematica
> obviously uses the most efficient way to perform the Union operation,
> which is to sort the entire set of elements..." which is performance
> wise not true for all sets as shown in the example. As I learned today
> DeleteDuplicates would have been a good candidate for the Union
> Implementation...
>
> Ulrich
>
> Am 16.12.2011 um 19:17 schrieb DrMajorBob:
>
>> Union COULD be a set operation... if sets existed in Mathematica. The
>> output may LOOK like a set to you, but that's a mirage.
>>
>>> But there are other implementations which can be used for this set
>>> operation which do not use sorting
>>
>> Ah, but you DID sort b (in order to make the results equal), whereas
>> Union sorted the list before eliminating duplicates.
>>
>> Yes, other implementations exist that don't involve sorting... but
>> they're not built-in to Mathematica, and you haven't demonstrated one.
>>
>> Bobby
>>
>> On Fri, 16 Dec 2011 04:54:15 -0600, Ulrich Arndt
>> <ulrich.arndt at data2knowledge.de> wrote:
>>
>>> Union and Sort seems not naturally connected to me. Union is a set
>>> operation and according to the documentation it seems that the
>>> implementation is using some sorting...
>>> But there are other implementations which can be used for this set
>>> operation which do not use sorting and seems to have some better
>>> performance in some cases - will clearly depend on the underlying data!
>>>
>>> In[169]: dim = 1000000;
>>> list = RandomChoice[Range[299], dim];
>>> (a = Union[list]); // Timing
>>> (b = Tally[list][[All, 1]]); // Timing
>>> a == Sort[b]
>>>
>>> Out[171]= {0.222803, Null}
>>>
>>> Out[172]= {0.003932, Null}
>>>
>>> Out[173]= True
>>>
>>> Ulrich
>>>
>>> Am 15.12.2011 um 10:55 schrieb David Bailey:
>>>
>>>> On 14/12/2011 11:10, Michael Stern wrote:
>>>>> I have noticed that if combine two lists of date/data objects using
>>>>> Union[], the result is sorted by date.
>>>>>
>>>>> For example
>>>>>
>>>>> In[]:= mins={{{2001, 1, 31}, 0.993268}, {{2002, 3, 31}, 1.01395},
>>>>> {{2003, 6, 30},
>>>>> 1.08647}, {{2005, 11, 30}, 1.14752}, {{2006, 9, 30},
>>>>> 1.18938}, {{2007, 2, 28}, 1.19658}, {{2008, 1, 31},
>>>>> 1.20432}, {{2011, 1, 31}, 1.37501}};
>>>>> In[]:= maxs={{{2000, 10, 31}, 1.01816}, {{2001, 12, 31}, =
>>> 1.02714},
>>>>> {{2004, 2, 29},
>>>>> 1.12702}, {{2005, 3, 31}, 1.16986}, {{2010, 10, 31}, 1.39026}};
>>>>>
>>>>> In[]:= Union[maxs, mins]
>>>>>
>>>>> Out[]= {{{2000, 10, 31}, 1.01816}, {{2001, 1, 31},
>>>>> 0.993268}, {{2001, 12, 31}, 1.02714}, {{2002, 3, 31},
>>>>> 1.01395}, {{2003, 6, 30}, 1.08647}, {{2004, 2, 29},
>>>>> 1.12702}, {{2005, 3, 31}, 1.16986}, {{2005, 11, 30},
>>>>> 1.14752}, {{2006, 9, 30}, 1.18938}, {{2007, 2, 28},
>>>>> 1.19658}, {{2008, 1, 31}, 1.20432}, {{2010, 10, 31},
>>>>> 1.39026}, {{2011, 1, 31}, 1.37501}}
>>>>>
>>>>> Is this reliable behavior? Is there any case in which Union would
>>>>> /not/
>>>>> return results that have been properly sorted by date?
>>>>>
>>>>> Thank you,
>>>>>
>>>>> -Michael Stern
>>>> Sorting is part of the operation, and you can rely on the result being
>>>> sorted.
>>>>
>>>> Mathematica obviously uses the most efficient way to perform the Union
>>>> operation, which is to sort the entire set of elements, and then
>>>> perform
>>>> a scan through the elements, discarding adjacent duplicates.
>>>>
>>>> David Bailey
>>>> http://www.dbaileyconsultancy.co.uk
>>>>
>>>>
>>>
>>>
>>
>>
>> --
>> DrMajorBob at yahoo.com
>
>
>
>
--
DrMajorBob at yahoo.com
- References:
- Re: Does Union[] reliably sort?
- From: David Bailey <dave@removedbailey.co.uk>
- Re: Does Union[] reliably sort?