MathGroup Archive 2011

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: reliably sort?

  • To: mathgroup at smc.vnet.net
  • Subject: [mg123754] Re: reliably sort?
  • From: DrMajorBob <btreat1 at austin.rr.com>
  • Date: Sat, 17 Dec 2011 02:47:34 -0500 (EST)
  • Delivered-to: l-mathgroup@mail-archive0.wolfram.com
  • References: <jca06e$a35$1@smc.vnet.net> <201112150955.EAA22951@smc.vnet.net>
  • Reply-to: drmajorbob at yahoo.com

> But Union (Esc\[ThinSpace]un\[ThinSpace]Esc:) is indicating by its  
> notation a set operation

Not really. Counterexamples:

1~Plus~2

3

1~ArcTan~1

\[Pi]/4

It's just notation... not set notation.

> The other thing which was puzzling me was the statement of "Mathematica  
> obviously uses the most efficient way to perform the Union operation,  
> which is to sort the entire set of elements..."

Another poster said that (and I may have speculated the same), but I don't  
think Mathematica documentation says that.

> DeleteDuplicates would have been a good candidate for the Union  
> Implementation...

But Union sorts, and DeleteDuplicates does not. They're optimized for  
separate tasks, and I doubt either one uses the other in general.

Bobby

On Fri, 16 Dec 2011 14:04:32 -0600, Ulrich Arndt  
<ulrich.arndt at data2knowledge.de> wrote:

> My statement might have been a bit misleading.
> I didn't expected that mathematica is offering a pure mathematica set  
> concept implementation and I saw also some ways how it can be build in  
> mathematica.
> But Union (Esc\[ThinSpace]un\[ThinSpace]Esc:) is indicating by its  
> notation a set operation and there I was wondering why the function is  
> overloaded with the additional sort.
> The other thing which was puzzling me was the statement of "Mathematica  
> obviously uses the most efficient way to perform the Union operation,  
> which is to sort the entire set of elements..." which is performance  
> wise not true for all sets as shown in the example. As I learned today  
> DeleteDuplicates would have been a good candidate for the Union  
> Implementation...
>
> Ulrich
>
> Am 16.12.2011 um 19:17 schrieb DrMajorBob:
>
>> Union COULD be a set operation... if sets existed in Mathematica. The  
>> output may LOOK like a set to you, but that's a mirage.
>>
>>> But there are other implementations which can be used for this set
>>> operation which do not use sorting
>>
>> Ah, but you DID sort b (in order to make the results equal), whereas  
>> Union sorted the list before eliminating duplicates.
>>
>> Yes, other implementations exist that don't involve sorting... but  
>> they're not built-in to Mathematica, and you haven't demonstrated one.
>>
>> Bobby
>>
>> On Fri, 16 Dec 2011 04:54:15 -0600, Ulrich Arndt  
>> <ulrich.arndt at data2knowledge.de> wrote:
>>
>>> Union and Sort seems not naturally connected to me. Union is a set
>>> operation and according to the documentation it seems that the
>>> implementation is using some sorting...
>>> But there are other implementations which can be used for this set
>>> operation which do not use sorting and seems to have some better
>>> performance in some cases - will clearly depend on the underlying data!
>>>
>>> In[169]: dim = 1000000;
>>> list = RandomChoice[Range[299], dim];
>>> (a = Union[list]); // Timing
>>> (b = Tally[list][[All, 1]]); // Timing
>>> a == Sort[b]
>>>
>>> Out[171]= {0.222803, Null}
>>>
>>> Out[172]= {0.003932, Null}
>>>
>>> Out[173]= True
>>>
>>> Ulrich
>>>
>>> Am 15.12.2011 um 10:55 schrieb David Bailey:
>>>
>>>> On 14/12/2011 11:10, Michael Stern wrote:
>>>>> I have noticed that if combine two lists of date/data objects using
>>>>> Union[], the result is sorted by date.
>>>>>
>>>>> For example
>>>>>
>>>>> In[]:= mins={{{2001, 1, 31}, 0.993268}, {{2002, 3, 31}, 1.01395},
>>>>> {{2003, 6, 30},
>>>>>   1.08647}, {{2005, 11, 30}, 1.14752}, {{2006, 9, 30},
>>>>>   1.18938}, {{2007, 2, 28}, 1.19658}, {{2008, 1, 31},
>>>>>   1.20432}, {{2011, 1, 31}, 1.37501}};
>>>>> In[]:= maxs={{{2000, 10, 31}, 1.01816}, {{2001, 12, 31}, =
>>> 1.02714},
>>>>> {{2004, 2, 29},
>>>>>    1.12702}, {{2005, 3, 31}, 1.16986}, {{2010, 10, 31}, 1.39026}};
>>>>>
>>>>> In[]:= Union[maxs, mins]
>>>>>
>>>>> Out[]= {{{2000, 10, 31}, 1.01816}, {{2001, 1, 31},
>>>>>   0.993268}, {{2001, 12, 31}, 1.02714}, {{2002, 3, 31},
>>>>>   1.01395}, {{2003, 6, 30}, 1.08647}, {{2004, 2, 29},
>>>>>   1.12702}, {{2005, 3, 31}, 1.16986}, {{2005, 11, 30},
>>>>>   1.14752}, {{2006, 9, 30}, 1.18938}, {{2007, 2, 28},
>>>>>   1.19658}, {{2008, 1, 31}, 1.20432}, {{2010, 10, 31},
>>>>>   1.39026}, {{2011, 1, 31}, 1.37501}}
>>>>>
>>>>> Is this reliable behavior? Is there any case in which Union would  
>>>>> /not/
>>>>> return results that have been properly sorted by date?
>>>>>
>>>>> Thank you,
>>>>>
>>>>> -Michael Stern
>>>> Sorting is part of the operation, and you can rely on the result being
>>>> sorted.
>>>>
>>>> Mathematica obviously uses the most efficient way to perform the Union
>>>> operation, which is to sort the entire set of elements, and then  
>>>> perform
>>>> a scan through the elements, discarding adjacent duplicates.
>>>>
>>>> David Bailey
>>>> http://www.dbaileyconsultancy.co.uk
>>>>
>>>>
>>>
>>>
>>
>>
>> --
>> DrMajorBob at yahoo.com
>
>
>
>


-- 
DrMajorBob at yahoo.com



  • Prev by Date: Re: NMinimize problem: fct minimized uses FindRoot
  • Next by Date: Re: reliably sort?
  • Previous by thread: Re: reliably sort?
  • Next by thread: Re: Does Union[] reliably sort?