Re: reliably sort?
- To: mathgroup at smc.vnet.net
- Subject: [mg123754] Re: reliably sort?
- From: DrMajorBob <btreat1 at austin.rr.com>
- Date: Sat, 17 Dec 2011 02:47:34 -0500 (EST)
- Delivered-to: l-mathgroup@mail-archive0.wolfram.com
- References: <jca06e$a35$1@smc.vnet.net> <201112150955.EAA22951@smc.vnet.net>
- Reply-to: drmajorbob at yahoo.com
> But Union (Esc\[ThinSpace]un\[ThinSpace]Esc:) is indicating by its > notation a set operation Not really. Counterexamples: 1~Plus~2 3 1~ArcTan~1 \[Pi]/4 It's just notation... not set notation. > The other thing which was puzzling me was the statement of "Mathematica > obviously uses the most efficient way to perform the Union operation, > which is to sort the entire set of elements..." Another poster said that (and I may have speculated the same), but I don't think Mathematica documentation says that. > DeleteDuplicates would have been a good candidate for the Union > Implementation... But Union sorts, and DeleteDuplicates does not. They're optimized for separate tasks, and I doubt either one uses the other in general. Bobby On Fri, 16 Dec 2011 14:04:32 -0600, Ulrich Arndt <ulrich.arndt at data2knowledge.de> wrote: > My statement might have been a bit misleading. > I didn't expected that mathematica is offering a pure mathematica set > concept implementation and I saw also some ways how it can be build in > mathematica. > But Union (Esc\[ThinSpace]un\[ThinSpace]Esc:) is indicating by its > notation a set operation and there I was wondering why the function is > overloaded with the additional sort. > The other thing which was puzzling me was the statement of "Mathematica > obviously uses the most efficient way to perform the Union operation, > which is to sort the entire set of elements..." which is performance > wise not true for all sets as shown in the example. As I learned today > DeleteDuplicates would have been a good candidate for the Union > Implementation... > > Ulrich > > Am 16.12.2011 um 19:17 schrieb DrMajorBob: > >> Union COULD be a set operation... if sets existed in Mathematica. The >> output may LOOK like a set to you, but that's a mirage. >> >>> But there are other implementations which can be used for this set >>> operation which do not use sorting >> >> Ah, but you DID sort b (in order to make the results equal), whereas >> Union sorted the list before eliminating duplicates. >> >> Yes, other implementations exist that don't involve sorting... but >> they're not built-in to Mathematica, and you haven't demonstrated one. >> >> Bobby >> >> On Fri, 16 Dec 2011 04:54:15 -0600, Ulrich Arndt >> <ulrich.arndt at data2knowledge.de> wrote: >> >>> Union and Sort seems not naturally connected to me. Union is a set >>> operation and according to the documentation it seems that the >>> implementation is using some sorting... >>> But there are other implementations which can be used for this set >>> operation which do not use sorting and seems to have some better >>> performance in some cases - will clearly depend on the underlying data! >>> >>> In[169]: dim = 1000000; >>> list = RandomChoice[Range[299], dim]; >>> (a = Union[list]); // Timing >>> (b = Tally[list][[All, 1]]); // Timing >>> a == Sort[b] >>> >>> Out[171]= {0.222803, Null} >>> >>> Out[172]= {0.003932, Null} >>> >>> Out[173]= True >>> >>> Ulrich >>> >>> Am 15.12.2011 um 10:55 schrieb David Bailey: >>> >>>> On 14/12/2011 11:10, Michael Stern wrote: >>>>> I have noticed that if combine two lists of date/data objects using >>>>> Union[], the result is sorted by date. >>>>> >>>>> For example >>>>> >>>>> In[]:= mins={{{2001, 1, 31}, 0.993268}, {{2002, 3, 31}, 1.01395}, >>>>> {{2003, 6, 30}, >>>>> 1.08647}, {{2005, 11, 30}, 1.14752}, {{2006, 9, 30}, >>>>> 1.18938}, {{2007, 2, 28}, 1.19658}, {{2008, 1, 31}, >>>>> 1.20432}, {{2011, 1, 31}, 1.37501}}; >>>>> In[]:= maxs={{{2000, 10, 31}, 1.01816}, {{2001, 12, 31}, = >>> 1.02714}, >>>>> {{2004, 2, 29}, >>>>> 1.12702}, {{2005, 3, 31}, 1.16986}, {{2010, 10, 31}, 1.39026}}; >>>>> >>>>> In[]:= Union[maxs, mins] >>>>> >>>>> Out[]= {{{2000, 10, 31}, 1.01816}, {{2001, 1, 31}, >>>>> 0.993268}, {{2001, 12, 31}, 1.02714}, {{2002, 3, 31}, >>>>> 1.01395}, {{2003, 6, 30}, 1.08647}, {{2004, 2, 29}, >>>>> 1.12702}, {{2005, 3, 31}, 1.16986}, {{2005, 11, 30}, >>>>> 1.14752}, {{2006, 9, 30}, 1.18938}, {{2007, 2, 28}, >>>>> 1.19658}, {{2008, 1, 31}, 1.20432}, {{2010, 10, 31}, >>>>> 1.39026}, {{2011, 1, 31}, 1.37501}} >>>>> >>>>> Is this reliable behavior? Is there any case in which Union would >>>>> /not/ >>>>> return results that have been properly sorted by date? >>>>> >>>>> Thank you, >>>>> >>>>> -Michael Stern >>>> Sorting is part of the operation, and you can rely on the result being >>>> sorted. >>>> >>>> Mathematica obviously uses the most efficient way to perform the Union >>>> operation, which is to sort the entire set of elements, and then >>>> perform >>>> a scan through the elements, discarding adjacent duplicates. >>>> >>>> David Bailey >>>> http://www.dbaileyconsultancy.co.uk >>>> >>>> >>> >>> >> >> >> -- >> DrMajorBob at yahoo.com > > > > -- DrMajorBob at yahoo.com
- References:
- Re: Does Union[] reliably sort?
- From: David Bailey <dave@removedbailey.co.uk>
- Re: Does Union[] reliably sort?