Re: Question about Union
- To: mathgroup at smc.vnet.net
- Subject: [mg79264] Re: [mg79191] Question about Union
- From: Carl Woll <carlw at wolfram.com>
- Date: Sun, 22 Jul 2007 04:27:06 -0400 (EDT)
- References: <200707200736.DAA26258@smc.vnet.net>
Yaroslav Bulatov wrote:
>Union treats 2 numbers as distinct even though SameQ returns true for
>them...how does it test whether values are distinct by default?
>
>Union[{0.6342350954011774, 0.6342350954011775}]
>0.6342350954011774===0.6342350954011775
>
>
>
The default test is to check whether the two elements are canonically
the same, i.e. Order[##]==0&. However, this isn't the whole story. The
algorithm used by Union is different depending on whether the test used
is the default or user-supplied:
1. If no test function is supplied, then Union sorts the input in
canonical order, splits the sorted input into runs of identical
elements, then takes the first element of each set of identical
elements. This algorithm is O(n log n) and is equivalent to:
Split[ Sort[data], SameTest->(Order[##]==0&) ][[All, 1]]
Note that:
In[297]:= Order[0.6342350954011774, 0.6342350954011775]
Out[297]= 1
so they are not (canonically) identical.
2. If a test function is supplied, then Union does a pairwise test
between every pair of elements. This algorithm is O(n^2). The default
algorithm cannot be used when a test function is supplied, as it is
possible that two "identical" elements might be sorted in such a way
that they are not adjacent.
Carl Woll
Wolfram Research
- References:
- Question about Union
- From: Yaroslav Bulatov <yaroslavvb@gmail.com>
- Question about Union