MathGroup Archive 2005

Re: SameTest in Union

  • To: mathgroup at
  • Subject: [mg61147] Re: SameTest in Union
  • From: Bill Rowe <readnewsciv at>
  • Date: Tue, 11 Oct 2005 03:22:07 -0400 (EDT)
  • Sender: owner-wri-mathgroup at

On 10/10/05 at 2:40 AM, jackgoldberg at (Jack Goldberg) wrote:

>I have a list, something like this:

>lst = {1.1101, 1.11095, 1.11076, 1.09, 2.3523, 2.352, 2.35211}

>I want to remove from the list those entries which are near each
>other but not identical, leaving only one representative for each
>of these numbers.  One approach is to use  Union with the option
>SameTest->???.   Here the same test might be that the difference
>between entries is less than, say 10^(-2).   But I can't seem to
>get SameTest to work.  So, what I want is

>Union[ lst, SameTest -> ?]

>so that  the union returns

>{1.1101,  2.35211}

>Here, I chose 2 representatives.  Any other choice is OK;   say,

>{1.11095,  2.352}

>is also satisfactory.

>There may be other ways to do this, but I thought of  Union  first.
>Perhaps, Cases  or Select  might be better.  Any help is

I would use Split to group the list and Median to select the representative value as follows:

Median /@ Split[lst, Abs[#1 - #2] < 0.01 & ]
{1.11076, 1.09, 2.35211}

Note that this gives three members, since the difference between 1.09 and the nearest other value is greater than 0.01. But if you consider the difference between 1.11076 and 1.09 to be insignificant, you could increase the tolerance to, say, 0.05 to get:

Median /@ Split[lst, Abs[#1 - #2] < 0.05 & ]
{1.11043, 2.35211}

Note that here the first value is the midpoint of the two central values of its group and is not contained in the original list. If it is essential to return only values from the original list, Median could be replaced with, say, First, i.e.,

First /@ Split[lst, Abs[#1 - #2] < 0.05 & ]
{1.1101, 2.3523}

My preference would be for Median, since that has some statistical significance. For a partially sorted list (elements with similar values grouped together), First outputs an essentially arbitrary value to represent the group. For a sorted list, First would output the minimum value of each group, which likely introduces a bias in any additional computations.

One other comment. Since the list you started with is clearly partially sorted, I did not bother to sort it. But Split only compares adjacent elements, so in general the list should be sorted for this method to work.
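To illustrate the point about sorting, here is a small sketch. The name shuffled and the particular ordering are my own, made up for illustration; the values are the ones from the question.

```mathematica
(* hypothetical reordering of the values from the question *)
shuffled = {2.3523, 1.1101, 1.09, 2.352, 1.11095, 2.35211, 1.11076};

(* unsorted: Split sees only adjacent pairs, so nearby values that are
   far apart in the list land in separate groups *)
Length[Split[shuffled, Abs[#1 - #2] < 0.05 &]]
(* 6 *)

(* sorted first: near-equal values become adjacent and group together *)
Median /@ Split[Sort[shuffled], Abs[#1 - #2] < 0.05 &]
(* {1.11043, 2.35211} *)
```

Even with sorting, Split chains adjacent comparisons, so a long run of slowly increasing values can end up in one group although its extremes differ by more than the tolerance.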
