MathGroup Archive: April 2000 [00381]

[Date Index] [Thread Index] [Author Index]

Re: Re: A simple programming question.

To: mathgroup at smc.vnet.net
Subject: [mg23234] Re: [mg23206] Re: [mg23174] A simple programming question.
From: Carl Woll <carlw at u.washington.edu>
Date: Tue, 25 Apr 2000 01:40:37 -0400 (EDT)
Organization: Physics Department, U of Washington
References: <200004210348.XAA19792@smc.vnet.net> <v03102800b52a0e978e35@[209.77.117.85]>
Sender: owner-wri-mathgroup at wolfram.com

Hi Russell,

I have an idea for a vunion function which should be much faster than the
basic vunion function that you started with (in your original programming
challenge). However, I'm not clear on what exactly you want the vunion
function to do. The vunion function I have in mind would declare two points to
be the same if the maximum of the difference of the coordinates were less than
some tolerance, which I'll abbreviate by tol. This doesn't seem to be the
definition you used in your vunion function, but perhaps it's close enough.
First, I'll define a vunion function which carries out the above:

tol=0.000001;

vunion[v_] := Union[v, SameTest ->
      (Max[Abs[#1 - #2]] < tol &)]

For a large set of points, this vunion function is very slow, since given a
set of N points, it goes like O(N^2). This is because the current Union
function works by comparing every possible pair of points to determine if they
are the same. My idea for speeding up this function is to first split up the
data set into a disjoint union of smaller sets where the first coordinates of
these smaller sets are basically (but not really) a distance tol apart, which
is an O(N log N) process. The not really in the previous sentence is there
because if the tolerance is large enough, then the splitting may not break up
the set into smaller pieces after all. For example, with a tolerance of .1,
the following data set will not be broken into smaller pieces:

{{.05,1,1},{.1,1,1},{.15,1,1},{.2,1,1},{.25,1,1}}

even though the first point and the last point are clearly farther than a
tolerance apart. At any rate, after splitting up into smaller subsets, apply
the vunion function to each of these smaller subsets. Hence, my modified
vunion function is

vunion1[v_] := Module[{r},
    r = Split[Sort[v], (Abs[#1[[1]] - #2[[1]]] < tol &)];
    Flatten[vunion /@ r, 1]]

In my tests, vunion1 is 20 times faster than vunion, but this was done with a
high tolerance on a random data set. It may behave very differently on your
data sets. A further speedup may be possible if we further refine the data set
into still smaller subsets based on the second coordinate. Hence, we have

vunion2[v_] := Module[{r},
    r = Split[Sort[v], (Abs[#1[[1]] - #2[[1]]] < tol &)];
    r = Flatten[
        Split[Sort[#, (#1[[2]] < #2[[2]] &)], (Abs[#1[[2]] - #2[[2]]] <
                    tol &)] & /@ r, 1];
    Flatten[vunion /@ r, 1]]

To compare the output of vunion2 to vunion or vunion1, you will need to sort
it first.

Please try these vunion functions on your test cases and let me know if they
do what you want and are competitive.

Carl Woll

Russell Towle wrote:

> Hi Carl, you wrote,
> >
> >Since I'm basically responsible for that piece of code, I'll try to give
> >you an explanation. First concentrate on the heart of the code:
> >
> >i[n_] := (i[n] = Null; n);
>
> To me this was a mysterious bit of code; your explanation helped make it
> intelligible. My instinct, when first looking at it, was to define i[n]
> outside the definition of OrderedUnion. It didn't make sense to see it
> inside a Block like that.
>
> Do you have any ideas for applying OrderedUnion to a list of n-vectors with
> machine precision components, where changes of sign afflict numbers which
> are actually zero? The only workable version David Park and I were able to
> conjure up required using Round to make a proxy for the list of vectors and
> applying OrderedUnion to the proxy.
>
> Your OrderedUnion function is part of a WRI archive of code snippets.
>
> Russell Towle
> Box 141
> Dutch Flat, CA 95714
>
> (530) 389-2872
>

References:
- A simple programming question.
  - From: Jack Goldberg <jackgold@math.lsa.umich.edu>

Prev by Date: RE: Q: open palletes in a specific on screen position.

Next by Date: Re: fastest way to do pair-sum / make pair-list

Previous by thread: Re: A simple programming question.

Next by thread: Re: A simple programming question.