Re: Re: A simple programming question.

*To*: mathgroup at smc.vnet.net
*Subject*: [mg23234] Re: [mg23206] Re: [mg23174] A simple programming question.
*From*: Carl Woll <carlw at u.washington.edu>
*Date*: Tue, 25 Apr 2000 01:40:37 -0400 (EDT)
*Organization*: Physics Department, U of Washington
*References*: <200004210348.XAA19792@smc.vnet.net> <v03102800b52a0e978e35@[209.77.117.85]>
*Sender*: owner-wri-mathgroup at wolfram.com

Hi Russell,

I have an idea for a vunion function which should be much faster than the basic vunion function you started with (in your original programming challenge). However, I'm not clear on exactly what you want the vunion function to do. The vunion function I have in mind declares two points to be the same if the maximum difference of their coordinates is less than some tolerance, which I'll abbreviate as tol. This doesn't seem to be the definition you used in your vunion function, but perhaps it's close enough.

First, I'll define a vunion function which carries out the above:

    tol = 0.000001;
    vunion[v_] := Union[v, SameTest -> (Max[Abs[#1 - #2]] < tol &)]

For a large set of points, this vunion function is very slow: given a set of N points, it goes like O(N^2), because the current Union function works by comparing every possible pair of points to determine whether they are the same.

My idea for speeding up this function is to first split the data set into a disjoint union of smaller sets whose first coordinates are basically (but not really) a distance tol apart, which is an O(N log N) process. The "not really" in the previous sentence is there because if the tolerance is large enough, the splitting may not break the set into smaller pieces after all. For example, with a tolerance of .1, the following data set will not be broken into smaller pieces, even though the first point and the last point are clearly farther than a tolerance apart:

    {{.05,1,1},{.1,1,1},{.15,1,1},{.2,1,1},{.25,1,1}}

At any rate, after splitting into smaller subsets, apply the vunion function to each of these smaller subsets. Hence, my modified vunion function is

    vunion1[v_] :=
      Module[{r},
        r = Split[Sort[v], (Abs[#1[[1]] - #2[[1]]] < tol &)];
        Flatten[vunion /@ r, 1]]

In my tests, vunion1 is 20 times faster than vunion, but this was done with a high tolerance on a random data set. It may behave very differently on your data sets.
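For readers outside Mathematica, here is a minimal Python sketch of the same split-then-union idea; all names below (naive_vunion, split_vunion) are my own, not from the post, and this is an illustration of the technique rather than a drop-in replacement:

```python
# Tolerance union: two points count as "the same" if the maximum
# coordinate-wise difference is below TOL (mirroring the SameTest above).
TOL = 1e-6

def naive_vunion(points, tol=TOL):
    """O(N^2) union: compare each point against every point kept so far."""
    kept = []
    for p in points:
        if not any(max(abs(a - b) for a, b in zip(p, q)) < tol for q in kept):
            kept.append(p)
    return kept

def split_vunion(points, tol=TOL):
    """Sort by first coordinate, split into runs wherever consecutive
    first coordinates differ by at least tol (like Split), then run the
    quadratic union only within each (much smaller) run."""
    if not points:
        return []
    pts = sorted(points)
    runs, run = [], [pts[0]]
    for p in pts[1:]:
        if abs(p[0] - run[-1][0]) < tol:
            run.append(p)
        else:
            runs.append(run)
            run = [p]
    runs.append(run)
    return [q for run in runs for q in naive_vunion(run, tol)]
```

Note that, just as in the post, a large tolerance can chain consecutive points into one long run, in which case the split buys nothing.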
A further speedup may be possible if we further refine the data set into still smaller subsets based on the second coordinate. Hence, we have

    vunion2[v_] :=
      Module[{r},
        r = Split[Sort[v], (Abs[#1[[1]] - #2[[1]]] < tol &)];
        r = Flatten[
              Split[Sort[#, (#1[[2]] < #2[[2]] &)],
                (Abs[#1[[2]] - #2[[2]]] < tol &)] & /@ r, 1];
        Flatten[vunion /@ r, 1]]

To compare the output of vunion2 to vunion or vunion1, you will need to sort it first. Please try these vunion functions on your test cases and let me know if they do what you want and are competitive.

Carl Woll

Russell Towle wrote:

> Hi Carl, you wrote,
>
> >Since I'm basically responsible for that piece of code, I'll try to give
> >you an explanation. First concentrate on the heart of the code:
> >
> >i[n_] := (i[n] = Null; n);
>
> To me this was a mysterious bit of code; your explanation helped make it
> intelligible. My instinct, when first looking at it, was to define i[n]
> outside the definition of OrderedUnion. It didn't make sense to see it
> inside a Block like that.
>
> Do you have any ideas for applying OrderedUnion to a list of n-vectors with
> machine-precision components, where changes of sign afflict numbers which
> are actually zero? The only workable version David Park and I were able to
> conjure up required using Round to make a proxy for the list of vectors and
> applying OrderedUnion to the proxy.
>
> Your OrderedUnion function is part of a WRI archive of code snippets.
>
> Russell Towle
> Box 141
> Dutch Flat, CA 95714
>
> (530) 389-2872
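Returning to the vunion2 refinement above: the two-level split (first coordinate, then second coordinate within each run) can likewise be sketched in Python. All names here are hypothetical, and this is only an illustration of the technique under the same maximum-coordinate-difference tolerance:

```python
TOL = 1e-6

def split_runs(points, key, tol=TOL):
    """Group points, already sorted on coordinate `key`, into runs whose
    consecutive key-coordinates differ by less than tol (like Split)."""
    runs = []
    for p in points:
        if runs and abs(p[key] - runs[-1][-1][key]) < tol:
            runs[-1].append(p)
        else:
            runs.append([p])
    return runs

def vunion2_sketch(points, tol=TOL):
    # First-level split on coordinate 0 (tuples sort lexicographically,
    # so sorted() orders primarily on the first coordinate).
    level1 = split_runs(sorted(points), 0, tol)
    # Second level: re-sort each run on coordinate 1 and split again.
    level2 = [sub
              for run in level1
              for sub in split_runs(sorted(run, key=lambda p: p[1]), 1, tol)]
    # Finish each small subset with the quadratic tolerance union.
    out = []
    for sub in level2:
        kept = []
        for p in sub:
            if not any(max(abs(a - b) for a, b in zip(p, q)) < tol
                       for q in kept):
                kept.append(p)
        out.extend(kept)
    return out
```

As with vunion2 itself, the output order depends on the splitting, so sort the result before comparing it with the output of the simpler versions.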

**References**:
- **A simple programming question.** *From:* Jack Goldberg <jackgold@math.lsa.umich.edu>