Re: Re: Re: Re: list manipulation
- To: mathgroup at smc.vnet.net
- Subject: [mg20726] Re: [mg20688] Re: [mg20614] Re: [mg20601] Re: list manipulation
- From: Daniel Lichtblau <danl at wolfram.com>
- Date: Wed, 10 Nov 1999 00:17:42 -0500
- References: <199911020735.CAA26984@smc.vnet.net.> <199911040713.CAA02648@smc.vnet.net.> <3822F4DD.4A5EA11A@gsmail01.darmstadt.dsh.de> <199911070710.CAA10914@smc.vnet.net.>
- Sender: owner-wri-mathgroup at wolfram.com
A few remarks are in order. (1) As noted by Hartmut Wolf, sorting the events list can give a very fast algorithm. One in essence does a linear walk simultaneously along both the events and gaps lists. If the events list has length n and the gaps Interval has k segments, the complexity is O(n*log(n) + k). (2) To achieve this complexity one must avoid use of AppendTo, as noted by Allan Hayes. (3) One might ask whether direct use of IntervalMemberQ can be made competitive. The answer is a qualified yes. At present IntervalMemberQ does a linear search of its segments for membership testing, and a binary search would be much faster. We have implemented this latter approach in our development version. I illustrate with an example that has been posted: Function[{n}, gapsX[n] = Module[{kappa = 250./n}, Table[With[{c = Random[Real, {0., 100.}], d = kappa Exp[-Random[Real, {0., 10.}]]}, {c - d, c + d}], {n}]]; {Timing[truegapsX[n] = Interval @@ gapsX[n];][[1]], Length[truegapsX[n]]}] /@ {100, 300, 1000, 3000, 10000} Function[{n}, events[n] = Table[Random[Real, 100.], {n}];] /@ {100, 300, 1000, 3000, 10000}; Function[n, Timing[se[n] = Select[events[n], ! IntervalMemberQ[ truegapsX[n], #]&];][[1]] // (Print[#]; #) &] /@ {100, 300, 1000, 3000, 10000} Out[4]= {0.02 Second, 0.05 Second, 0.18 Second, 0.62 Second, 2.38 Second} Is this as fast as the Hayes/Wolf method? My machine is probably faster than Allan Hayes' so it is hard to tell if you compare these timings directly to those he posted; it seems that they have similar complexity. But the real answer is that it all depends on relative sizes of events vs gaps. The complexity of using Select with IntervalMemberQ is now O(n*log(k)) in the example shown above, and there may also be a constant overhead factor lost to the Hayes/Wolf method when using data consisting of machine reals due to the fact that the List stuff will be packed (hence very fast to sort) whereas the Interval segments will not be. Among other things, this emphasizes the importance of making sure (by coercion, if necessary), that you data exclusively of that type. (4) Whilst poking into the Interval code we made some other improvements. In particular adding and multiplying intervals consisting of many segments will be alot faster in the next kernel release. Daniel Lichtblau Wolfram Research
- References:
- Re: list manipulation
- From: "jim leddon" <jleddon@home.com>
- Re: Re: list manipulation
- From: "Wolf, Hartmut" <hwolf@debis.com>
- Re: Re: Re: list manipulation
- From: "Allan Hayes" <hay@haystack.demon.co.uk>
- Re: list manipulation