Re: Re: Re: Re: list manipulation
- To: mathgroup at smc.vnet.net
- Subject: [mg20726] Re: [mg20688] Re: [mg20614] Re: [mg20601] Re: list manipulation
- From: Daniel Lichtblau <danl at wolfram.com>
- Date: Wed, 10 Nov 1999 00:17:42 -0500
- References: <199911020735.CAA26984@smc.vnet.net.> <199911040713.CAA02648@smc.vnet.net.> <3822F4DD.4A5EA11A@gsmail01.darmstadt.dsh.de> <199911070710.CAA10914@smc.vnet.net.>
- Sender: owner-wri-mathgroup at wolfram.com
A few remarks are in order.
(1) As noted by Hartmut Wolf, sorting the events list can give a very
fast algorithm. One in essence does a linear walk simultaneously along
both the events and gaps lists. If the events list has length n and the
gaps Interval has k segments, the complexity is O(n*log(n) + k).
(2) To achieve this complexity one must avoid use of AppendTo, as noted
by Allan Hayes.
(3) One might ask whether direct use of IntervalMemberQ can be made
competitive. The answer is a qualified yes. At present IntervalMemberQ
does a linear search of its segments for membership testing, and a
binary search would be much faster. We have implemented this latter
approach in our development version.
I illustrate with an example that has been posted:
Function[{n}, gapsX[n] =
Module[{kappa = 250./n},
Table[With[{c = Random[Real, {0., 100.}],
d = kappa Exp[-Random[Real, {0., 10.}]]},
{c - d, c + d}], {n}]];
{Timing[truegapsX[n] = Interval @@ gapsX[n];][[1]],
Length[truegapsX[n]]}] /@ {100, 300, 1000, 3000, 10000}
Function[{n}, events[n] = Table[Random[Real, 100.], {n}];] /@
{100, 300, 1000, 3000, 10000};
Function[n, Timing[se[n] =
Select[events[n], ! IntervalMemberQ[
truegapsX[n], #]&];][[1]] //
(Print[#]; #) &] /@ {100, 300, 1000, 3000, 10000}
Out[4]= {0.02 Second, 0.05 Second, 0.18 Second, 0.62 Second, 2.38
Second}
Is this as fast as the Hayes/Wolf method? My machine is probably faster
than Allan Hayes' so it is hard to tell if you compare these timings
directly to those he posted; it seems that they have similar complexity.
But the real answer is that it all depends on relative sizes of events
vs gaps. The complexity of using Select with IntervalMemberQ is now
O(n*log(k)) in the example shown above, and there may also be a constant
overhead factor lost to the Hayes/Wolf method when using data consisting
of machine reals due to the fact that the List stuff will be packed
(hence very fast to sort) whereas the Interval segments will not be.
Among other things, this emphasizes the importance of making sure (by
coercion, if necessary), that you data exclusively of that type.
(4) Whilst poking into the Interval code we made some other
improvements. In particular adding and multiplying intervals consisting
of many segments will be alot faster in the next kernel release.
Daniel Lichtblau
Wolfram Research
- References:
- Re: list manipulation
- From: "jim leddon" <jleddon@home.com>
- Re: Re: list manipulation
- From: "Wolf, Hartmut" <hwolf@debis.com>
- Re: Re: Re: list manipulation
- From: "Allan Hayes" <hay@haystack.demon.co.uk>
- Re: list manipulation