MathGroup Archive 1999

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Re: Re: Re: list manipulation

A few remarks are in order.

(1) As noted by Hartmut Wolf, sorting the events list can give a very
fast algorithm. One in essence does a linear walk simultaneously along
both the events and gaps lists. If the events list has length n and the
gaps Interval has k segments, the complexity is O(n*log(n) + k).

(2) To achieve this complexity one must avoid use of AppendTo, as noted
by Allan Hayes.

(3) One might ask whether direct use of IntervalMemberQ can be made
competitive. The answer is a qualified yes. At present IntervalMemberQ
does a linear search of its segments for membership testing, and a
binary search would be much faster. We have implemented this latter
approach in our development version.

I illustrate with an example that has been posted:

Function[{n}, gapsX[n] =
  Module[{kappa = 250./n},
    Table[With[{c = Random[Real, {0., 100.}],
    d = kappa Exp[-Random[Real, {0., 10.}]]},
      {c - d, c + d}], {n}]];
{Timing[truegapsX[n] = Interval @@ gapsX[n];][[1]],
  Length[truegapsX[n]]}] /@ {100, 300, 1000, 3000, 10000}

Function[{n}, events[n] = Table[Random[Real, 100.], {n}];] /@
  {100, 300, 1000, 3000, 10000};

Function[n, Timing[se[n] = 
  Select[events[n], ! IntervalMemberQ[
    truegapsX[n], #]&];][[1]] //
  (Print[#]; #) &] /@ {100, 300, 1000, 3000, 10000}

Out[4]= {0.02 Second, 0.05 Second, 0.18 Second, 0.62 Second, 2.38

Is this as fast as the Hayes/Wolf method? My machine is probably faster
than Allan Hayes' so it is hard to tell if you compare these timings
directly to those he posted; it seems that they have similar complexity.
But the real answer is that it all depends on relative sizes of events
vs gaps. The complexity of using Select with IntervalMemberQ is now
O(n*log(k)) in the example shown above, and there may also be a constant
overhead factor lost to the Hayes/Wolf method when using data consisting
of machine reals due to the fact that the List stuff will be packed
(hence very fast to sort) whereas the Interval segments will not be.
Among other things, this emphasizes the importance of making sure (by
coercion, if necessary), that you data exclusively of that type.

(4) Whilst poking into the Interval code we made some other
improvements. In particular adding and multiplying intervals consisting
of many segments will be alot faster in the next kernel release.

Daniel Lichtblau
Wolfram Research

  • Prev by Date: Re: Re: Combinatorica questions!!!
  • Next by Date: Re: complex variable
  • Previous by thread: Re: Re: Re: list manipulation
  • Next by thread: Re: list manipulation