RE: RE: Re: Why is my Implementation of Sorted Trees So Slow?

*To*: mathgroup at smc.vnet.net
*Subject*: [mg36680] RE: [mg36649] RE: [mg36644] Re: [mg36485] Why is my Implementation of Sorted Trees So Slow?
*From*: "DrBob" <drbob at bigfoot.com>
*Date*: Wed, 18 Sep 2002 02:10:24 -0400 (EDT)
*Reply-to*: <drbob at bigfoot.com>
*Sender*: owner-wri-mathgroup at wolfram.com

Daniel, I'd say your algorithm (as an ALGORITHM) is precisely the same as Husain's. The implementation is different, but the logical steps are the same (except perhaps for <= vs. <), and so is the logical structure of the tree produced. Even the storage method is pretty much the same. Using {} instead of Null and List instead of 'node', and putting values in the middle position, doesn't add up to a change that could affect algorithmic complexity. Even the fact that Flatten gives a sorted List with your implementation is something you get "for free", and the advantage comes only when it's time to output a sorted list. It has nothing to do with the work of building the tree.

Algorithmic complexity (absent hidden factors under the kernel's control) is precisely the same for your solution and Husain's (mine too, I think). The speed differences we're seeing have to do with nuances of the kernel, pattern matching costs, etc. Our three nearly equivalent algorithms simply incur different hidden costs.

That's the case with many of the problems that turn into "speed wars" among us. We compare implementation speed (including hidden costs), not pure algorithmic complexity. Compiled code or packed arrays often make all the difference, and it usually happens in the background where we can't see it. As you've hinted below, the vagaries of Update can make a linear algorithm quadratic, but may NOT so plague a slightly different implementation of the very same algorithm.

So... I'm suggesting we might want to be careful when we leap to conclusions about "algorithmic complexity" based on a few timed experiments.

Bobby Treat

-----Original Message-----
From: danl at wolfram.com [mailto:danl at wolfram.com]
To: mathgroup at smc.vnet.net
Subject: [mg36680] Re: [mg36649] RE: [mg36644] Re: [mg36485] Why is my Implementation of Sorted Trees So Slow?
DrBob wrote:
>
> This code is almost identical to yours in principle, yet 30% faster:
>
> (compared with the code in your notebook, not the code in the post
> below)
>
> ClearAll[emptyTree, treeInsert];
> emptyTree = {};
>
> treeInsert[emptyTree, elem_] := {emptyTree, elem, emptyTree}
> treeInsert[tree_, elem_] /; SameQ[tree[[2]], elem] := tree
> treeInsert[tree_, elem_] /; OrderedQ[{tree[[2]], elem}] :=
>   {First[tree], tree[[2]], treeInsert[Last[tree], elem]}
> treeInsert[tree_, elem_] :=
>   {treeInsert[First[tree], elem], tree[[2]], Last[tree]}
>
> Here's an even simpler version, same speed:
>
> ClearAll[emptyTree, treeInsert];
> emptyTree = {};
>
> treeInsert[emptyTree, elem_] := {emptyTree, elem, emptyTree}
> treeInsert[tree_, elem_] := Which[
>   SameQ[tree[[2]], elem], tree,
>   OrderedQ[{tree[[2]], elem}],
>     {First@tree, tree[[2]], treeInsert[Last@tree, elem]},
>   True, {treeInsert[First@tree, elem], tree[[2]], Last@tree}]

I had tried variations on this but they all seemed ever so slightly slower. Might be version and OS dependent.

> I see one obvious advantage of your algorithm over Husain's and my
> earlier code, though I'm not sure it explains its better behavior. That
> is, while my algorithm made pattern matching more burdensome, yours
> virtually eliminated it.
>
> Also, the biggest improvement I've seen comes simply from replacing Null
> with anything else in Husain's code. Again, that seems to suggest that
> time spent on pattern matching is the problem. All the algorithms do
> approximately the same real work on the tree structure.
>
> I really like how Flatten changes your tree into a sorted list. Very
> nice!

That goes by the name of "tree sort". Using Flatten is basically a shortcut for walking the tree left-to-right.

> Bobby Treat

It seems I did not look hard enough at the original post or the follow-ups. In point of fact, regarding speed, there IS no obvious advantage of my code over that first version.
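For readers outside Mathematica, here is a minimal Python sketch of the same tree-sort idea (an illustrative port by the editor, not code from the thread): build the tree by repeated insertion, then an in-order walk of the {left, value, right} structure, which is what Flatten accomplishes for the Mathematica representation, yields the elements in sorted order.

```python
# Sketch of the common sorted-tree algorithm discussed in this thread,
# ported to Python for illustration. An empty tree is None; a node is
# a tuple (left, value, right), mirroring the {left, value, right} lists.

def tree_insert(tree, elem):
    """Insert elem, ignoring duplicates, as the posted treeInsert does."""
    if tree is None:
        return (None, elem, None)
    left, value, right = tree
    if elem == value:
        return tree                                  # duplicate: unchanged
    if value < elem:                                 # mirrors OrderedQ test
        return (left, value, tree_insert(right, elem))
    return (tree_insert(left, elem), value, right)

def flatten(tree):
    """In-order walk: the Python analogue of Flatten on the nested lists."""
    if tree is None:
        return []
    left, value, right = tree
    return flatten(left) + [value] + flatten(right)

tree = None
for x in [5, 3, 8, 1, 4]:
    tree = tree_insert(tree, x)
print(flatten(tree))  # [1, 3, 4, 5, 8]
```

Building the tree does the sorting work; reading it out is a single left-to-right traversal, which is why the sorted output comes "for free".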
The advantage is there, of course, as timing checks indicate. But it is quite far from obvious. The reason is related to the (rare) need for Update. It is pointed out in the manual that "infinite" evaluation is not in fact possible (no surprise thus far). A consequence is that Mathematica requires internal optimization features to keep reevaluation to a minimum. Update may be required in cases where such optimizations are overzealous.

What we have in the case of this example is, roughly, the opposite. The code that determines the need for reevaluation is simply not working sufficiently well. It does make the correct determination, but only after looking over the entire structure that is being evaluated. As we have a structure growing linearly in the number of iterations, and it gets mostly traversed each iteration, the algorithmic complexity becomes at least O(n^2). Not a disaster unless the appropriate complexity is smaller, which was the case for this example.

This is not a bug insofar as it is a (regrettably) necessary consequence of Mathematica evaluation semantics. The problems it can cause are all the same quite undesirable. We've made some inroads on this problem by substantially enlarging the class of expressions for which reevaluation may be quickly short-circuited, with the idea of fixing all but the most pathological of examples. The status of that work is unfortunately not known to me at present, though I'll try to find out about it.

Daniel Lichtblau
Wolfram Research
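The quadratic blow-up described above can be made concrete with a toy model (illustrative Python only; this is not the kernel's actual reevaluation code): if each of n insertions triggers a scan of the whole structure built so far, the total work is 1 + 2 + ... + n = n(n+1)/2, which is O(n^2), even though each insertion by itself is cheap.

```python
# Toy model of the cost pattern Daniel describes: a structure grows by
# one element per iteration, but some bookkeeping step (here, a length
# check standing in for a full reevaluation scan) walks the whole
# structure every time.

def build_with_rescan(n):
    """Return total traversal work for n insertions with a full rescan each."""
    ops = 0
    structure = []
    for i in range(n):
        structure.append(i)      # the insertion itself is O(1)
        ops += len(structure)    # models scanning the entire structure
    return ops

# Total work is the triangular number n(n+1)/2, i.e. O(n^2).
print(build_with_rescan(1000))  # 500500
```

The per-iteration scan, not the insertions, is what turns a linear build into quadratic total time.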