MathGroup Archive 2002

[Date Index] [Thread Index] [Author Index]

Search the Archive

RE: RE: Why is my Implementation of Sorted Trees So Slow?

  • To: mathgroup at smc.vnet.net
  • Subject: [mg36516] RE: [mg36502] RE: [mg36485] Why is my Implementation of Sorted Trees So Slow?
  • From: "DrBob" <drbob at bigfoot.com>
  • Date: Tue, 10 Sep 2002 06:24:49 -0400 (EDT)
  • Reply-to: <drbob at bigfoot.com>
  • Sender: owner-wri-mathgroup at wolfram.com

On my machine, replacing Null with "" made Husain's code 8 times faster
on the same test: 0.592 seconds versus 4.688.

I suppose making the same change in my code makes no difference because
there are no Nulls in the tree (except one, very briefly), or perhaps
because I have fewer patterns that involve Nulls.

By rearranging the order of my rules, and deleting a couple that were
never used, I made my algorithm about 15% faster, getting the time down
to 0.906.  Mihajlo's amazing discovery beats that, of course -- it's 50%
faster.

I suppose what we've found is that (1) my algorith makes the code four
to five times faster by reducing expression complexity, tree depth, and
memory use, but it's a slower algorithm otherwise (too many rules?); and
(2) Mathematica pattern matching and code execution has a real problem
with Null used as if it means something.

Bobby

-----Original Message-----
From: Mihajlo Vanevic [mailto:mvane at EUnet.yu] 
To: mathgroup at smc.vnet.net
Subject: [mg36516] Re: [mg36502] RE: [mg36485] Why is my Implementation of Sorted
Trees So Slow?



Check this out:

Changing Null to "NULL" (or anything else) speeds up
the Hussain's code!!

Clear[add];
add["NULL", x_Real] := node[x, "NULL", "NULL"];
add[node[x_Real, lower_, higher_], y_Real] /; x > y := 
    node[x, add[lower, y], higher];
add[node[x_Real, lower_, higher_], y_Real] := node[x, lower, add[higher,
y]];
nums = "NULL";
SeedRandom[5];
Do[nums = add[nums, Random[]], {5000}] // Timing

Out
{5.11 Second, Null}


Clear[add];
add[Null, x_Real] := node[x, Null, Null];
add[node[x_Real, lower_, higher_], y_Real] /; x > y := 
    node[x, add[lower, y], higher];
add[node[x_Real, lower_, higher_], y_Real] := node[x, lower, add[higher,
y]];
nums = Null;
SeedRandom[5];
Do[nums = add[nums, Random[]], {5000}] // Timing

Out
{41.03 Second, Null}


Also:
Simplify[1==Null]
1==Null
doesn't evaluate to false!!



Btw. changing Null to "NULL" in your code doesn't infulence the speed...


Regards,				 
         Mihajlo Vanevic
         mvane at EUnet.yu
         2002-09-09

**************************************************************
*    At 2002-09-09, 00:29:00 
*        DrBob, drbob at bigfoot.com  wrote:
**************************************************************
>You're not doing anything dumb as far as I can see, but you're using
far
>more memory than necessary.  Here are statistics for your algorithm on
>my machine:
>
>SeedRandom[5];
>nums = Null
>Do[nums = add[nums, Random[]], {5000}]; // Timing
>nums // LeafCount
>nums // ByteCount
>Count[nums, Null, Infinity]
>Count[nums, node[_, Null, _], Infinity]
>Count[nums, node[_, _, Null], Infinity]
>Count[nums, node[_, Null, Null], Infinity]
>Count[nums, node[___], Infinity]
>Count[nums, node[_, _node | _?NumericQ, _node | _?NumericQ], Infinity]
>Depth[nums]
>
>{4.406000000000001*Second,   Null}
>15001
>240000
>5001 (* Null values in the tree *)
>2458 (* nodes without left offspring *)
>2543 (* nodes without right offspring *)
>1669 (* leaf nodes *)
>4999 (* total nodes *)
>1667 (* nodes with both left and right offspring *)
>28   (* minus one, makes 27 levels in the tree *)
>
>It's clearly not ideal to store 5001 Nulls for 5000 actual values!
>
>Here's a method that doesn't change the algorithm much, but changes the
>storage method a great deal:
>
>First of all, I'm lazy, so I'll redefine <,>,<=,>=, etc. to make the
>code simpler:
>
>Unprotect[Less, Greater, LessEqual, GreaterEqual];
>Less[a : _node ..] := Less @@ (First /@ {a})
>LessEqual[a : _node ..] := LessEqual @@ (First /@ {a})
>Greater[a : _node ..] := Greater @@ (First /@ {a})
>GreaterEqual[a : _node ..] := GreaterEqual @@ (First /@ {a})
>Protect[Less, Greater, LessEqual, GreaterEqual];
>
>Again because I'm lazy, I'll define "node" so that I don't have to
>consciously avoid leaf nodes and unnecessary nesting:
>
>ClearAll[node]
>node[a___, node[b_], c___] = node[a, b, c];
>node[node[a__]] = node[a];
>
>Here's my "add" function:
>
>Clear[add]
>add[Null, x_] = node[x];
>add[x_, (y_)?NumericQ] := add[node[x],   node[y]]
>add[(x_)?NumericQ, y_] := add[node[x],   node[y]]
>add[node[x_], node[y_]] :=  If[x > y, node[x, y], node[y, x]] (* <--
>THERE! *)
>add[node[x_, lower_], y_node] :=
>  Which[y >= node[x], node[x, lower, y],
>   y <= node[lower], node[lower, y, x], True,   node[y, lower, x]]
>add[node[x_, lower_, higher_], y_node] /;
>   node[y] <= node[x] :=  node[x, add[lower, y], higher]
>add[node[x_, lower_, higher_], y_node] :=  node[x, lower, add[higher,
>y]]
>
>Where I have a pointer is the code that prevents adding right-offspring
>to a leaf node.  That avoids nodes like your node[a,Null,c].  When you
>would have node[a,b,Null], I have node[a,b].  When you would have
>node[a,Null,Null], I use a itself.
>
>Timing and space requirements are much better now:
>
>SeedRandom[5];
>nums = Null
>Timing[Do[nums = add[nums, Random[]], {5000}]; ]
>
>{2.109*Second, Null}
>
>LeafCount[nums]
>ByteCount[nums]
>Count[nums, Null, Infinity]
>Count[nums, node[_], Infinity]
>Count[nums, node[_, _],  Infinity]
>Count[nums, node[_, _, _],  Infinity]
>Count[nums, node[___], Infinity]
>Depth[nums]
>
>7852   (* 48 fewer "leaf" expressions,
>          mostly due to Nulls and right-offspring eliminated *)
>165624 (* 31 less storage *)
>0    (* no Nulls, versus 5001 *)
>0    (* no leaf nodes, versus 1669 *)
>     (* no nodes without left-offspring, versus 2458 *)
>705  (* nodes without right offspring, versus 2543 *)
>2146 (* nodes with both left and right offspring, versus 1667 *)
>2851 (* total nodes *)
>21   (* 21 tree levels versus 27 *)
>
>The logical structure is identical to yours except that there are no
>nodes with only right-offspring.  This incidentally tends to reduce
tree
>depth.
>
>The storage format is far different: there are no trivial (childless)
>nodes, and nodes with left-offspring but no right-offspring are stored
>without a Null on the right.
>
>I could make this quite a bit faster, I think, if I spent time on the
>code to eliminate changes to Less, Greater, etc. and the frequent
>nesting followed by unnesting that goes on in the algorithm.  It might
>take twice as much code, however, so I like it as is.
>
>You're probably better off storing these things in a heap, of course.
>
>Or -- better yet -- use the built-in Sort and Ordering functions.
>
>Bobby Treat
>
>-----Original Message-----
>From: Husain Ali Al-Mohssen [mailto:husain at MIT.EDU]
To: mathgroup at smc.vnet.net
>Subject: [mg36516] [mg36502] [mg36485] Why is my Implementation of Sorted Trees
So Slow?
>
>
>Hi all,
>
>I am trying to implement a very simple sorted tree to quickly store
some
>real numbers I need. I have written an add, delete, minimum, and pop
>(delete the lowest value) function and they seem to work ok but are
very
>slow. Let's just look @ my implementation of the add part:
>nums=Null;(*my initial blank Tree)
>In[326]:=
>Clear[add]
>
>In[327]:=
>add[Null,x_Real]:=node[x,Null,Null]
>
>add[Null,node[x_Real,lower_,higher_]]=node[x,lower,higher]
>
>In[328]:=
>add[node[x_Real,lower_,higher_],y_Real]:=
>  If[x>y,node[x,add[lower,y],higher],node[x,lower,add[higher,y]]]
>
>In[288]:=
>add[node[x_Real,lowerx_,higherx_],node[y_Real,lowery_,highery_]]:=If[x>
y
>,
>    node[x,add[lowerx,node[y,lowery,highery]],higherx],
>    node[x,lowerx,add[higherx,node[y,lowery,highery]]]
>    ]
>
>
>
>Now this is my attempt to test how fast my add works:
>
>SeedRandom[5];
>Do[nums=add[nums,Random[]],{5000}];//Timing
>
>Out[333]=
>{13.279 Second,Null}
>
>(running on Mathematica 4.1 on a win2k VMWare machine running on Linux
>RH7.3
>running on an 1.4GHz Athlon with 1GB of ram).
>
>Questions:
>1. Is this as fast as I can get my code to run?
>2. Am I doing something obviously stupid?
>3. would Compiling things help?
>
>
>Thanks,
>Husain
**************************************************************









  • Prev by Date: RE: RE: RE: Generating Two Unit Orthogonal Vectors
  • Next by Date: A symbol for Floor
  • Previous by thread: RE: Why is my Implementation of Sorted Trees So Slow?
  • Next by thread: Re: RE: Why is my Implementation of Sorted Trees So Slow?