|
[Date Index]
[Thread Index]
[Author Index]
RE: RE: Why is my Implementation of Sorted Trees So Slow?
- To: mathgroup at smc.vnet.net
- Subject: [mg36516] RE: [mg36502] RE: [mg36485] Why is my Implementation of Sorted Trees So Slow?
- From: "DrBob" <drbob at bigfoot.com>
- Date: Tue, 10 Sep 2002 06:24:49 -0400 (EDT)
- Reply-to: <drbob at bigfoot.com>
- Sender: owner-wri-mathgroup at wolfram.com
On my machine, replacing Null with "" made Husain's code 8 times faster
on the same test: 0.592 seconds versus 4.688.
I suppose making the same change in my code makes no difference because
there are no Nulls in the tree (except one, very briefly), or perhaps
because I have fewer patterns that involve Nulls.
By rearranging the order of my rules, and deleting a couple that were
never used, I made my algorithm about 15% faster, getting the time down
to 0.906. Mihajlo's amazing discovery beats that, of course -- it's 50%
faster.
I suppose what we've found is that (1) my algorith makes the code four
to five times faster by reducing expression complexity, tree depth, and
memory use, but it's a slower algorithm otherwise (too many rules?); and
(2) Mathematica pattern matching and code execution has a real problem
with Null used as if it means something.
Bobby
-----Original Message-----
From: Mihajlo Vanevic [mailto:mvane at EUnet.yu]
To: mathgroup at smc.vnet.net
Subject: [mg36516] Re: [mg36502] RE: [mg36485] Why is my Implementation of Sorted
Trees So Slow?
Check this out:
Changing Null to "NULL" (or anything else) speeds up
the Hussain's code!!
Clear[add];
add["NULL", x_Real] := node[x, "NULL", "NULL"];
add[node[x_Real, lower_, higher_], y_Real] /; x > y :=
node[x, add[lower, y], higher];
add[node[x_Real, lower_, higher_], y_Real] := node[x, lower, add[higher,
y]];
nums = "NULL";
SeedRandom[5];
Do[nums = add[nums, Random[]], {5000}] // Timing
Out
{5.11 Second, Null}
Clear[add];
add[Null, x_Real] := node[x, Null, Null];
add[node[x_Real, lower_, higher_], y_Real] /; x > y :=
node[x, add[lower, y], higher];
add[node[x_Real, lower_, higher_], y_Real] := node[x, lower, add[higher,
y]];
nums = Null;
SeedRandom[5];
Do[nums = add[nums, Random[]], {5000}] // Timing
Out
{41.03 Second, Null}
Also:
Simplify[1==Null]
1==Null
doesn't evaluate to false!!
Btw. changing Null to "NULL" in your code doesn't infulence the speed...
Regards,
Mihajlo Vanevic
mvane at EUnet.yu
2002-09-09
**************************************************************
* At 2002-09-09, 00:29:00
* DrBob, drbob at bigfoot.com wrote:
**************************************************************
>You're not doing anything dumb as far as I can see, but you're using
far
>more memory than necessary. Here are statistics for your algorithm on
>my machine:
>
>SeedRandom[5];
>nums = Null
>Do[nums = add[nums, Random[]], {5000}]; // Timing
>nums // LeafCount
>nums // ByteCount
>Count[nums, Null, Infinity]
>Count[nums, node[_, Null, _], Infinity]
>Count[nums, node[_, _, Null], Infinity]
>Count[nums, node[_, Null, Null], Infinity]
>Count[nums, node[___], Infinity]
>Count[nums, node[_, _node | _?NumericQ, _node | _?NumericQ], Infinity]
>Depth[nums]
>
>{4.406000000000001*Second, Null}
>15001
>240000
>5001 (* Null values in the tree *)
>2458 (* nodes without left offspring *)
>2543 (* nodes without right offspring *)
>1669 (* leaf nodes *)
>4999 (* total nodes *)
>1667 (* nodes with both left and right offspring *)
>28 (* minus one, makes 27 levels in the tree *)
>
>It's clearly not ideal to store 5001 Nulls for 5000 actual values!
>
>Here's a method that doesn't change the algorithm much, but changes the
>storage method a great deal:
>
>First of all, I'm lazy, so I'll redefine <,>,<=,>=, etc. to make the
>code simpler:
>
>Unprotect[Less, Greater, LessEqual, GreaterEqual];
>Less[a : _node ..] := Less @@ (First /@ {a})
>LessEqual[a : _node ..] := LessEqual @@ (First /@ {a})
>Greater[a : _node ..] := Greater @@ (First /@ {a})
>GreaterEqual[a : _node ..] := GreaterEqual @@ (First /@ {a})
>Protect[Less, Greater, LessEqual, GreaterEqual];
>
>Again because I'm lazy, I'll define "node" so that I don't have to
>consciously avoid leaf nodes and unnecessary nesting:
>
>ClearAll[node]
>node[a___, node[b_], c___] = node[a, b, c];
>node[node[a__]] = node[a];
>
>Here's my "add" function:
>
>Clear[add]
>add[Null, x_] = node[x];
>add[x_, (y_)?NumericQ] := add[node[x], node[y]]
>add[(x_)?NumericQ, y_] := add[node[x], node[y]]
>add[node[x_], node[y_]] := If[x > y, node[x, y], node[y, x]] (* <--
>THERE! *)
>add[node[x_, lower_], y_node] :=
> Which[y >= node[x], node[x, lower, y],
> y <= node[lower], node[lower, y, x], True, node[y, lower, x]]
>add[node[x_, lower_, higher_], y_node] /;
> node[y] <= node[x] := node[x, add[lower, y], higher]
>add[node[x_, lower_, higher_], y_node] := node[x, lower, add[higher,
>y]]
>
>Where I have a pointer is the code that prevents adding right-offspring
>to a leaf node. That avoids nodes like your node[a,Null,c]. When you
>would have node[a,b,Null], I have node[a,b]. When you would have
>node[a,Null,Null], I use a itself.
>
>Timing and space requirements are much better now:
>
>SeedRandom[5];
>nums = Null
>Timing[Do[nums = add[nums, Random[]], {5000}]; ]
>
>{2.109*Second, Null}
>
>LeafCount[nums]
>ByteCount[nums]
>Count[nums, Null, Infinity]
>Count[nums, node[_], Infinity]
>Count[nums, node[_, _], Infinity]
>Count[nums, node[_, _, _], Infinity]
>Count[nums, node[___], Infinity]
>Depth[nums]
>
>7852 (* 48 fewer "leaf" expressions,
> mostly due to Nulls and right-offspring eliminated *)
>165624 (* 31 less storage *)
>0 (* no Nulls, versus 5001 *)
>0 (* no leaf nodes, versus 1669 *)
> (* no nodes without left-offspring, versus 2458 *)
>705 (* nodes without right offspring, versus 2543 *)
>2146 (* nodes with both left and right offspring, versus 1667 *)
>2851 (* total nodes *)
>21 (* 21 tree levels versus 27 *)
>
>The logical structure is identical to yours except that there are no
>nodes with only right-offspring. This incidentally tends to reduce
tree
>depth.
>
>The storage format is far different: there are no trivial (childless)
>nodes, and nodes with left-offspring but no right-offspring are stored
>without a Null on the right.
>
>I could make this quite a bit faster, I think, if I spent time on the
>code to eliminate changes to Less, Greater, etc. and the frequent
>nesting followed by unnesting that goes on in the algorithm. It might
>take twice as much code, however, so I like it as is.
>
>You're probably better off storing these things in a heap, of course.
>
>Or -- better yet -- use the built-in Sort and Ordering functions.
>
>Bobby Treat
>
>-----Original Message-----
>From: Husain Ali Al-Mohssen [mailto:husain at MIT.EDU]
To: mathgroup at smc.vnet.net
>Subject: [mg36516] [mg36502] [mg36485] Why is my Implementation of Sorted Trees
So Slow?
>
>
>Hi all,
>
>I am trying to implement a very simple sorted tree to quickly store
some
>real numbers I need. I have written an add, delete, minimum, and pop
>(delete the lowest value) function and they seem to work ok but are
very
>slow. Let's just look @ my implementation of the add part:
>nums=Null;(*my initial blank Tree)
>In[326]:=
>Clear[add]
>
>In[327]:=
>add[Null,x_Real]:=node[x,Null,Null]
>
>add[Null,node[x_Real,lower_,higher_]]=node[x,lower,higher]
>
>In[328]:=
>add[node[x_Real,lower_,higher_],y_Real]:=
> If[x>y,node[x,add[lower,y],higher],node[x,lower,add[higher,y]]]
>
>In[288]:=
>add[node[x_Real,lowerx_,higherx_],node[y_Real,lowery_,highery_]]:=If[x>
y
>,
> node[x,add[lowerx,node[y,lowery,highery]],higherx],
> node[x,lowerx,add[higherx,node[y,lowery,highery]]]
> ]
>
>
>
>Now this is my attempt to test how fast my add works:
>
>SeedRandom[5];
>Do[nums=add[nums,Random[]],{5000}];//Timing
>
>Out[333]=
>{13.279 Second,Null}
>
>(running on Mathematica 4.1 on a win2k VMWare machine running on Linux
>RH7.3
>running on an 1.4GHz Athlon with 1GB of ram).
>
>Questions:
>1. Is this as fast as I can get my code to run?
>2. Am I doing something obviously stupid?
>3. would Compiling things help?
>
>
>Thanks,
>Husain
**************************************************************
Prev by Date:
RE: RE: RE: Generating Two Unit Orthogonal Vectors
Next by Date:
A symbol for Floor
Previous by thread:
RE: Why is my Implementation of Sorted Trees So Slow?
Next by thread:
Re: RE: Why is my Implementation of Sorted Trees So Slow?
|