MathGroup Archive: February 2001 [00175]

[Date Index] [Thread Index] [Author Index]
(long) Re: Speed Challenge
To: mathgroup at smc.vnet.net
Subject: [mg27202] (long) Re: [mg27159] Speed Challenge
From: Daniel Lichtblau <danl at wolfram.com>
Date: Fri, 9 Feb 2001 03:10:51 -0500 (EST)
Sender: owner-wri-mathgroup at wolfram.com
"Gareth J. Russell" wrote:
> 
> For the efficiency and speed demons among you: a challenge.
> 
> Take a list of integers that represents, say, the number balls in each of a
> set of boxes. I want the fastest code that removes ONE ball at random. The
> ball is chosen at random from all the balls (rather than a box being
> randomly chosen). Here is my suggestion, and its speed over 10,000
> iterations on my system:
> 
> removeone = Compile[{{l, _Integer, 1}}, Module[{chosenball, chosenbox},
>         chosenball = Random[Integer, {1, Apply[Plus, l]}];
>         chosenbox =
>           Position[
>               Map[If[# ? chosenball, 1, 0] &, Drop[FoldList[Plus, 0, l],
> 1]],
>               1][[1, 1]];
>         MapAt[# - 1 &, l, chosenbox]
>         ]
>       ];
> 
> Do[removeone[{5, 4, 7, 6, 5, 4, 7, 8, 9, 4, 5, 6}], {10000}]; // Timing
> 
> {1.06667 Second, Null}
> 
> I think there must be faster way to do the middle expression (chosenbox =
> ...). Can you beat it!
> 
> Gareth
> 
> P.S. I do have a practical use for this...
> 
> ==================================================
> Dr. Gareth J. Russell
> University of Tennessee
> Columbia University (as visitor)
> 
> MAILING ADDRESS (new as of 15TH JANUARY, 2001)
> 
> 110 Geosciences
>   Lamont-Doherty Earth Observatory
>     Columbia University
>       61 Route 9W
>         Palisades, NY 10964, U.S.A.
> 
> PHONE: ++1 845 365 8683
> FAX: ++1 845 365 8156 (not conveniently located, avoid if possible)
> E-MAIL: russell at cerc.columbia.edu   (NO CHANGE)
> WWW: http://web.utk.edu/~grussell   (NO CHANGE)
> ==================================================

I fixed some typos. Also note that I use "box" and "bin" interchangeably
below.

removeone = Compile[{{l,_Integer,1}}, Module[
	{chosenball,chosenbox},
	chosenball = Random[Integer, {1,Apply[Plus,l]}];
	chosenbox = Position[
	  Map[If[# >= chosenball, 1, 0] &,
	  Drop[FoldList[Plus,0,l],1]],1][[1,1]];
	MapAt[# - 1 &, l, chosenbox]
	]];

mylist = {5, 4, 7, 6, 5, 4, 7, 8, 9, 4, 5, 6};

In[82]:= Timing[Do[removeone[mylist], {10000}]]
Out[82]= {1.58 Second, Null}

You can save some of time (factor > 2) by not summing the list each call
and by not doing that FoldList and Position. For this you would prepend
to your list the sum of the box sizes, and decrement it in the result
along with the size of the relevant bin.

removeonetoo = Compile[{{ll,_Integer,1}},
    Module[{chosenball,chosenbox=2,sum,res=ll},
        chosenball = Random[Integer, {1,ll[[1]]}];
	    sum = ll[[2]];
	    While[chosenball>sum,
            chosenbox++;
            sum += ll[[chosenbox]];
            ];
        res[[1]]--;
        res[[chosenbox]]--;
        res
        ]
    ];

In[84]:= InputForm[newlist = Prepend[mylist, Apply[Plus,mylist]]]
Out[84]//InputForm= {70, 5, 4, 7, 6, 5, 4, 7, 8, 9, 4, 5, 6}

In[85]:= removeonetoo[newlist]
Out[85]= {69, 5, 4, 7, 6, 5, 4, 7, 8, 9, 3, 5, 6}

In[86]:= Timing[Do[removeonetoo[newlist], {10000}]]
Out[86]= {0.59 Second, Null}


There are two other ways to do this sort of thing, and which of the
possibilities you will want to use depends on some input sizes. I'll
show the simpler one first. It is essentially the fast shuffle I posted
in a thread several weeks ago.

To start, you "inflate" your bins by repeating the bin number once for
every ball it contains. Prepend to this a "pointer" telling us where to
start. When called, swap the ball in the start position with a ball
randomly chosen between there and the end, and increment the start
value. We do this one a bit differently to avoid recreating at every
call a large list. Instead we'll alter in place by making the list a
held argument. Unfortunately we now lose the ability to use Compile.

removeonetwostep[ll_] := Module[
	{start=ll[[1]]+1,ball,tmp},
	(*ll[[1]]++;*)
	ball = Random[Integer,{start,Length[ll]}];
	ll[[{start,ball}]] = ll[[{ball,start}]];
	ll
	]

In[113]:= InputForm[newlist2 =
  Prepend[Flatten[MapIndexed[Table[#2,{#1}]&, mylist]], 1]]   
Out[113]//InputForm= 
{1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 5,
5, 
 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9,
9, 
 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 11, 11, 11, 11, 11, 12, 12, 12, 12,
12, 12}

In[114]:= InputForm[removeonetwostep[newlist2]]
Out[114]//InputForm= 
{1, 9, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 5,
5, 
 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9,
9, 
 9, 9, 9, 9, 1, 9, 10, 10, 10, 10, 11, 11, 11, 11, 11, 12, 12, 12, 12,
12, 12}

For real runs we would uncomment the increment line, and the first
position above would be '2' rather than '1'.

If I use this on newlist2 10000 times it is a bit slow. If instead we do

mylistbig = Table[Random[Integer,{1,100}], {500}];
newlistbig = Prepend[mylistbig, Apply[Plus,mylistbig]];
newlistbig2 =
  Prepend[Flatten[MapIndexed[Table[#2,{#1}]&, mylistbig]], 1];

then this is a bit faster than removeonetoo. If you have alot of bins
but few balls per bin, this may be much faster. In that case you might
want to use Compile and abandon the call-by-reference (the HoldAll
attribute, that is), instead explicitly creating a return value
initialized to the input and altering that instead of the input list.

Here are some trouble spots. If you have a large total number of balls
(say 10^9), then removeonetwostep is out of the question because of the
inflation step. Moreover removeone and removeonetoo have a problem when
the number of bins gets large. Each time called they must do a linear
search to find the bin from which they removed a ball. One way to
improve on this is to use a tree structure. As I want to maintain use of
Compile this gets a bit messy. So I'll walk through the data structure
and code.

Our "tree" will be implemented as a rank-2 tensor of nodes. Each will be
a 4-tuple of the form

{left_pointer,right_pointer,number_to_left,number_in_node}

The "pointers" are simply position numbers in the list of tuples. The
"number_to_left" refers only to the total in the left-side subtree.

We augment with a 4-tuple at the end whose elements are

{total_in_set, center_node, chosen_ball, 0}

We first assume we have this structure (I'll discuss its construction
later). Then we get (gotta love the name) removeonetootree as follows.

(i) Pick a ball between 1 and total.

(ii) Decrement total and store the chosen ball in the last tuple in the
tree.

(iii) Start at the center node. If the number to the left is less than
our ball, and the number to left + number in node is greater than or
equal to our ball, then it came from the center node. Else if number to
left is larger, walk to the left subtree. Else walk to the right
subtree.

(iii-a) To walk to left, we must modify "number to the left" It will now
mean "total number in bins less than current bin" rather than just
number in the left subtree. To get this we subtract the number we had to
the left, go left, and add the number to the left of that subtree.

(iii-b) To walk right, add the number in the node we are leaving (as it
resides in a bin less than the one we are entering), and also add the
number in the left subtree of the bin we enter.

(iv) When we arrive at the appropriate bin, remove one element from its
ball count. Now return the altered tree.

Here is the code.

removeonetootree = Compile[{{ll,_Integer,2}}, Module[
	{ball,bin,sum,res=ll,node,posn,numtoleft,tlen=Length[ll]},
	ball = Random[Integer, {1,res[[tlen,1]]}];
	res[[tlen,1]]--;
	res[[tlen,3]] = ball;
	posn = res[[tlen,2]];
	node = res[[posn]];
	numtoleft = node[[3]];
	While[!(numtoleft<ball && numtoleft+node[[4]]>=ball),
		If [numtoleft>=ball, (* walk to left *)
			res[[posn,3]]--;
			numtoleft -= node[[3]];
			posn = node[[1]];
			node = res[[posn]];
			numtoleft += node[[3]],
		(* else walk to right *)
			numtoleft += node[[4]];
			posn = node[[2]];
			node = res[[posn]];
			numtoleft += node[[3]];
			];
		];
	res[[posn,4]]--;
	res
    ]]

We'll check that it really "compiled" (first, because my early efforts
did not do so successfully, and second because reading the input line
has a nice ring).

In[173]:= removeonetootree[[4]] // InputForm
Out[173]//InputForm= 
{{1, 5}, {12, 0, 1}, {57, 0, 0}, {4, 1, 1}, {64, 1, 0, 0, 0, 1, 0, 2}, 
 {14, 0, 2, 0}, {4, 1, 1}, {14, 0, 1, 1}, {23, 2, 1, 0, 3}, {4, 1, 2}, 
 {64, 1, 0, 0, 0, 2, 0, 1}, {9, 1, 2}, {4, 1, 4}, {4, 1, 5}, 
 {64, 1, 0, 0, 0, 5, 0, 6}, {4, 1, 5}, {32, 5, 7}, {24, 6, 7, 6}, 
 {65, 1, 0, 0, 0, 4, 0, 6}, {4, 3, 1}, {65, 1, 0, 0, 0, 1, 0, 3}, {4, 2,
1}, 
 {64, 1, 0, 0, 0, 1, 0, 6}, {64, 1, 0, 6, 1, 2}, {4, 3, 1}, 
 {64, 2, 0, 1, 0, 4}, {48, 4, 3, 0}, {4, 4, 1}, {64, 2, 0, 1, 0, 7}, 
 {24, 4, 7, 1}, {50, 3, 1, 1}, {52, 0, 1, 2}, {55, 2, 0}, {41, 0, 42}, 
 {50, 3, 4, 2}, {41, 2, 26}, {4, 3, 1}, {64, 1, 0, 6, 0, 1, 0, 7}, {9,
7, 1}, 
 {4, 3, 5}, {4, 3, 8}, {64, 1, 0, 6, 0, 8, 0, 9}, {4, 1, 8}, {32, 8,
10}, 
 {24, 9, 10, 9}, {65, 1, 0, 6, 0, 5, 0, 9}, {4, 3, 7}, {64, 2, 0, 7, 0,
9}, 
 {32, 9, 7}, {24, 4, 7, 9}, {9, 9, 4}, {4, 1, 9}, {64, 2, 0, 9, 0, 7}, 
 {9, 7, 6}, {64, 1, 0, 6, 1, 3}, {12, 3, 2}, {4, 3, 7}, {64, 2, 0, 7, 0,
9}, 
 {24, 4, 9, 7}, {9, 7, 4}, {42, 14}, {4, 4, 1}, {64, 2, 0, 1, 0, 7}, 
 {24, 4, 7, 1}, {9, 1, 4}, {4, 2, 1}, {64, 2, 0, 1, 0, 7}, {9, 7, 6}, 
 {64, 1, 0, 6, 1, 3}, {12, 3, 2}, {4, 3, 7}, {64, 2, 0, 7, 0, 1}, 
 {24, 4, 1, 7}, {9, 7, 4}, {42, -48}, {4, 4, 7}, {64, 1, 0, 6, 0, 7, 0,
1}, 
 {9, 1, 7}, {4, 4, 9}, {4, 4, 5}, {64, 1, 0, 6, 0, 5, 0, 10}, {4, 1, 5}, 
 {32, 5, 8}, {24, 10, 8, 10}, {65, 1, 0, 6, 0, 9, 0, 10}, {2}}

All tuples of numbers, so it compiled clean.

So how do we make the trees? I did this in steps. First is a procedure
that will return a (non-tensor) triple of the form

{total, position_of_center_4_tuple, {4_tuples}}

This is convenient because it can be called recursively on left and
right sublists of the input, with those put together along with a center
node. To avoid degenerate cases we handle lists of length 1 and 2
separately.

toTree[ll_List] /; Length[ll]==1 := {ll[[1,2]], ll[[1,1]],
	{{0,0,0,ll[[1,2]]}}}

toTree[ll_List] /; Length[ll]==2 := {ll[[1,2]]+ll[[2,2]], ll[[2,1]],
	{{0,0,0,ll[[1,2]]}, {ll[[1,1]],0,ll[[1,2]],ll[[2,2]]}}}

toTree[ll_List] := Module[
	{mid=Ceiling[(Length[ll]+1)/2], left, right, node},
	left = toTree[Take[ll,mid-1]];
	right = toTree[Drop[ll,mid]];
	node = {left[[2]], right[[2]], left[[1]], ll[[mid,2]]};
	{left[[1]]+right[[1]]+ll[[mid,2]], ll[[mid,1]],
	  Join[left[[3]], {node}, right[[3]]]}
	]

To put this into the tensor we need we augment with the last 4-tuple as
described above.

treetensor[ll_] :=
With[{tree=toTree[Transpose[{Range[Length[ll]],ll}]]},
	Developer`ToPackedArray[Join[tree[[3]], {{tree[[1]],tree[[2]],0,0}}]]]

In[181]:= InputForm[forest = treetensor[mylist]]
Out[181]//InputForm= 
{{0, 0, 0, 5}, {1, 3, 5, 4}, {0, 0, 0, 7}, {2, 6, 16, 6}, {0, 0, 0, 5}, 
 {5, 0, 5, 4}, {4, 10, 31, 7}, {0, 0, 0, 8}, {8, 0, 8, 9}, {9, 12, 17,
4}, 
 {0, 0, 0, 5}, {11, 0, 5, 6}, {70, 7, 0, 0}}

In[182]:= removeonetootree[forest]//InputForm
Out[182]//InputForm= 
{{0, 0, 0, 5}, {1, 3, 5, 4}, {0, 0, 0, 7}, {2, 6, 16, 6}, {0, 0, 0, 5}, 
 {5, 0, 5, 4}, {4, 10, 31, 7}, {0, 0, 0, 8}, {8, 0, 8, 9}, {9, 12, 17,
4}, 
 {0, 0, 0, 5}, {11, 0, 5, 5}, {69, 7, 66, 0}}

In[183]:= Timing[Do[removeonetootree[forest], {10000}];]
Out[183]= {0.98 Second, Null}

We see that this is competitive with the other methods despite the
relatively intricate tree manipulation. Now let's try some examples with
10^5 bins.

mylistverybig = Table[Random[Integer,{1,1000}], {10^5}];
newlistverybig = Prepend[mylistverybig, Apply[Plus,mylistverybig]];
amazonrainforest = treetensor[mylistverybig];

Note that this offline computation of the tree tensor is itself slow,
and of course is a full iteration over the list. So this strategy is not
useful unless there will be alot of iterations. Also the tree building
code relies on Mathematica recursion which can be slow. So for alot of
bins it would be best to rewrite that using iterative code.

In[219]:= Timing[Do[removeonetoo[newlistverybig], {100}];]
Out[219]= {12.45 Second, Null}

In[221]:= Timing[Do[removeonetootree[amazonrainforest], {100}];]
Out[221]= {3.05 Second, Null}


----------------------------------

Getting the absolute best speed can be a bit tricky and I'd not be
surprised if there are yet faster ways to do this sort of thing; I am
hoping I did not miss one that is entirely obvious. That said, I do
claim to have given efficient programming in Mathematica a fair amount
of thought over the past few years. Some useful ideas related to those
discussed in this post may be found at the URLs below. The first covers
both deck shuffling and a similar tree example. The second deals more
generally with effective data structure use in Mathematica.

http://library.wolfram.com/conferences/conference98/abstracts/symbolic_FAQ.html

http://library.wolfram.com/conferences/devconf99/

For this second one, scroll to "Programming" section at the bottom of
the page, and you can click on  the Mathematica notebook option under
heading

"Enhance Your Code Performance: Effective Data Structures Techniques in
Mathematica"

An HTML version may be found in the same location, or more directly at:

http://library.wolfram.com/conferences/devconf99/lichtblau/

----------------------------------


Daniel Lichtblau
Wolfram Research
Follow-Ups:
- Re: (long) Re: Speed Challenge
  - From: Daniel Lichtblau <danl@wolfram.com>
Prev by Date: Re: strange behavior
Next by Date: Palm OS Mathematica
Previous by thread: Re: J/Link MathCanvas/Graphics/Interaction
Next by thread: Re: (long) Re: Speed Challenge