MathGroup Archive: October 1995 [00118]

[Date Index] [Thread Index] [Author Index]
Re: Puzzle
To: mathgroup at smc.vnet.net
Subject: [mg2290] Re: Puzzle
From: wagner at goober.cs.colorado.edu (Dave Wagner)
Date: Mon, 23 Oct 1995 12:40:31 -0400
Organization: University of Colorado, Boulder
In article <DGMpFD.ICL at wri.com>, Will Self <wself at viking.emcmt.edu> wrote:
>Here's a nice puzzle.  Find the four numbers in the following sequence
>which total 100:  {6, 44, 30, 15, 24, 12, 33, 23, 18}.  It would seem that
>there might be a nice way to do this with Mathematica.  Any takers?
>
>Will Self
>Billings, Montana

Right after my last posting I realized that the code could be optimized
in a number of ways.  Here's how:

First, here's the initial version of the code.

In[68]:=
    Clear[sumto]
    sumto[seq_, k_, n_] := sumto[seq, k, n] =
    Module[{i, tmp},
	    Do[ tmp = sumto[Drop[seq, {i}], k-1, n-seq[[i]]];
		    If[tmp =!= Null, Return[Append[tmp, seq[[i]]]]],
		    {i, Length[seq]}
	    ]
    ]
    sumto[_, 0, 0] := {}
    sumto[___] := Null

This is Will's data.

In[72]:=
    nums = {6, 44, 30, 15, 24, 12, 33, 23, 18};

It takes over 3 seconds.

In[73]:=
    sumto[nums, 4, 100] // Timing
Out[73]=
    {3.23333 Second, {18, 23, 15, 44}}

The first observation is that the above code allows calls to sumto
in which the third argument is negative.  As soon as that happens,
failure is assured (since all of the integers are positive).  Therefore,
if we assume that all of the integers in the list will be positive,
we can add the rule sumto[_, _, _?Negative] := Null to speed things up:

In[87]:=
    Clear[sumto]
    sumto[seq_, k_, n_] := sumto[seq, k, n] =
    Module[{i, tmp},
	    Do[ tmp = sumto[Drop[seq, {i}], k-1, n-seq[[i]]];
		    If[tmp =!= Null, Return[Append[tmp, seq[[i]]]]],
		    {i, Length[seq]}
	    ]
    ]
    sumto[_, 0, 0] := {}
    sumto[_, _, _?Negative] := Null
    sumto[___] := Null

That cut the running time in half.

In[92]:=
    sumto[nums, 4, 100] // Timing
Out[92]=
    {1.45 Second, {18, 23, 15, 44}}

The next optimization is to note that as soon as sumto[_, 0, x] is
called where x is non-zero, failure has occurred.  This case
can be trapped using sumto[_, 0, _] := Null (no check for the
third argument being non-zero is necessary because there is
a special case for sumto[_, 0, 0] already):

In[93]:=
    Clear[sumto]
    sumto[seq_, k_, n_] := sumto[seq, k, n] =
    Module[{i, tmp},
	    Do[ tmp = sumto[Drop[seq, {i}], k-1, n-seq[[i]]];
		    If[tmp =!= Null, Return[Append[tmp, seq[[i]]]]],
		    {i, Length[seq]}
	    ]
    ]
    sumto[_, 0, 0] := {}
    sumto[_, 0, _] := Null
    sumto[_, _, _?Negative] := Null
    sumto[___] := Null

This makes a huge difference in running time:

In[99]:=
    sumto[nums, 4, 100] // Timing
Out[99]=
    {0.583333 Second, {18, 23, 15, 44}}

The final optimization is a very subtle one.  In the recursive call to
sumto, rather than eliminating the current element from the list using
Drop[seq, {i}], we can eliminate all elements up to and including the
i'th one using Drop[seq, i].  The reason that this is correct is that,
if any of the elements 1, 2, ... i-1 were part of the solution, then
that solution would already have been found.  Since no solution has
been found yet, then there is no need to consider those elements
anymore.

As a concrete example, consider sumto[{1,2,3}, 2, 5].  In the original
algorithm, this results in a call to sumto[{1,3}, 1, 3] (when the 2 is
eliminated).  But we already know that 1 is not part of the solution,
hence we can instead call sumto[{3}, 1, 3].

Here's the result:

In[100]:=
    Clear[sumto]
    sumto[seq_, k_, n_] := sumto[seq, k, n] =
    Module[{i, tmp},
	    Do[ tmp = sumto[Drop[seq, i], k-1, n-seq[[i]]];
		    If[tmp =!= Null, Return[Append[tmp, seq[[i]]]]],
		    {i, Length[seq]}
	    ]
    ]
    sumto[_, 0, 0] := {}
    sumto[_, 0, _] := Null
    sumto[_, _, _?Negative] := Null
    sumto[___] := Null

In[106]:=
    sumto[nums, 4, 100] // Timing
Out[106]=
    {0.383333 Second, {18, 23, 15, 44}}

I believe that this final modification makes result-caching unnecessary,
as I now obtain slightly faster times without it:

In[240]:=
    Clear[sumto];
    sumto[seq_, k_, n_] :=
    Module[{i, tmp},
	    Do[ tmp = sumto[Drop[seq, i], k-1, n-seq[[i]]];
		    If[tmp =!= Null, Return[Append[tmp, seq[[i]]]]],
		    {i, Length[seq]}
	    ]
    ];
    sumto[_, 0, 0] := {};
    sumto[_, 0, _] := Null;
    sumto[_, _, _?Negative] := Null;
    sumto[___] := Null;

In[246]:=
    sumto[nums, 4, 100] // Timing
Out[246]=
    {0.316667 Second, {18, 23, 15, 44}}

I'd like to say something precise about which of these changes has the
biggest effect by testing it on a variety of problems of different
sizes, but it's difficult to come up with a good benchmark for this
problem.

Being too lazy to do a theoretical analysis of the algorithm, I was
hoping someone else already had.  What I found in a brief literature
search was the following:

First, this problem is clearly a constrained version of the integer
knapsack problem, which can be stated thusly (Aho, Hopcroft & Ullman,
"The Design and Analysis of Computer Algorithms", Addison-Wesley, 1974,
page 401):

	"Given a sequence of integers S = i1, i2, ... in, and
	an integer k, is there a subsequence of S that sums
	exactly to k?"

A,H&U state that this problem is NP-complete, which means that no
polynomial-time solution is known for it.  However, there's a crucial
difference between the problem stated by Will Self and this one: in
Will's version, one is constrained to find *four* numbers that sum
to a given integer, whereas in the unconstrained version there could
be 1, 2, ..., n numbers that sum to it.  This constraint, together
with the assumption that all of the numbers are positive, reduces
the computational complexity of the problem quite significantly.

On the other hand, Cormen, Leiserson, and Rivest ("Introduction to
Algorithms", McGraw-Hill, 1990, page 335) pose a different version of
the knapsack problem in which there is a set of n items, each having a
value vi and a weight wi.  The problem is, given a knapsack with total
capacity (in weight) of W, which items should be put in the knapsack to
maximize the total value?  C,L&U state on page 336 (in an exercise)
that a dynamic programming algorithm with a time complexity of only n*W
exists for this problem.  This probelm is qualitatively different from
the others in that the eventual total may be less than W (e.g., the
knapsack may not be full), as long as the total value is maximized.

		Dave Wagner
		Principia Consulting
		(303) 786-8371
		dbwagner at princon.com
		http://www.princon.com/princon
Prev by Date: Re: TeX labels in graphs
Next by Date: ISSAC'96 First Announcement and Call for Papers
Previous by thread: Re: Puzzle
Next by thread: Mathematica 2.2.3 under Win95