Re: Why is recursion so slow in Mathematica?
- To: mathgroup at smc.vnet.net
- Subject: [mg100545] Re: Why is recursion so slow in Mathematica?
- From: Szabolcs Horvát <szhorvat at gmail.com>
- Date: Sun, 7 Jun 2009 05:03:11 -0400 (EDT)
- References: <h0d6s8$spq$1@smc.vnet.net>
Daniel wrote:
> This post is about functional programming in Mathematica versus other
> functional languages such as OCaml, SML or Haskell. At least a naive
> use of functional constructs in Mathematica is horrendously slow. Am I
> doing something wrong? Or isn't Mathematica really suitable for
> functional programming beyond toy programs? Couldn't the Wolfram team
> make a more efficient implementation for recursion, as other
> functional languages has done? (Instead of the naive C-like behavior
> for recursively defined functions.)
>
> As grounds for my question/argument, I wrote my own version of select,
> as below
>
> myselect[{}, predicate_] = {}
> myselect[{head_, tail___}, predicate_] := If[predicate[head],
> Join[{head}, myselect[{tail}, predicate]],
> myselect[{tail}, predicate]
> ]
>
> Then I tried this function on a 20.000 element vector with machine
> size floats:
>
> data = Table[Random[], {20000}];
> $RecursionLimit = 100000;
> Timing[data2 = myselect[data, # > 0.5 &];]
>
> The result is {7.05644, Null}, and hundreds of MB of system memory are
> allocated. On 1.7 GHZ dual core Intel machine with 1 GB of RAM. For
> 20.000 floats! It's just a megabyte!
>
> The following OCaml program executes in apparently no-time. It is not
> compiled and does the same thing as the above Mathematica code. After
> increasing the list by a factor of ten to 200.000 elements, it still
> executes in a blink. (But with 2.000.000 elements my OCaml interpreter
> says Stack overflow.)
>
> let rec randlist n = if n=0 then [] else Random.float(1.0) :: randlist
> (n-1);;
>
> let rec myselect = function
> [],predicate -> []
> | x::xs,predicate -> if predicate(x) then x::myselect(xs,predicate)
> else myselect(xs,predicate);;
>
> let mypred x = x>0.5;;
>
> let l=randlist(20000);;
> let l2=myselect(l,mypred);; (* lightning-fast compared to Mathematica
> *)
>
So you discovered that appending to arrays is slow while appending to
linked lists is fast :)
Actually you are not comparing the exact same algorithms. The data
structure you used in those languages was a linked list. Removing or
appending elements takes a constant time for linked lists.
Mathematica's List is a vector-type data structure, so
removing/appending elements takes a time proportional to the vector's
length.
One data structure is not better than the other, of course, they're just
useful for different purposes. For example, a linked list is unsuitable
for applications where the elements need to be accessed randomly instead
of sequentially. Random access is often needed for numerical
computations/simulations.
For a fair comparison, use a linked list in Mathematica too. This could
look like {1,{2,{3,{}}}}
myselect[{}, test_] = {};
myselect[{head_, tail_}, test_] :=
If[test[head], {head, myselect[tail, test]}, myselect[tail, test]]
toLinkedList[list_List] := Fold[{#2, #1} &, {}, list]
data = toLinkedList@Table[Random[], {20000}];
Block[{$RecursionLimit = \[Infinity]},
Timing[myselect[data, # > 0.5 &];]]
This runs in 0.2 sec on my (not very fast) system.
And you can check that the function works correctly:
Block[{$RecursionLimit = \[Infinity]},
Flatten@myselect[data, # > 0.5 &] ===
Select[Flatten[data], # > 0.5 &]]
About the thing that you call "naive C-like behavior": you probably mean
that Mathematica doesn't automatically perform the tail-call
optimization. That is correct, you need to transform your recursions
explicitly.
Here's an example:
myselect2[dest_, {head_, tail_}, test_] :=
If[test[head],
myselect2[{head, dest}, tail, test],
myselect2[dest, tail, test]]
myselect2[dest_, {}, test_] := dest
Note that it is $IterationLimit that we need to increase now and this
version doesn't fill up the evaluation stack.
Also note that this produces the result in the reverse order:
Block[{$IterationLimit = \[Infinity]},
Reverse@Flatten@myselect2[{}, data, # > 0.5 &] ===
Select[Flatten[data], # > 0.5 &]]
We could produce the result in the same order, but reverse linking (like
{{{{},1},2},3}).
I couldn't do any better than this using a singly-linked list. It looks
like OCaml (which I don't know BTW) uses a singly linked list too, and
can't do the tail-call optimization, so its stack gets filled up.
One more thing: even when using the same data structure and same
algorithm, it is of course expected that a very high level interpreted
language like Mathematica is not going to perform nearly as well as a
low level compiled language like OCaml. In high level languages the
usual solution to this performance problem is built-in functions: don't
implement Select in Mathematica itself. Use the built-in one, which is
written in C and much faster than a Mathematica implementation could ever be.
Hope this helps,
Szabolcs
- Follow-Ups:
- Re: Why is recursion so slow in Mathematica?
- From: Daniel <janzon@gmail.com>
- Re: Re: Why is recursion so slow in Mathematica?
- From: DrMajorBob <btreat1@austin.rr.com>
- Re: Why is recursion so slow in Mathematica?