MathGroup Archive: July 1999 [00015]

[Date Index] [Thread Index] [Author Index]

Split on the Macintosh

To: mathgroup at smc.vnet.net
Subject: [mg18406] Split on the Macintosh
From: Andrzej Kozlowski <andrzej at tuins.ac.jp>
Date: Wed, 7 Jul 1999 00:11:05 -0400
Sender: owner-wri-mathgroup at wolfram.com

Today at last I got my Mathematica 4.0 upgrade, much earlier than I ever 
expected. I am very grateful to the mathgroup whose very existence has
undoubtedly contributed to this, from my point of view,  remarkable and
happy event.

The very first thing I did after installing Mathematica 4 was to go over my own
personal list of bugs and problems that plagued the previous version to
check which have and which have not been fixed. I am very pleased to be
able to say that the first group is far larger (if one excludes functions
contained in various packages, which appear to have not been changed at
all. Particularly, given the improved performance of the built-in Limit
function, it seems to me that one should now generally avoid the
Calculus`Limit` package more then ever).

However, the problem that I was most curious about still persists. It was
something discovered  about a year ago by Xah Lee, Carl Woll and myself
and concerns the Split function. At that time I wanted to emulate it in
mathematica 2.2 and we produced various candidates and compared their
efficiency. Then we discovered a curious thing. Our best attempt was the
following function:

split2[li_, testQ_:SameQ] := 
  Module[{r1, r2}, 
    r1 = Flatten[Position[Apply[testQ, Partition[li, 2, 1], {1}], False]];
    r2 = Transpose[{Join[{1}, r1 + 1], Join[r1, {Length[li]}]}];
    Take[li, #] & /@ r2]

We found that on the Macintosh alone (strictly speaking we only tested
PPC Macs) this function is more efficient than the built in Split for
very large lists. Not only that, its lead over Split grows with the size
of the list. This is not true on other platforms where the built-in Split
is much faster. Moreover, careful investigation revealed that on the Mac
the built in Split function does not scale linearly but seems to have a
non-negligible quadratic component, while split2 is linear. On other
platforms, however, Split scales linearly.

This is strange. One reason is that the Kernel is supposed to work the
same way on all platforms so one would imagine that basic things as
asymptotic growth of functions would be roughly the same. Moreover, the
algorithms used certainly scale linearly: after all if we were able to
produce a linear function that one would certainly expect wri to be able
to do so.

Xah Lee reported this problem to wri's techincal support. So I was
curious to see if there would be any change in this respect in version 4. 

I have just run my tests and found that while Split has become
considerably faster in version 4, it still shows non-linear asymptotic
behaviour and is still beaten by split2 for v. large lists. Those who
would like to see these results (valid only on the Mac) can download the
notebook:
<ftp://cleo.tuins.ac.jp//pub/Splitvsplit.nb>

The most likely explanation seems to be that this might be something to
do with the memory management of the Mac. Yet it is strange that it seems
to affect only just one function, and that wri have not been able to do
anything about it. This is not necessarily a trivial matter because it is
precisely when dealing with very large lists that Split becomes very
useful. Another thing that it illustrates (in my opinion) is the
difficulties in applying purely theoretical efficiency arguments in the
case of a program like Mathematica in which built in function splay such
a big role.

Andrzej Kozlowski

Toyama International University
JAPAN
http://sigma.tuins.ac.jp/
http://eri2.tuins.ac.jp/

Prev by Date: Re: [Q] Graphics with 2 Y-axis

Next by Date: New tutorial book

Previous by thread: Re: [Q] Graphics with 2 Y-axis

Next by thread: Re: Split on the Macintosh