Split on the Macintosh
- To: mathgroup at smc.vnet.net
- Subject: [mg18406] Split on the Macintosh
- From: Andrzej Kozlowski <andrzej at tuins.ac.jp>
- Date: Wed, 7 Jul 1999 00:11:05 -0400
- Sender: owner-wri-mathgroup at wolfram.com
Today at last I got my Mathematica 4.0 upgrade, much earlier than I ever expected. I am very grateful to the mathgroup whose very existence has undoubtedly contributed to this, from my point of view, remarkable and happy event. The very first thing I did after installing Mathematica 4 was to go over my own personal list of bugs and problems that plagued the previous version to check which have and which have not been fixed. I am very pleased to be able to say that the first group is far larger (if one excludes functions contained in various packages, which appear to have not been changed at all. Particularly, given the improved performance of the built-in Limit function, it seems to me that one should now generally avoid the Calculus`Limit` package more then ever). However, the problem that I was most curious about still persists. It was something discovered about a year ago by Xah Lee, Carl Woll and myself and concerns the Split function. At that time I wanted to emulate it in mathematica 2.2 and we produced various candidates and compared their efficiency. Then we discovered a curious thing. Our best attempt was the following function: split2[li_, testQ_:SameQ] := Module[{r1, r2}, r1 = Flatten[Position[Apply[testQ, Partition[li, 2, 1], {1}], False]]; r2 = Transpose[{Join[{1}, r1 + 1], Join[r1, {Length[li]}]}]; Take[li, #] & /@ r2] We found that on the Macintosh alone (strictly speaking we only tested PPC Macs) this function is more efficient than the built in Split for very large lists. Not only that, its lead over Split grows with the size of the list. This is not true on other platforms where the built-in Split is much faster. Moreover, careful investigation revealed that on the Mac the built in Split function does not scale linearly but seems to have a non-negligible quadratic component, while split2 is linear. On other platforms, however, Split scales linearly. This is strange. One reason is that the Kernel is supposed to work the same way on all platforms so one would imagine that basic things as asymptotic growth of functions would be roughly the same. Moreover, the algorithms used certainly scale linearly: after all if we were able to produce a linear function that one would certainly expect wri to be able to do so. Xah Lee reported this problem to wri's techincal support. So I was curious to see if there would be any change in this respect in version 4. I have just run my tests and found that while Split has become considerably faster in version 4, it still shows non-linear asymptotic behaviour and is still beaten by split2 for v. large lists. Those who would like to see these results (valid only on the Mac) can download the notebook: <ftp://cleo.tuins.ac.jp//pub/Splitvsplit.nb> The most likely explanation seems to be that this might be something to do with the memory management of the Mac. Yet it is strange that it seems to affect only just one function, and that wri have not been able to do anything about it. This is not necessarily a trivial matter because it is precisely when dealing with very large lists that Split becomes very useful. Another thing that it illustrates (in my opinion) is the difficulties in applying purely theoretical efficiency arguments in the case of a program like Mathematica in which built in function splay such a big role. Andrzej Kozlowski Toyama International University JAPAN http://sigma.tuins.ac.jp/ http://eri2.tuins.ac.jp/