MathGroup Archive 1998

Re: Re: Split in Mathematica 2.2

  • To: mathgroup at smc.vnet.net
  • Subject: [mg12885] Re: [mg12822] Re: [mg12777] Split in Mathematica 2.2
  • From: Andrzej Kozlowski <andrzej at tuins.ac.jp>
  • Date: Wed, 24 Jun 1998 03:44:11 -0400
  • Sender: owner-wri-mathgroup at wolfram.com

At 1:26 AM -0700 6/13/98, Xah Lee wrote:
>Here are some more constructs that emulate Split. Note that the argument to
>Split need not have head List.
>
>(*best. Fastest too. Modified from Andrzej Kozlowski's code.*)
>Clear[split];
>split[li_,fQ_:SameQ]:=
>  Module[{h=(Head@li)},
>    h@@((Take[li,#1]&)/@(
>              Transpose[{Flatten[{1,#1+1}],Flatten[{#1,Length@li}]}]&)@(
>                Position[#,h[a_,b_]/;Not@fQ[a,b],{1}]&)@Partition[li,2,1])];
>
>(*same idea, but written procedurally.*)
>Clear[split2]
>split2[li_,fQ_:SameQ]:=Module[{liLength,index,i},liLength=Length@li;
>index={};For[i=1,i<liLength,i++,
>      If[Not@(fQ@@Part[li,{i,i+1}]),index={index,i}]];
>(Head@li)@@((
>            Take[li,#]&)/@(
>              Transpose[{Flatten[{1,#+1}],Flatten[{#,liLength}]}]&)@index)];
>
>(*using the idea of recursive pattern matching. Slow and memory intensive,
>as expected. Can anyone come up with a faster version with this approach? *)
>Clear[split3];
>split3[li_,fQ_:SameQ]:=Module[{h=Head@li,g,hTemp},
>g[frest___,a_,b_,rest___]/;Not@fQ[a,b]:=hTemp[g[frest,a],g[b,rest]];
>		g[a__]:=h[a];
>		Flatten[h[g@@li],Infinity,hTemp]];
>
>(* another potential variety is Andrzej Kozlowski's original code using
>Fold. I haven't studied it. It may be the most elegant.*)
>
>----------------
> testing
>
>Clear[li];
>li=h@Table[h@Random[Integer,{1,3}],{300}];
>
>In[251]:=
>result=(Timing@#[li]&)/@{Split,split,split2,split3};
>
>In[252]:=
>First@Transpose@result
>SameQ@@(Last@Transpose@result)
>
>Out[252]=
>{0.0166667 Second,0.116667 Second,0.266667 Second,0.516667 Second}
>
>Out[253]=
>True
>
> Xah, xah at best.com
> http://www.best.com/~xah/SpecialPlaneCurves_dir/specialPlaneCurves.html
> "Tumor growth and variations: Unix C C++ Java sed awk sh csh Perl"

My own tests show that Carl Woll's version:

split4[li_,testQ_:SameQ] := Module[{r1,r2},
    r1=Flatten[Position[Apply[testQ,Partition[li,2,1],{1}],False]];
    r2=Transpose[{Join[{1},r1+1],Join[r1,{Length[li]}]}];
    Take[li,#]&/@r2]

is not only the fastest produced in this thread, but also quite clearly
beats the built-in Split in Mathematica 3.0, at least on my G3 Mac. In
fact, even the best version in Xah Lee's message usually beats the
built-in Split for large lists. This seems to me a pretty remarkable
achievement. We are always told (in books on Mathematica programming)
that we should use built-in functions because they are the fastest. That
doesn't seem to be true in this case! Or is this only true on a G3 Mac?
Has anyone heard of any other such examples (built-in functions being
slower than user-programmed ones)?
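
For readers who want to see why split4 works, here is a small
walk-through of its three steps on a short list. This is only a sketch:
the names li0 and pairs are introduced here for illustration and are
not part of the thread, and the outputs in comments assume the default
test SameQ.

```mathematica
(* Walk-through of split4 on a small list; li0 and pairs are illustrative names. *)
li0 = {1, 1, 2, 2, 2, 3};

(* Apply the test to each adjacent pair; False marks a run boundary. *)
pairs = Apply[SameQ, Partition[li0, 2, 1], {1}]
(* {True, False, True, True, False} *)

(* Index of the last element of every run except the final one. *)
r1 = Flatten[Position[pairs, False]]
(* {2, 5} *)

(* Start and end index of every run. *)
r2 = Transpose[{Join[{1}, r1 + 1], Join[r1, {Length[li0]}]}]
(* {{1, 2}, {3, 5}, {6, 6}} *)

(* Cut the list at those index pairs. *)
Take[li0, #] & /@ r2
(* {{1, 1}, {2, 2, 2}, {3}} *)
```

All the per-element work is done by Partition, Apply and Position on
whole lists; the only remaining mapping is one Take per run.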

Here are my results:

w = Table[Random[Integer], {100000}];

split4[w];//Timing
{5.41667 Second,Null}
split[w];//Timing
{6.31667 Second,Null}
Split[w];//Timing
{7. Second,Null}

(Split is the built-in function.)
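
Anyone wishing to repeat the comparison on another machine could use
something like the following. It is a rough sketch, not a careful
benchmark: timeIt and its reps argument are hypothetical helpers that do
not appear in the thread, and the replacement Second -> 1 is only there
so that the unit attached to timings in Mathematica 3.0 does not get in
the way of Min. It also checks that both functions return the same
result.

```mathematica
(* Carl Woll's definition, repeated so the block is self-contained. *)
split4[li_, testQ_: SameQ] := Module[{r1, r2},
    r1 = Flatten[Position[Apply[testQ, Partition[li, 2, 1], {1}], False]];
    r2 = Transpose[{Join[{1}, r1 + 1], Join[r1, {Length[li]}]}];
    Take[li, #] & /@ r2]

(* Hypothetical helper: best of reps timings of f[data], as a plain number. *)
timeIt[f_, data_, reps_: 3] :=
    Min[Table[First[Timing[f[data];]] /. Second -> 1, {reps}]]

w = Table[Random[Integer], {100000}];

(* Timings for the user-defined and the built-in function. *)
{timeIt[split4, w], timeIt[Split, w]}

(* Both should give the same runs. *)
split4[w] === Split[w]
```

Taking the minimum of several runs reduces the effect of other activity
on the machine; the actual numbers will of course depend on the version
and hardware.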
Andrzej Kozlowski


