MathGroup Archive 2003

Re: split a list

  • To: mathgroup at smc.vnet.net
  • Subject: [mg40596] Re: split a list
  • From: Dr Bob <majort at cox-internet.com>
  • Date: Thu, 10 Apr 2003 03:44:47 -0400 (EDT)
  • References: <r01050400-1024-B84930966B0A11D7ABA800039380220E@[192.168.1.100]> <oprnedxauekcjuc2@smtp.cox-internet.com>
  • Reply-to: majort at cox-internet.com
  • Sender: owner-wri-mathgroup at wolfram.com

OK, first of all, I found a built-in solution:

<< Statistics`DataManipulation`
treat5[test_List, m_?NumericQ] := RangeLists[test, {m}]
treat5[test_List, m_List] := RangeLists[test, m]

Unfortunately, it isn't fast.  It's five times faster than brambilla, but 
also five times slower than kuska or treat3.

Secondly, I generalized my solution to lists of breakpoints (cutoffs).

testList[n_Integer?Positive] := Array[Random[] &, n]
trial[n_Integer?Positive] := {testList@n, Random[]}
trial[n_Integer, 1] := trial[n]
trial[n_Integer, k_Integer] := {testList@n, Sort@testList@k}

<< Experimental`
treat4[test_List, m_List] :=
  Flatten /@
    Reverse@Reap[
        Scan[Function[{x}, Sow[x, Count[m, _?(x <= # &)]]], test],
        Range[0, Length@m]]
treat4[test_List, m_] :=
  Flatten /@ Reap[Scan[Sow[#, If[# <= m, 1, 2]] &, test], {1, 2}]

test = trial[10, 5];
treat4 @@ test == RangeLists @@ test == treat5 @@ test

True
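
For readers on a current Wolfram Language version, here is a minimal sketch of the same tagging idea on a tiny input. It rests on two assumptions that are not part of the original post: that Sow and Reap are built in (so no Experimental` load is needed), and that Reap returns {value, {collections}}, which is why the sketch takes Last. The name splitByCutoffs is hypothetical. Each element is tagged with the number of cutoffs that are greater than or equal to it, so the largest tag holds the smallest values and Reverse puts the buckets in ascending order.

```mathematica
(* Sketch for current versions: Reap returns {value, {collections}}, hence Last. *)
(* splitByCutoffs is a hypothetical name, not from the original post. *)
splitByCutoffs[data_List, cutoffs_List] :=
  Flatten /@ Reverse @ Last @ Reap[
     (* tag each x with the number of cutoffs c such that x <= c *)
     Scan[Function[{x}, Sow[x, Count[cutoffs, _?(x <= # &)]]], data],
     (* one bucket per possible tag, so empty buckets come back as {} *)
     Range[0, Length[cutoffs]]]

splitByCutoffs[{0.1, 0.9, 0.5, 0.3, 0.7}, {0.4, 0.6}]
(* {{0.1, 0.3}, {0.5}, {0.9, 0.7}} *)
```

Because the tag is a count rather than a position, the cutoffs need not be sorted.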

test = trial[100000, 5];
Timing[treat4 @@ test;]
Timing[RangeLists @@ test;]

{3.780999999999949*Second,   Null}
{7.4529999999999745*Second,  Null}

test = trial[100000, 25];
Timing[treat4 @@ test;]
Timing[RangeLists @@ test;]

{13.233999999999924*Second,   Null}
{12.186999999999898*Second,   Null}

test = trial[100000, 50];
Timing[treat4 @@ test;]
Timing[RangeLists @@ test;]

{25.077999999999975*Second,   Null}
{14.297000000000025*Second,   Null}

test = trial[100000, 100];
Timing[treat4 @@ test;]
Timing[RangeLists @@ test;]

{50.51499999999987*Second,   Null}
{17.686999999999898*Second,  Null}

treat4 wins with 5 cutoffs, the two are about even at 25, and RangeLists 
wins at 50 and 100, where treat4's time roughly doubles each time the number 
of cutoffs doubles.
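
The scaling can be read off the timings directly. The sketch below only re-uses the numbers reported above, rounded to three decimals (the rounding and the variable names are mine). The likely explanation, offered as a reading of the code rather than a measured fact, is that treat4 runs Count over the whole cutoff list for every element, so its work grows with the product of the list length and the number of cutoffs.

```mathematica
(* Timings in seconds from the runs above (100000 elements each), rounded. *)
cutoffCounts    = {5, 25, 50, 100};
treat4Times     = {3.781, 13.234, 25.078, 50.515};
rangeListsTimes = {7.453, 12.187, 14.297, 17.687};

(* Ratio treat4/RangeLists: below 1 means treat4 is the faster one. *)
ratio = treat4Times/rangeListsTimes;

(* Growth factor of a method between successive runs. *)
growth[times_List] := Rest[times]/Most[times];

Transpose[{cutoffCounts, ratio}]
growth[treat4Times]       (* last two factors are close to 2 *)
growth[rangeListsTimes]   (* grows much more slowly *)
```

The ratio crosses 1 near 25 cutoffs, which is where the two methods trade places.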

Bobby

On Thu, 10 Apr 2003 00:09:00 -0500, Dr Bob <majort at cox-internet.com> wrote:

> Bill,
>
> Your solution (the same as Jens-Peer Kuska's) IS simpler and clearer 
> (though clarity is in the eye of the beholder!) -- but my timing 
> comparisons disagree with yours.  Perhaps my environment is different 
> (WinXP, 1024 MB RAM, 2.2 GHz P4, Mathematica 4.2.1).  I have made a couple 
> of improvements to my function, too:
>
> treat3[test_List, m_] :=
> Flatten /@ Reap[Scan[Sow[#, If[# <= m, 1, 2]] &, test], {1, 2}]
>
> I don't have a big speed advantage over Kuska, but I think my method will 
> scale to more breakpoints better, and I'm working on a solution for that 
> case.
>
> Perhaps you'd like to scale your method to more breakpoints?
>
> The function's pattern would look like breakList[test_List, m_List], 
> where m is unsorted.
>
> (The method I have in mind doesn't benefit from having m presorted.)
>
> Bobby
>
> On Wed,  9 Apr 2003 21:13:01 -0700, Bill Rowe <listuser at earthlink.net> 
> wrote:
>
>> On 4/9/03 at 8:27 PM, majort at cox-internet.com (Dr Bob) wrote:
>>
>>> This is a beautiful application for Sow and Reap.
>>>
>>> << Experimental`
>>> testList[n_Integer] := Array[Random[] &, n]
>>> trial[n_Integer] := {testList@n, Random[]}
>>>
>>> breakList[test_List, break_] := Reap[If[# < break, Sow[#, 1], Sow[#,
>>> 2]] & /@ test]
>>>
>>> trial@30
>>> breakList @@ %
>>>
>>> That version may not be ideal, because if all elements are on the same
>>> side of the break point, you get only one list back -- not two, with
>>> one of them empty.  If that's a concern (but with thousands of
>>> elements it may never come up), this function fixes it:
>>>
>>> breakList2[test_List, break_] := Rest /@ Reap[Sow[0, #] & /@ {1,2};
>>> If[# < break, Sow[#, 1], Sow[#, 2]] & /@ test]
>>>
>>> Here's another solution to that problem:
>>>
>>> breakList3[test_List, break_] := Flatten /@ Reap[If[# < break, Sow[#,
>>> 1], Sow[#, 2]] & /@ test, {1, 2}]
>>
>> Interesting but the simple function
>>
>> sp[x_List,m_]:={Select[x,#<m&], Select[x, #>m&]}
>>
>> seems to be both more efficient and clearer as to intent
>>
>> data = Table[Random[],{1000000}];
>>
>> Timing[sp[data,.3];]
>> {1.41 Second,Null}
>>
>> Timing[breakList2[data,.3];]
>> {1.96 Second,Null}
>>
>> Timing[breakList3[data,.3];]
>> {1.71 Second,Null}
>>



-- 
majort at cox-internet.com
Bobby R. Treat


