RE: List processing

*To*: mathgroup at smc.vnet.net*Subject*: [mg37241] RE: [mg37196] List processing*From*: "Wolf, Hartmut" <Hartmut.Wolf at t-systems.com>*Date*: Fri, 18 Oct 2002 05:17:01 -0400 (EDT)*Sender*: owner-wri-mathgroup at wolfram.com

>-----Original Message----- >From: John Leary [mailto:leary at paradise.net.nz] To: mathgroup at smc.vnet.net >Sent: Wednesday, October 16, 2002 8:26 PM >To: mathgroup at smc.vnet.net >Subject: [mg37241] [mg37196] List processing > > >Greetings > >This problem can be solved by conventional programming, but I >wonder if >there is an elegant Mathematica solution ? > >A list contains pairs of values, with each pair representing >the lower and >upper edge of a sub-range. Some of the sub-ranges partially >overlap, some >fully overlap, others don't overlap at all. The problem is to >produce a >second list that contains the overall upper and lower edges of the >overlapping sub-ranges. > >A simple example : {{100,200},{150,250},{120,270},{300,400}} >would result >in {{100,270},{300,400}}. > >In the real case, the input list has several hundred elements and the >output list typically has five elements. > >I have a working solution based on loops, but there must be a >more elegant >one. I would be very grateful for any suggestions. > > >Best regards > >John Leary > > John, several proposals (without any attempt to moduralize): (1) use IntervalUnion: List @@ IntervalUnion @@ Interval /@ list (2) use Split (it's a little bit tricky to be correct): high = Sequence[]; {#[[1, 1]], Max[#[[All, -1]]]} & /@ Split[Sort[list], (high = Max[high,Last[#1]]) >= First[#2] || (high = Last[#1])&] (3) procedural programming: maxExtends[list_] := (sl = Sort[list]; length = Length[sl]; r = collect[]; i = 1; While[i <= length, {low, high} = sl[[i]]; If[++i <= length, {curlow, curhigh} = sl[[i]]; While[high >= curlow && (high = Max[high, curhigh]; ++i <= length), {curlow, curhigh} = sl[[i]] ]]; r = collect[r, {low, high}] ]; List @@ Flatten[r]) Let's do some benchmarks: 10,000 Intervals: list = {# - Random[], # + Random[]} & /@ NestList[# + Random[] &, 0, 10000]; List @@ IntervalUnion @@ Interval /@ list // Length // Timing {2.503 Second, 1181} high = Sequence[]; {#[[1, 1]], Max[#[[All, -1]]]} & /@ Split[Sort[ list], (high = Max[high, Last[#1]]) >= First[#2] || (high = Last[#1]) &] // Length // Timing {2.934 Second, 1181} maxExtends[list] // Length // Timing {3.926 Second, 1181} The corresponding results for 100,000 Intervals: {27.329 Second, 11266} {30.234 Second, 11266} {35.791 Second, 11266} and for 500,000 Intervals {144.058 Second, 56728} {154.782 Second, 56728} {181.111 Second, 56728} To look at scaling behaviour I just collected the prior results IntervalUnion: {%355, %345 , %350}[[All, 1, 1]] {2.503, 27.329, 144.058} % // {#[[2]]/(10*#[[1]]), #[[3]]/(5*#[[2]])} & {1.09185, 1.05425} Split: {%357, %347, %352}[[All, 1, 1]] {2.934, 30.234, 154.782} % // {#[[2]]/(10*#[[1]]), #[[3]]/(5*#[[2]])} & {1.03047, 1.02389} Procedural: {%358, %348, %353}[[All, 1, 1]] {3.926, 35.791, 181.111} % // {#[[2]]/(10*#[[1]]), #[[3]]/(5*#[[2]])} & {0.91164, 1.01205} Due to Sort, the Split and the Procedural versions should behave as O[n log n], I'm not shure whether IntervalUnion does (seems to be a little bit more progressive at costs). -- Hartmut Wolf