Functional solution to union of intervals
- To: Mathematica user's group <MathGroup at yoda.physics.unc.edu>
- Subject: Functional solution to union of intervals
- From: Robby Villegas <Villegas at knox.bitnet>
- Date: Thursday, June 4 1992
Hello, About the pattern-matching solution to the union-of-intervals problem: a beautiful one-liner! Yet again, I'm highly impressed with the clever applications patterns can be put to in Mathematica. It's always fun to do things in a slew of different ways in Mathematica, though (as somebody remarked in the past, this is one of the characteristics of the language that makes it interesting), so here's an alternative method which, albeit not as beautiful, is still a very Mathematica way of doing things, at least in some aspects: unite[{maximals___, {a_, b_}}, {x_, y_}] := {maximals, {a, Max[b, y]}} /; N[a <= x <= b] unite[{maximals___, {a_, b_}}, {x_, y_}] := {maximals, {a, b}, {x, y}} unionComponents = Fold[unite, { First[sortedList] }, Rest[sortedList] ]; Preliminaries: With 'intervals' containing the original, unsorted list of intervals, 'sortedList' has been assigned 'Sort[intervals]'. Or if you want to allow the endpoints of an interval to be given as {right, left}, as someone suggested in another posting, then sortedList = Map[Sort, intervals, {0, 1}] will sort both the individual intervals and the list of them (and in the right order, since Map with a level-specification hits deeper levels before shallower ones. We want sorting by left endpoint for this 'unite' function). The scheme of the "program" is to build up a list of maximal intervals by moving from right to left through the sorted list of ordered pairs, coalescing intervals that have any overlap. At each step, it will compare the current interval to the last interval in the list it is accumulating, and if they overlap it will unite them and replace the last interval in the accumulating list with this union. If they don't overlap, then the current interval is guaranteed to be disjoint from it and all the previous ones, so the current one is pushed on to the end of the accumulating list. I can't do timing comparisons right now, but someone mentioned that doing a single scan through the list would make the algorithm faster, and this proceeds from beginning to end on the sorted list of intervals, so it might be reasonably quick. I intend to try, when I get a chance, a single definition for 'unite' using a conditional like 'If[ N[a <= x <= b], ...]', and/or using a pure function (which seemed at first glance to require more complicated references to sub-parts) like 'Function[{accumList, next}, <formula>]', but I suspect they will not be faster, and perhaps slower. Others can attack the efficiency of the codes, meanwhile. I tested this on a few lists generated by commands like intervals = Table[ Random[Integer, {-100, 100}] + RandomVector[2, Integer, {-4, 4}], {100} ]; which created sets of intervals with a decent number of chains of overlapping intervals, and with results suitable for the old visual union operation so I could verify easily. So far, it seems to work. By the way, the function RandomVector[n, type, range] simply calls Table[Random[type, range], {n}], so it gives an n-vector of random numbers. Just another application to reaffirm my love for Fold, the trusty slick way to avoid loops and procedures! Not to mention FoldList, great for debugging the first few, faulty functions you choose to Fold... :-) Robby Villegas Knox College (Villegas at Knox.Bitnet)