Re: grouping similar list elements with gaps
- To: mathgroup at smc.vnet.net
- Subject: [mg73529] Re: grouping similar list elements with gaps
- From: "Ray Koopman" <koopman at sfu.ca>
- Date: Wed, 21 Feb 2007 01:46:08 -0500 (EST)
- References: <200702181113.GAA13453@smc.vnet.net> <erbgj9$dsg$1@smc.vnet.net>
On Feb 18, 10:37 pm, Stern <nycst... at gmail.com> wrote:
> At any given time, I care only about the data above the threshold, or
> only about the data below the threshold. There is nothing fuzzy about
> that -- if I'm studying the periods where the variable is over 5, then
> 5.00001 counts just as 500 does. What I am trying to capture in my
> original question is the situation where there is a period over 5,
> then a gap of one or two time units when it slips below 5, then a
> period above 5 again.
>
> Thanks for any advice,
>
> Michael
>
> On 2/18/07, Chris Chiasson <c... at chiasson.name> wrote:
>>
>> how "far above or below" the threshold are you willing to go?
>>
>> On 2/18/07, Stern <nycst... at gmail.com> wrote:
>>> I work with time series data of the form
>>> {{timecode1,datum1},{timecode2,datum2},...}. The timecodes can be in
>>> any of several formats, but for internal calculations I convert them
>>> to "Mathematica integer" format, which is to say, the absolute number
>>> of seconds since the beginning of January 1, 1900.
>>>
>>> My current interest involves continuous runs of dates above or below
>>> a defined threshold. This is relatively easy, using the Split and
>>> Select commands. For example,
>>>
>>> Select[Split[TIMESERIESLIST, Sign[#1[[2]] - THRESHOLD] ==
>>> Sign[#2[[2]] - THRESHOLD] &], (Min[Transpose[#][[2]]] > THRESHOLD
>>> ) &]
>>>
>>> (Thanks to Bob Hanlon, for suggesting this basic approach).
>>>
>>> I would like to generalize this to handle cases where there are small
>>> gaps in the pattern. So, for example, if I am willing to tolerate a
>>> gap of 3, then if list members 3-100 are above the threshold and list
>>> members 102-200 are above the threshold, then the entire period 3-200
>>> is marked as above, though time unit 101 would, on its own, fail.
>>>
>>> This may need to be handled recursively, as combined periods above the
>>> threshold may fall close enough together that they should be combined
>>> in turn.
>>>
>>> I have thought of some relatively inelegant ways of handling this
>>> ("preprocessing" the time series to create a dummy list in which gaps
>>> have been adjusted over the threshold), but I feel as though there
>>> ought to be a better way to handle it.
>>>
>>> Thanks in advance for any help,
>>>
>>> Michael
>>
>> --
>> http://chris.chiasson.name/

First append a "wanted" indicator to each {t,d} term,
then split as before.

  x = Split[ Append[#, Last@# > THRESHOLD]& /@ TDLIST,
             Last@#1 == Last@#2 &];

Wanted and unwanted blocks alternate. Merge the wanted blocks
that are separated by TMIN or fewer periods, including the
intervening originally-unwanted blocks.
(This is a little klutzy, but it seems to work.)

  k = 1 + Boole@x[[2,1,3]];
  While[k + 2 <= Length@x,
    If[x[[k,-1,1]] + TMIN < x[[k+2,1,1]],
       k += 2,
       x = Insert[Drop[x,{k,k+2}],Join@@Take[x,{k,k+2}],k]]];

Finally, select the wanted blocks, and strip the indicators.

  Map[Most, Select[x, #[[1,3]]&], {2}]
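To check the three steps above on a small case, here is a worked run. The sample values of TDLIST, THRESHOLD and TMIN are made up for illustration and are not from the thread. It assumes unit-spaced timecodes and that the split produces at least two blocks, since the initial value of k reads the second block. Note that the test compares timecodes, so with unit spacing TMIN = 2 bridges a single below-threshold point.

```mathematica
(* Hypothetical sample data: {timecode, datum} pairs, unit time steps *)
TDLIST = {{1, 2}, {2, 7}, {3, 8}, {4, 3}, {5, 9},
          {6, 6}, {7, 1}, {8, 1}, {9, 1}, {10, 7}};
THRESHOLD = 5;
TMIN = 2;  (* merge wanted runs whose bordering timecodes differ by <= 2 *)

(* Step 1: tag each term with True/False, then split into runs of equal tags *)
x = Split[Append[#, Last@# > THRESHOLD] & /@ TDLIST,
          Last@#1 == Last@#2 &];
(* Six blocks, at times {1}, {2,3}, {4}, {5,6}, {7,8,9}, {10};
   the blocks at {2,3}, {5,6} and {10} are the wanted ones *)

(* Step 2: k indexes a wanted block; k + 2 is the next wanted block *)
k = 1 + Boole@x[[2, 1, 3]];
While[k + 2 <= Length@x,
  If[x[[k, -1, 1]] + TMIN < x[[k + 2, 1, 1]],
    k += 2,  (* gap too long: move on to the next wanted block *)
    (* gap short enough: fuse blocks k, k+1, k+2 and retest at the same k *)
    x = Insert[Drop[x, {k, k + 2}], Join @@ Take[x, {k, k + 2}], k]]];

(* Step 3: keep blocks whose first term is wanted, then drop the tags *)
Map[Most, Select[x, #[[1, 3]] &], {2}]
(* {{{2, 7}, {3, 8}, {4, 3}, {5, 9}, {6, 6}}, {{10, 7}}} *)
```

The single dip at time 4 is absorbed into one run covering times 2 through 6, while the three-point dip at times 7 to 9 is too long and keeps time 10 as a separate run. Because a merged block is retested against the next wanted block without advancing k, chains of short gaps collapse in one pass, which covers the "combined in turn" case raised in the question.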
- References:
- grouping similar list elements with gaps
- From: Stern <nycstern@gmail.com>