Re: Fast List Manipulation & more

*To*: mathgroup at smc.vnet.net*Subject*: [mg22981] Re: [mg22965] Fast List Manipulation & more*From*: Hartmut Wolf <hwolf at debis.com>*Date*: Sat, 8 Apr 2000 14:44:48 -0400 (EDT)*Organization*: debis Systemhaus*References*: <200004070654.CAA16531@smc.vnet.net>*Sender*: owner-wri-mathgroup at wolfram.com

Mark Van De Vyver schrieb: > > I have been looking over previous posts re: fast list extraction, > manipulation, etc., and it seems that to get the 'fastest' routine depends > on the specific nature of the problem. I have not been able to recast my > problem into one of the existing mg examples, partly I suspect because I > cannot fully trandlate what is at time fairly cryptic code so I am posting > this request even though there are some excellent pieces of example code out > there (e.g. [mg12872] and the excerpts below) I've included (what seems to > be the fastest) snippets from previous posts that I thought might be > helpful. > > The problem is essentially this (using Mathematica 4.0 on NT 4.0 server): > > With list l={{i,x(i),y(i),..}}, and list il={i} extract the Max and Min of > all the x's, y's,... for each i. > In another setting we may also use an InterpolatingFunction, and in another > we want just the maximums of the x's and the maximums of just the y's > etc.but these can wait, unless there aresimple, fast solutions? > > il is built from l, but the i's are not neccessarily contiguous, both lists > can be small, growing large Length[il]=1 to 2000, Length[l]=1 to 100,000. > The relationship between the length of il and l is relatively > straightforward, if il is length n then l is length k(n)+n, where k is a > linear function of n, typically 3-50. > > Here i, x, y,.. are Real i, can be large (+/-) and x, y typically in the > range of -1 to1, unlikely to ever be 9+ (I've noticed in some examples that > this can make a difference!). In a seperate setting the x,y can be large, > in the order of 10^2 > One other criterion is memory :) > If a large l is fed in a small result list, say r, may be returned > Length[il]===Length[r]. > Since this is going to be looping v.v.v. frequently it's important that Mathematica > free memory, as well as be fast. (To be honest I've always stuggled to get > Mathematica to give back what it takes in memory, even following the usual > suggestions.) > > What I'd like to know is if anyone can come up with something fatser than > what I have below, and what do people think of some alternatives ideas > (hacked from other MG posts) that I've not been able to get to work :( > > Some ideas I had were to alter ss3 in [mg20688] to take the Max and Min > rather then look at an interval at a time, though I'm not sure how to > efficiently handle looking over x ,y, ... > > (* a slightly altered ss3 from [mg20688] *) > ss3[events_List, gaps_Interval] := > Module[{t = Length[events], > se = Sort[events], > bag = {}}, > Do[While[t > 0 && se[[t]] > gaps[[i, 2]], --t]; > While[t > 0 && se[[t]] >= gaps[[i, 1]], bag = {bag, t}; --t], {i, > Length[gaps], 1, -1}]; Delete[se, List /@ Flatten[bag]]] > > It may also be possible to alter [mg12872], but now rather than > /matchingselecting a single value from the larger list, we need to build a > list of matching values... > Another idea was to hack [mg21058] to build a list for each i and then look > over that for the maximum and minmum, again though I was having trouble > seeing how to modify this to tract the whole sequnces sublists {i, x, y,...} > for each i by modifying: > > (#2- #1) & @@@ > Cases[DeleteCases[Partition[t, 2, 1], Alternatives @@ nt, {2}], {_, _}] > > (* some data *) > n = 500; > tmp = Sort[ Flatten[Table[N[Random[Integer, {-n, n}]], {i, n}], 1]]; > l[n] = Table[{tmp[[i]] , Random[Real, {-9, 9}], Random[Real, {-9, 9}]}, > {i,n}]; > il[n] = Union[tmp]; > ClearAll[tmp, n]; > > (* MV1 - no laughing :) *) > mv1= > Compile[{{il, _Real, 1}, {l, _Real, 2}}, > Table[({il[[j]], Max[#1], Min[#1]}) & > [#1[[2]] & /@ Select[#1, #1[[1]] == il[[j]] &] & > [l]], > {j, 1, Length[il]} > ] > ]; > > (* Timing Pent Pro 200 *) > > mv1[il[20], l[20]]; // Timing > mv1[il[100], l[100]]; // Timing > mv1[il[500], l[500]]; // Timing > mv1[il[1000], l[1000]]; // Timing > > Out[171]= > {0. Second, Null} > > Out[172]= > {0.031 Second, Null} > > Out[173]= > {0.157 Second, Null} > > Out[174]= > {3.734 Second, Null} > > Out[175]= > {15.687 Second, Null} > Mark, I'm not shure whether I understand you, but this are my contemplations: First to set up the test data (and the problem) SeedRandom[7700660088] ...this is such that you can reproduce my results, and compare. Any test starts here. n = 500; ...this varied in the test with n=500, n=1000, n=100000. tmp = Sort[Flatten[Table[N[Random[Integer, {-n, n}]], {i, n}], 1]]; l = Table[{tmp[[i]], Random[Real, {-9, 9}], Random[Real, {-9, 9}]}, {i, n}]; xtr = Union[Table[{Random[Integer, {1, n}]}, {40}]] ...in the tests I used n/10 instead of 40 here. il = Union[Extract[tmp, xtr]] Length[il] 37 So not all elements of il need be in First/@ l, but there is none of il not in First/@ l To define the problem we execute a simple procedure #1, not meant to be the production program, also used to verify a faster program. pos = Flatten[Position[l, {#, _, _}] & /@ il, 1] ... we get the positions of the il in l, there are Length[pos] 54 ...hits (some more as are elements of il) el = Extract[l, pos] ...we extract the elements of l at those positions Union @ (First /@ el) === il True ...correct eel = Transpose[Rest /@ el]; {Min[#], Max[#]} & /@ eel {{-8.76991, 8.91636}, {-8.90909, 8.9885}} ...this is the minimum and maximum of the x and the y from the positions as determined by il. Now two proposals: procedure #2: Off[Part::"partw"] Module[{tl = 1, ti = 0, leni = Length[il], min = Infinity {1, 1}, max = -Infinity {1, 1}, vals}, While[++ti <= leni, While[l[[tl, 1]] < il[[ti]], tl++]; While[l[[tl, 1]] == il[[ti]], min = MapThread[Min, {min, vals = Rest[l[[tl++]]]}]; max = MapThread[Max, {max, vals}]] ]; Transpose[{min, max}]] {{-8.76991, 8.91636}, {-8.90909, 8.9885}} On[Part::"partw"] ...when il refers to the last element of l, there will be the error message, but then all the work has already been done and the result is correct. procedure #3 ff[{min_, max_}, {i_, x_, y_}] := If[x === -n, ilast = i; {min, max}, If[i === ilast, {MapThread[Min, {min, {x, y}}], MapThread[Max, {max, {x, y}}]}, {min, max}]] Module[{iil, sl}, iil = Transpose[{il, #, #}] &[Table[-n, {Length[il]}]]; sl = Sort[Join[ iil, l]]; Block[{ilast = Undefined}, Transpose[ Fold[ff, {{Infinity, Infinity}, {-Infinity, -Infinity}}, sl]]] ] {{-8.76991, 8.91636}, {-8.90909, 8.9885}} Now the results from timing on my machine (the first ondex on "time" is n, the second is the # of the procedure: time[500, 1] = {1.1420000000000101*Second, {{-8.769911152534853, 8.91635750026802}, {-8.909087068099602, 8.98850295101392}}} time[500, 2] = {0.12999999999999545*Second, {{-8.769911152534853, 8.91635750026802}, {-8.909087068099602, 8.98850295101392}}} time[500, 3] = {0.20999999999999375*Second, {{-8.769911152534853, 8.91635750026802}, {-8.909087068099602, 8.98850295101392}}} time[1000, 1] = {4.646000000000001*Second, {{-8.84899469707297, 8.687342737143382}, {-8.96776630933692, 8.933065726429106}}} time[1000, 2] = {0.2599999999999909*Second, {{-8.84899469707297, 8.687342737143382}, {-8.96776630933692, 8.933065726429106}}} time[1000, 3] = {0.42099999999999227*Second, {{-8.84899469707297, 8.687342737143382}, {-8.96776630933692, 8.933065726429106}}} time[100000, 2] = {24.796000000000035*Second, {{-8.998994945020392, 8.999526141047475}, {-8.998287850696661, 8.999946643578806}}} time[100000, 3] = {44.584*Second, {{-8.998994945020392, 8.999526141047475}, {-8.998287850696661, 8.999946643578806}}} You see that both procedures #2 and #3 seem to be linear in n. This is should be the case for #2, yet not quite for #3 since there is a Sort in it. But #1 appears to be at least quadratic in n and I avoided timing it with n=100000. Finally a remark: In procedure #3 I used -n to mark off iil iil = Transpose[{il, #, #}] &[Table[-n, {Length[il]}]]; I would have like to use -Infinity instead, but to my greatest astonishment I observed Sort[{Infinity, -Infinity, 5, -5}] {-5, 5, -\[Infinity], \[Infinity]} OrderedQ[%] True Max[{-Infinity, 5}] 5 Min[{-Infinity, 5}] -\[Infinity] Min[{Infinity, 5}] 5 Max[{Infinity, 5}] \[Infinity] Hartmut

**References**:**Fast List Manipulation & more***From:*"Mark Van De Vyver" <mvyver@bigfoot.com>