Re: Fast List Manipulation & more - some results
- To: mathgroup at smc.vnet.net
- Subject: [mg23052] Re: Fast List Manipulation & more - some results
- From: Mark Van De Vyver <mvdv at bigfoot.com>
- Date: Sat, 15 Apr 2000 03:00:17 -0400 (EDT)
- References: <8cme1k$1vb@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
Further to my previous post and Hartmuts suggestions I've a) clarified (I think) the problem desc, b) provided some improved versions with timings I make some some -*** observations ***- along the way in case anyone is interested.. also if any of the gurus read this and think it's about as fast as can be achieved, could they post a brief 'that's as good as it will get' note it just to let me (and possibly others) know. Note the example code I posted was incorrect in that the list structure fed in was not as I'd thought it was. The following functions all do the same thing, though output format may vary. a) The Problem(s) (some code to generate an example data set is at the end of this post) Take a list, l, in the form {{i,x,y}}, and il in the form {i} l={{1., 3.82217, 7.49982}, {5., 0.802604, 5.05635}, {3., 4.80672, 0.269348}, {5., 7.63638, 5.88913}, {2., 4.14006, 0.0792415}, {3., 7.51374, 8.02024}, {4., 6.09094, 1.64494}, {5., 1.15211, 1.22434}, {3., 0.173347, 2.83916}, {2., 4.7505, 5.24301}}; il={1., 2., 3., 4., 5.}; initially the problem was to generate two lists, rl1 and rl2. rl1 is the max of ALL x's and y's for each i, i.e. {i,Max[x's,y's],Min[x's,y's]} rl2 is the cummulative list of x's and y's for each i, i.e. {i,x's,y's} A second version, which seemed too involved and which Hartmut posted a partial solution for was to take the Max/Min of the x's and the Max/Min of the y's seperately, which was easy when he pointed out using MapThread. i.e. {i,{Max[x's],Min[x's]},{Max[x's],Min[x's]}} rl2 is the cummulative list of x's and y's for each i, i.e. {i,{x's},{y's}} For the moment we leave aside the issue of generating rl2. *** It is generating rl1 for the last version of this problem that this post relates to, since the first version, and rl2 should be simpler cases (?) *** So: rl1={{1., {3.82217, 7.49982}, {3.82217, 7.49982}}, {2., {4.7505, 5.24301}, {4.14006, 0.0792415}}, {3., {7.51374, 8.02024}, {0.173347, 0.269348}}, {4., {6.09094, 1.64494}, {6.09094, 1.64494}}, {5., {7.63638, 5.88913}, {0.802604, 1.22434}}}} My initial function was wrong and is very slow for larger problems though it seems fastest for small versions where number Length[il]=5 and Length[l]=10 The timing table output in the code below shows (for different relative list lengths): Length[il] is approx quarter Length[l] n m[2] m[3] m[4] m[5] m[6] m[7] 10 0.047 0.016 0.014 0. 0.016 0.016 50 0.047 0.061 0.063 0.061 0.063 0.061 100 0.108 0.110 0.108 0.110 0.125 0.125 1000 2.5 1.233 1.280 1.188 1.25 1.202 3000 16.625 3.280 3.327 3.235 3.264 3.438 10000 178.43 12.36 12.93 12.719 12.84 12.46 Length[il] is approx half Length[l] n m[2] m[3] m[4] m[5] m[6] m[7] 10 0.015 0.014 0.016 0.014 0. 0.015 50 0.062 0.062 0.077 0.063 0.061 0.078 100 0.11 0.13 0.14 0.14 0.13 0.141 1000 4.233 1.344 1.375 1.265 1.343 1.327 3000 34.20 3.953 4.093 3.922 3.969 4.046 10000 389.63 15.76 16.18 15.48 15.57 15.93 Length[il] is approx Length[l] n m[2] m[3] m[4] m[5] m[6] m[7] 10 0.047 0.016 0.014 0.016 0.016 0.014 50 0.077 0.079 0.077 0.077 0.094 0.077 100 0.172 0.186 0.202 0.172 0.202 0.188 1000 8.717 1.704 1.764 1.641 1.688 1.719 3000 73.93 5.905 6.016 5.858 5.719 6.061 10000 876.9 16.78 17.87 15.87 17. 17.87 I've some questions which I hope someone can answer.. 1) For m[3],m[4] etc why does placing the definition of u2 inside the module m[3], etc. speed things slightly? the code below has u1 ans u2 defined ourtside of m[*], but drop it inside and you'll see what I mean. 2) For m[6] vs m[7] and m[5] vs m[6] why does using Dispatch not make much of a difference 3) Why does applying Compile not work, note you'll need to use the Dispatch versions since Compile does not seem to like the /. x_?NumberQ versions. However even then it still refuses to use the Compiled version. 4) Any remarks on m[4] vs m[5] From what I've been reading in the posts of this group I really expected m[5] using dispatch to be much faster - (see [mg12872] Re:Re:Efficient way to merge lists--Programming help) 5) Can anyone tweak these further? Since m[?] will be used many, many times in an iteration slight improvements will be significant. Now some remarks about each function. A requirement is that the function m[*] should allow mi and ma to be initialised externally, i.e. so we can loop over m[] passing different lists, l for the same il. I've included initialisation inside m[*] inorder to facilitate building the timing table... m[1]: This is Hartmuts earlier post altered to iterate over the whole of l, and il. It's good for small il and l, and in it's original form. m[2]: Scan over with an update function that replaces an element of a list of maximums and minimums. m[3]: Same as m[2] except the update function now sets the value of a variable rather than replace part of a list. This seems to be the most significant seed improvement. -**** Defining u2 within the module for m[] speeds things slightly again ****- m[4] thru m[7]: These are variations using Dispatch for one /. and /@ instead of Scan. TIA Mark (********** START CUT and PASTE HERE **************) (* Define a function to update the max and min values *) u1[{x_, y_, z___}, il_List] := Module[{pos, vals = {y, z}}, pos = Position[il, x][[1, 1]]; allmin[[pos]] = MapThread[Min, {Part[allmin, pos], vals}]; allmax[[pos]] = MapThread[Max, {Part[allmax, pos], vals}]; (* End of Module *)]; u2[{x_, y_, z___}] := Module[{omi = mi[x], oma = ma[x], vals = {y, z}}, mi[x] = MapThread[Min, {omi, vals}]; ma[x] = MapThread[Max, {oma, vals}]; (* End of Module *)]; m[1][il_List, l_List] := Module[{tl = 1, ti = 0, lenil = Length[il], lenl = Length[l], min, max,vals, rl}, Off[Part::"partw"]; rl = {}; While[++ti <= lenil, tl = 1; min = Infinity {1, 1}; max = -min; While[l[[tl, 1]] != il[[ti]], tl++; (* EoWhile l[[tl, 1]] != *)]; While[l[[tl, 1]] === il[[ti]] && tl <= lenl, min = MapThread[Min, {min, vals = Rest[l[[tl]]]}]; max = MapThread[Max, {max, vals}]; tl++; While[l[[tl, 1]]1 != il[[ti]], tl++;(* EoWhile l[[tl, 1]] != *)]; (* EoWhile l[[tl, 1]] == *)]; rl = {rl, {il[[ti]], max, min}}; (* EoWhile ++ti *)]; On[Part::"partw"]; Partition[Flatten[rl], 5]] m[2][il_List, l_List] := Module[{i = 1, j = 1, k = 1}, ClearAll[allmin, allmax]; Off[Set::"partw", Part::"partw"]; allmin = Table[{Infinity, Infinity}, {Length[il]}]; allmax = -allmin; Scan[(u1[#, il]) &, l]; On[Set::"partw", Part::"partw"]; il /. x_?NumberQ :> {il[[i++]], allmax[[j++]], allmin[[k++]]} (* End of Module *)] m[3][il_List, l_List] := Module[{i = 1, j = 1, k = 1, tmpl}, Off[Set::"partw", Part::"partw"]; (mi[#] = {Infinity, Infinity}) & /@il; (ma[#] = -{Infinity, Infinity}) & /@ il; Scan[(u2[#]) &, l]; On[Set::"partw", Part::"partw"]; tmpl = il /. x_?NumberQ :> {x, ma[x], mi[x]}; ClearAll[mi, ma]; tmpl (* End of Module *)] m[4][il_List, l_List] := Module[{i = 1, j = 1, k, tmpl}, Off[Set::"partw", Part::"partw"]; r = Dispatch[# -> {#, ma[#], mi[#]} & /@ il]; (mi[#] = {Infinity, Infinity}) & /@ il; (ma[#] = -{Infinity, Infinity}) & /@ il; Scan[(u2[#]) &, l]; On[Set::"partw", Part::"partw"]; tmpl = il /. r; ClearAll[r]; tmpl (* End of Module *)] m[5][il_List, l_List] := Module[{i = 1, j = 1, k, tmpl}, Off[Set::"partw", Part::"partw"]; r = Dispatch[# -> {#, ma[#], mi[#]} & /@ il]; (mi[#] = {Infinity, Infinity}) & /@ il; (ma[#] = -{Infinity, Infinity}) & /@ il; (u2[#]) & /@ l; On[Set::"partw", Part::"partw"]; tmpl = il /. r; ClearAll[mi, ma, r]; tmpl (* End of Module *)] m[6][il_List, l_List] := Module[{i = 1, j = 1, k, tmpl}, Off[Set::"partw", Part::"partw"]; (mi[#] = {Infinity, Infinity}) & /@ il; (ma[#] = -{Infinity, Infinity}) & /@ il; l /. {x_, y_, z_} :> (u2[{x, y, z}]); tmpl = il /. x_?NumberQ :> {x, ma[x], mi[x]}; On[Set::"partw", Part::"partw"]; ClearAll[mi, ma, r]; tmpl (* End of Module *)] m[7][il_List, l_List] := Module[{i = 1, j = 1, k, tmpl}, Off[Set::"partw", Part::"partw"]; r = Dispatch[# -> {#, ma[#], mi[#]} & /@ il]; (mi[#] = {Infinity, Infinity}) & /@ il; (ma[#] = -{Infinity, Infinity}) & /@ il; l /. {x_, y_, z_} :> (u2[{x, y, z}]); tmpl = il /. r; On[Set::"partw", Part::"partw"]; ClearAll[mi, ma, r]; tmpl (* End of Module *)] (* Generate data for timing table *) Function[{n}, timel[n] = Table[{N[Random[Integer, {1, n/2}]], Random[Real, 9], Random[Real, 9]}, {n}];] /@ {10, 50, 100, 1000, 3000, 10000}; Function[{n}, timeil[n] = Union[First /@ timel[n]];] /@ {10, 50, 100, 1000, 3000, 10000}; (* Generate Timing Table *) TableForm[ Table[ (m[i][timeil[#], timel[#]]; // Timing // First)/Second, {i, 2, 7} ] & /@ {10, 50, 100, 1000, 3000, 10000} , TableHeadings -> {{10, 50, 100, 1000, 3000, 10000}, Table[m[i], {i, 2, 7}]}] (********** END CUT and PASTE HERE **************) -- Mark Van De Vyver Department of Accounting and Finance University of Western Australia mailto:mvdv at bigfoot.com http://www.bigfoot.com/~mvdv Sent via Deja.com http://www.deja.com/ Before you buy.