Re: Fast List Manipulation & more - some results

• To: mathgroup at smc.vnet.net
• Subject: [mg23052] Re: Fast List Manipulation & more - some results
• From: Mark Van De Vyver <mvdv at bigfoot.com>
• Date: Sat, 15 Apr 2000 03:00:17 -0400 (EDT)
• References: <8cme1k\$1vb@smc.vnet.net>
• Sender: owner-wri-mathgroup at wolfram.com

```Further to my previous post and Hartmuts suggestions I've
a) clarified (I think) the problem desc,
b) provided some improved versions with timings

I make some some -*** observations ***- along the way in case anyone is
interested..  also if any of the gurus read this and think it's about
as fast as can be achieved, could they post a brief 'that's as good as
it will get' note it just to let me (and possibly others) know.

Note the example code I posted was incorrect in that the list structure
fed in was not as I'd thought it was.  The following functions all do
the same thing, though output format may vary.

a) The Problem(s)
(some code to generate an example data set is at the end of this post)
Take a list, l, in the form {{i,x,y}}, and il in the form {i}

l={{1., 3.82217, 7.49982}, {5., 0.802604, 5.05635}, {3., 4.80672,
0.269348}, {5., 7.63638, 5.88913}, {2., 4.14006, 0.0792415}, {3.,
7.51374, 8.02024}, {4., 6.09094, 1.64494}, {5., 1.15211, 1.22434}, {3.,
0.173347, 2.83916}, {2., 4.7505, 5.24301}};

il={1., 2., 3., 4., 5.};

initially the problem was to generate two lists, rl1 and rl2.
rl1 is the max of ALL x's and y's for each i,
i.e. {i,Max[x's,y's],Min[x's,y's]}
rl2 is the cummulative list of x's and y's for each i,
i.e. {i,x's,y's}

A second version, which seemed too involved and which Hartmut
posted a partial solution for was to take the Max/Min of the x's and
the Max/Min of the y's seperately, which was easy when he pointed out

i.e. {i,{Max[x's],Min[x's]},{Max[x's],Min[x's]}}
rl2 is the cummulative list of x's and y's for each i,
i.e. {i,{x's},{y's}}
For the moment we leave aside the issue of generating rl2.

*** It is generating rl1 for the last version of this problem that this
post relates to, since the first version, and rl2 should be simpler
cases (?) ***

So:

rl1={{1., {3.82217, 7.49982}, {3.82217, 7.49982}}, {2., {4.7505,
5.24301}, {4.14006, 0.0792415}}, {3., {7.51374, 8.02024}, {0.173347,
0.269348}}, {4., {6.09094, 1.64494}, {6.09094, 1.64494}}, {5.,
{7.63638, 5.88913}, {0.802604, 1.22434}}}}

My initial function was wrong and is very slow for larger problems
though it seems fastest for small versions where number Length[il]=5
and Length[l]=10

The timing table output in the code below shows (for different relative
list lengths):

Length[il] is approx quarter Length[l]
n	m[2]	m[3]	m[4]	m[5]	m[6]	m[7]
10	0.047	0.016	0.014	0.	0.016	0.016
50	0.047	0.061	0.063	0.061	0.063	0.061
100	0.108	0.110	0.108	0.110	0.125	0.125
1000	2.5	1.233	1.280	1.188	1.25	1.202
3000	16.625	3.280	3.327	3.235	3.264	3.438
10000	178.43	12.36	12.93	12.719	12.84	12.46

Length[il] is approx half Length[l]
n	m[2]	m[3]	m[4]	m[5]	m[6]	m[7]
10	0.015	0.014	0.016	0.014	0.	0.015
50	0.062	0.062	0.077	0.063	0.061	0.078
100	0.11	0.13	0.14	0.14	0.13	0.141
1000	4.233	1.344	1.375	1.265	1.343	1.327
3000	34.20	3.953	4.093	3.922	3.969	4.046
10000	389.63	15.76	16.18	15.48	15.57	15.93

Length[il] is approx Length[l]
n	m[2]	m[3]	m[4]	m[5]	m[6]	m[7]
10	0.047	0.016	0.014	0.016	0.016	0.014
50	0.077	0.079	0.077	0.077	0.094	0.077
100	0.172	0.186	0.202	0.172	0.202	0.188
1000	8.717	1.704	1.764	1.641	1.688	1.719
3000	73.93	5.905	6.016	5.858	5.719	6.061
10000	876.9	16.78	17.87	15.87	17.	17.87

I've some questions which I hope someone can answer..

1)  For m[3],m[4] etc why does placing the definition of u2 inside the
module m[3], etc. speed things slightly? the code below has u1 ans u2
defined ourtside of m[*], but drop it inside and you'll see what I mean.

2)  For m[6] vs m[7] and m[5] vs m[6] why does using Dispatch not make
much of a difference

3)  Why does applying Compile not work, note you'll need to use the
Dispatch versions since Compile does not seem to like the /. x_?NumberQ
versions. However even then it still refuses to use the Compiled
version.

4)  Any remarks on m[4] vs m[5] From what I've been reading in the
posts of this group I really expected m[5] using dispatch to be much
faster - (see [mg12872] Re:Re:Efficient way to merge lists--Programming
help)

5)  Can anyone tweak these further?  Since m[?] will be used many, many
times in an iteration slight improvements will be significant.

Now some remarks about each function.  A requirement is that the
function m[*] should allow mi and ma to be initialised externally, i.e.
so we can loop over m[] passing different lists, l for the same il.
I've included initialisation inside m[*] inorder to facilitate building
the timing table...

m[1]:
This is Hartmuts earlier post altered to iterate over the whole of l,
and il.  It's good for small il and l, and in it's original form.

m[2]:
Scan over with an update function that replaces an element of a list of
maximums and minimums.

m[3]:
Same as m[2] except the update function now sets the value of a
variable rather than replace part of a list.  This seems to be the most
significant seed improvement.

-**** Defining u2 within the module for m[] speeds things slightly
again  ****-

m[4] thru m[7]:
These are variations using Dispatch for one /. and /@ instead of Scan.

TIA
Mark

(********** START CUT and PASTE HERE **************)

(*  Define a function to update the max and min values  *)

u1[{x_, y_, z___}, il_List] := Module[{pos, vals = {y, z}},
pos = Position[il, x][[1, 1]];
allmin[[pos]] = MapThread[Min, {Part[allmin, pos], vals}];
allmax[[pos]] = MapThread[Max, {Part[allmax, pos], vals}];
(*  End of Module  *)];

u2[{x_, y_, z___}] := Module[{omi = mi[x], oma = ma[x], vals = {y, z}},
(*  End of Module  *)];

m[1][il_List, l_List] :=
Module[{tl = 1, ti = 0, lenil = Length[il], lenl = Length[l], min,
max,vals, rl},
Off[Part::"partw"];
rl = {};
While[++ti <= lenil,
tl = 1;
min = Infinity {1, 1};
max = -min;

While[l[[tl, 1]] != il[[ti]],
tl++;
(* EoWhile l[[tl, 1]] != *)];

While[l[[tl, 1]] === il[[ti]] && tl <= lenl,
min = MapThread[Min, {min, vals = Rest[l[[tl]]]}];
tl++;
While[l[[tl, 1]]1 != il[[ti]],
tl++;(* EoWhile l[[tl, 1]] != *)];

(* EoWhile l[[tl, 1]] == *)];

rl = {rl, {il[[ti]], max, min}};

(* EoWhile ++ti *)];
On[Part::"partw"];
Partition[Flatten[rl], 5]]

m[2][il_List, l_List] := Module[{i = 1, j = 1, k = 1},
ClearAll[allmin, allmax];
Off[Set::"partw", Part::"partw"];
allmin = Table[{Infinity, Infinity}, {Length[il]}];
allmax = -allmin;
Scan[(u1[#, il]) &, l];
On[Set::"partw", Part::"partw"];
il /. x_?NumberQ :> {il[[i++]], allmax[[j++]], allmin[[k++]]}
(*  End of Module  *)]

m[3][il_List, l_List] := Module[{i = 1, j = 1, k = 1, tmpl},
Off[Set::"partw", Part::"partw"];
(mi[#] = {Infinity, Infinity}) & /@il;
(ma[#] = -{Infinity, Infinity}) & /@ il;
Scan[(u2[#]) &, l];
On[Set::"partw", Part::"partw"];
tmpl = il /. x_?NumberQ :> {x, ma[x], mi[x]};
ClearAll[mi, ma];
tmpl
(*  End of Module  *)]

m[4][il_List, l_List] := Module[{i = 1, j = 1, k, tmpl},
Off[Set::"partw", Part::"partw"];
r = Dispatch[# -> {#, ma[#], mi[#]} & /@ il];
(mi[#] = {Infinity, Infinity}) & /@ il;
(ma[#] = -{Infinity, Infinity}) & /@ il;
Scan[(u2[#]) &, l];
On[Set::"partw", Part::"partw"];
tmpl = il /. r;
ClearAll[r];
tmpl
(*  End of Module  *)]

m[5][il_List, l_List] := Module[{i = 1, j = 1, k, tmpl},
Off[Set::"partw", Part::"partw"];
r = Dispatch[# -> {#, ma[#], mi[#]} & /@ il];
(mi[#] = {Infinity, Infinity}) & /@ il;
(ma[#] = -{Infinity, Infinity}) & /@ il;
(u2[#]) & /@ l;
On[Set::"partw", Part::"partw"];
tmpl = il /. r;
ClearAll[mi, ma, r];
tmpl
(*  End of Module  *)]

m[6][il_List, l_List] := Module[{i = 1, j = 1, k, tmpl},
Off[Set::"partw", Part::"partw"];
(mi[#] = {Infinity, Infinity}) & /@ il;
(ma[#] = -{Infinity, Infinity}) & /@ il;
l /. {x_, y_, z_} :> (u2[{x, y, z}]);
tmpl = il /. x_?NumberQ :> {x, ma[x], mi[x]};
On[Set::"partw", Part::"partw"];
ClearAll[mi, ma, r];
tmpl
(*  End of Module  *)]

m[7][il_List, l_List] := Module[{i = 1, j = 1, k, tmpl},
Off[Set::"partw", Part::"partw"];
r = Dispatch[# -> {#, ma[#], mi[#]} & /@ il];
(mi[#] = {Infinity, Infinity}) & /@ il;
(ma[#] = -{Infinity, Infinity}) & /@ il;
l /. {x_, y_, z_} :> (u2[{x, y, z}]);
tmpl = il /. r;
On[Set::"partw", Part::"partw"];
ClearAll[mi, ma, r];
tmpl
(*  End of Module  *)]

(*  Generate data for timing table  *)

Function[{n},
timel[n] =
Table[{N[Random[Integer, {1, n/2}]], Random[Real, 9],
Random[Real, 9]}, {n}];] /@
{10, 50, 100, 1000, 3000, 10000};
Function[{n},
timeil[n] =
Union[First /@ timel[n]];] /@ {10, 50, 100, 1000, 3000, 10000};

(*  Generate Timing Table  *)

TableForm[
Table[ (m[i][timeil[#], timel[#]]; // Timing // First)/Second, {i, 2,
7} ] & /@ {10, 50, 100, 1000, 3000, 10000} ,
TableHeadings -> {{10, 50, 100, 1000, 3000, 10000},
Table[m[i], {i, 2, 7}]}]

(********** END CUT and PASTE HERE **************)

--
Mark Van De Vyver
Department of Accounting and Finance
University of Western Australia

mailto:mvdv at bigfoot.com
http://www.bigfoot.com/~mvdv

Sent via Deja.com http://www.deja.com/