AW: Appending to Lists

*To*: mathgroup at smc.vnet.net*Subject*: [mg27112] AW: [mg27045] Appending to Lists*From*: "Wolf, Hartmut" <Hartmut.Wolf at t-systems.de>*Date*: Sun, 4 Feb 2001 02:58:40 -0500 (EST)*Sender*: owner-wri-mathgroup at wolfram.com

-----Ursprüngliche Nachricht----- Von: James Jones [mailto:j.k.jones at dl.ac.uk] Gesendet: Donnerstag, 1. Februar 2001 09:01 An: mathgroup at smc.vnet.net Betreff: [mg27045] Appending to Lists Hi, I have a function that creates a list (a) from another list (b). The list elements are re-grouped in the new list according to a third list (c). A Position command is applied to list (c) for an element, then with this output the list (a) is created from list (b) at positions given by the element position data, list (c). This is repeated for the large number of elements in the original lists. The Position command is necessary as different elements appear in the list a different number of times. However, with the large number of elements in the lists (approx 50,000 for a simple list), this method is _very_ slow. If any one can give me help in speeding this process up I would be very grateful. The data sets would look like this b c 0.2 1 0.6 2 1.2 3 -0.2 1 0.5 2 0.3 1 0.7 2 -0.2 1 -0.6 1 A List would then be created from this data ( the list (a) ) containing vectors for 1, 2 and 3. The data in (b) is not important, and the order in which elements in (c) drop out is not set. In this case the (a) list should look like a = { { 0.2, -0.2, -0.2, -0.6} , {0.6, 0.5, 0.7} , { 1.2 } } My current function looks like this Do[AppendTo[xfinal, Flatten[Part[X, #] & /@ Position[Global`PARTICLE, i]]], {i, 1, Max[PARTICLE]}]; where xfinal is an (a) list, i.e. to be created. X is the (b) list , i.e. to be addressed, and PARTICLE is the (c) list. It is referenced by number. and it is very slow! Also, after producing this list, the different vector elements need to be made the same length, and so 0.0 are added to the ends of all vector elements shorter than the longest. My current function for doing this looks like table = Table[0.0, {Length[First[long]]}]; Print["Table Created!"]; Do[If[Length[Part[xfinal, i]] < Length[First[long]], AppendTo[Part[xfinal, i], Drop[table, (Length[Part[xfinal, i]])] ]], {i, 2, Length[xfinal]}]; where list (long) just sorts the list elements according to length. This function is also very slow, and I was wondering, again, if anyone knew a faster way of implementing this. Is the production of a table, once, and then dropping bits off and appending the fastest method? Of course this needs to be done tens of thousands of times per set of data so any small speed increase would be very helpful ;-> Again, any help much appreciated, James Jones Daresbury Laboratory Dear Jones, here a proposal, how to do things eficient in input size: b = {0.2, 0.6, 1.2, -0.2, 0.5, 0.3, 0.7, -0.2, -0.6}; c = {1, 2, 3, 1, 2, 1, 2, 1, 1}; Transpose[{c, b}] Out[]= {{1, 0.2}, {2, 0.6}, {3, 1.2}, {1, -0.2}, {2, 0.5}, {1, 0.3}, {2, 0.7}, {1, -0.2}, {1, -0.6}} your measured values are now tagged with the particle (or is it a counter?) Sorting these, means grouping of measurements to p.or c. rawl = Sort@Transpose[{c, b}] Out[]= {{1, -0.6}, {1, -0.2}, {1, -0.2}, {1, 0.2}, {1, 0.3}, {2, 0.5}, {2, 0.6}, {2, 0.7}, {3, 1.2}} split makes this explicit in structure Split[Sort@Transpose[{c, b}], First[#1] === First[#2] &] Out[]= {{{1, -0.6}, {1, -0.2}, {1, -0.2}, {1, 0.2}, {1, 0.3}}, {{2, 0.5}, {2, 0.6}, {2, 0.7}}, {{3, 1.2}}} Now we dispose off the tags l = Map[Last, Split[Sort@Transpose[{c, b}], First[#1] === First[#2] &], {2}] Out[]= {{-0.6, -0.2, -0.2, 0.2, 0.3}, {0.5, 0.6, 0.7}, {1.2}} We get the maximum length: maxlen = Max @@ Length /@ l Out[]=5 This makes all records to equal length (however I wonder why?) PadRight[#, maxlen, 0.] & /@ l Out[]= {{0.2, -0.2, 0.3, -0.2, -0.6}, {0.6, 0.5, 0.7, 0., 0.}, {1.2, 0., 0., 0., 0.}} You obviously lost information through this procedure. Alternatively you may group your events to p.or c. e.g.like this With[{t = Table[0. , {3}]}, ReplacePart[t, #2, #1] & @@@ rawl]// Transpose Out[]= {{-0.6, -0.2, -0.2, 0.2, 0.3, 0., 0., 0., 0.}, {0., 0., 0., 0., 0., 0.5, 0.6, 0.7, 0.}, {0., 0., 0., 0., 0., 0., 0., 0., 1.2}} You might like to check whether this expression is more adequate for further processing (case of the *un*paded result isn't) Test for performance 3 Particles or Counters, 100,000 events: In[6]:= nWhat = 3; nEvents = 100000; In[7]:= btest = Table[Random[Real, {-1, 1}], {nEvents}]; In[8]:= ctest = Table[Random[Integer, {1, nWhat}], {nEvents}]; In[9]:= Timing[With[{ltest = Flatten[Part[btest, #] & /@ Position[ctest, #]] & /@ Range[nWhat]}, With[{maxlen = Max @@ Length /@ ltest}, PadRight[#, maxlen, 0.] & /@ ltest]]][[1]] Out[9]= 1.392 Second In[10]:= Timing[With[{t0 = Table[0., {nWhat}]}, ReplacePart[t0, #2, #1] & @@@ Transpose[{ctest, btest}]] // Transpose][[1]] Out[10]= 2.253 Second Test 30 Particles or Counters, 100,000 events In[20]:= nWhat = 30; nEvents = 100000; ... In[23]:= Timing[With[{ltest = Flatten[Part[btest, #] & /@ Position[ctest, #]] & /@ Range[nWhat]}, With[{maxlen = Max @@ Length /@ ltest}, PadRight[#, maxlen, 0.] & /@ ltest]]][[1]] Out[23]= 3.465 Second In[24]:= Timing[With[{t0 = Table[0., {nWhat}]}, ReplacePart[t0, #2, #1] & @@@ Transpose[{ctest, btest}]] // Transpose][[1]] Out[24]= 4.236 Second Test 300 Particles or Counters, 30,000 events The alternative eats up more memory, so I reduced the # of events here (for my notebook computer. Yet both algorithms are clearly linear in # of events) In[25]:= nWhat = 300; nEvents = 30000; ... In[28]:= Timing[With[{ltest = Flatten[Part[btest, #] & /@ Position[ctest, #]] & /@ Range[nWhat]}, With[{maxlen = Max @@ Length /@ ltest}, PadRight[#, maxlen, 0.] & /@ ltest]]][[1]] Out[28]= 9.564 Second In[29]:= Timing[With[{t0 = Table[0., {nWhat}]}, ReplacePart[t0, #2, #1] & @@@ Transpose[{ctest, btest}]] // Transpose][[1]] Out[29]= 4.857 Second so you see the alternative behaves better for larger numbers of particles (or was it counters?) Test 1000 Particles or Counters, 10,000 events In[30]:= nWhat = 1000; nEvents = 10000; ... In[33]:= Timing[With[{ltest = Flatten[Part[btest, #] & /@ Position[ctest, #]] & /@ Range[nWhat]}, With[{maxlen = Max @@ Length /@ ltest}, PadRight[#, maxlen, 0.] & /@ ltest]]][[1]] Out[33]= 16.193 Second In[34]:= Timing[With[{t0 = Table[0., {nWhat}]}, ReplacePart[t0, #2, #1] & @@@ Transpose[{ctest, btest}]] // Transpose][[1]] Out[34]= 5.238 Second So, if you have enough real(!) memory, the alternative is clearly superior, and still contains all information -- Hartmut Wolf