MathGroup Archive 2010

Re: Combining data from indexed lists efficiently

  • To: mathgroup at smc.vnet.net
  • Subject: [mg106218] Re: [mg106182] Combining data from indexed lists efficiently
  • From: DrMajorBob <btreat1 at austin.rr.com>
  • Date: Tue, 5 Jan 2010 01:46:51 -0500 (EST)
  • References: <201001041059.FAA21250@smc.vnet.net>
  • Reply-to: drmajorbob at yahoo.com

I think your output implies you had five initial lists, but you only listed 3.

Still... if I understand you correctly, here's a solution:

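Clear[teach]
SetAttributes[teach, HoldFirst]
(* Define f[index] = value for each {index, value} pair in list and
   return the indices newly defined this way (first occurrence wins). *)
teach[f_Symbol, list_List] :=
  Module[{x},
   First@Last@
     Reap[(x = First@#; Head@f@x === f && (Sow@x; f[x] = Last@#)) & /@
       list]
   ]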

list1 = {{"A", 1}, {"B", 2}, {"C", 3}, {"D", 4}};
list2 = {{"A", 5}, {"B", 6}, {"D", 7}, {"E", 8}};
list3 = {{"A", 9}, {"B", 10}, {"C", 11}};
lists = {list1, list2, list3};
functions = {f1, f2, f3};
Scan[Clear, functions]
(* teach each function its list, then keep the indices common to all *)
indices = Intersection @@ teach @@@ Transpose@{functions, lists};
(* pair each common index with its values from all the lists *)
{#, Through[functions@#]} & /@ indices

{{"A", {1, 5, 9}}, {"B", {2, 6, 10}}}

Now f1, f2, and f3 are functions that retrieve values from the lists.
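
The question also asked for derived values alongside the raw ones, and its
derived-value functions are themselves named f1 and f2, which the Clear above
would wipe out. Here is a minimal sketch that keeps the two apart. It assumes
the lookup functions are renamed (v1, v2, v3 are hypothetical names, not from
the original code) and filled by plain assignment rather than by teach:

(* sample data and derived-value functions from the question *)
list1 = {{"A", 1}, {"B", 2}, {"C", 3}, {"D", 4}};
list2 = {{"A", 5}, {"B", 6}, {"D", 7}, {"E", 8}};
list3 = {{"A", 9}, {"B", 10}, {"C", 11}};
Clear[f1, f2, v1, v2, v3]
f1[list1Data_, list2Data_] := list1Data + list2Data
f2[list2Data_, list3Data_] := list2Data + list3Data

(* v1, v2, v3: hypothetical lookup functions, one definition per index *)
Scan[(v1[First@#] = Last@#) &, list1];
Scan[(v2[First@#] = Last@#) &, list2];
Scan[(v3[First@#] = Last@#) &, list3];

indices = Intersection[First /@ list1, First /@ list2, First /@ list3];

(* raw values first, then the two derived values *)
combinedList =
 Function[i,
   {i, {v1[i], v2[i], v3[i], f1[v1[i], v2[i]], f2[v2[i], v3[i]]}}] /@ indices

(* {{"A", {1, 5, 9, 6, 14}}, {"B", {2, 6, 10, 8, 16}}} *)

Unlike teach, plain assignment lets a later duplicate index overwrite an
earlier one, so the two differ if an index is repeated within a list.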

There are other ways... but this method MAY be faster on large lists.

(And maybe not.)
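
One way to find out is to time both approaches on synthetic data. The sketch
below is only a rough test harness built on assumptions: the list size n, the
integer indices, and the names big1, big2, big3, g1, g2, g3 are all made up
for illustration, and valueAtIndex is copied from the question.

n = 2000;  (* hypothetical size; raise it to test larger lists *)
SeedRandom[1];
{big1, big2, big3} =
  Table[Transpose@{RandomSample[Range[2 n], n], RandomReal[1, n]}, {3}];
common = Intersection[big1[[All, 1]], big2[[All, 1]], big3[[All, 1]]];

(* search each list for every index, as in the question *)
valueAtIndex[index_, list_] := Cases[list, {index, _}, 1, 1] // First // Last;
tCases = First@AbsoluteTiming[
    slow = Table[{i, valueAtIndex[i, #] & /@ {big1, big2, big3}}, {i, common}];];

(* one definition per index, looked up directly instead of searched for *)
Clear[g1, g2, g3];
tLookup = First@AbsoluteTiming[
    Scan[(g1[First@#] = Last@#) &, big1];
    Scan[(g2[First@#] = Last@#) &, big2];
    Scan[(g3[First@#] = Last@#) &, big3];
    fast = {#, {g1@#, g2@#, g3@#}} & /@ common;];

(* the two timings in seconds, and whether the results agree *)
{tCases, tLookup, slow === fast}

No timings are quoted here because they depend on the machine and the
Mathematica version; the last element only confirms that both approaches
return the same result.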

Bobby

On Mon, 04 Jan 2010 04:59:52 -0600, Steve W. Brewer <steve at take5.org> wrote:

> I have several lists of the format:
>
>     { {index1, value}, {index2, value}, ... {indexN, value} }
>
> For example:
>
>     list1 = { {"A", 1}, {"B",  2}, {"C",  3}, {"D", 4} }
>     list2 = { {"A", 5}, {"B",  6}, {"D",  7}, {"E", 8} }
>     list3 = { {"A", 9}, {"B", 10}, {"C", 11} }
>
> The indexes are not necessarily strings; they may be any expression.  (In
> the specific case I'm addressing now, each index is a list representing a
> date/time in the format returned by DateList[].)  The lists are not
> necessarily the same length.  Also, while most of the indexes appear in all
> lists, there are some holes (missing data).
>
> I want to combine the lists into a single list of the format:
>
>     { { index1, {value1, value2, ... valueN} },
>       { index2, {value1, value2, ... valueN} },
>       ...
>       { indexN, {value1, value2, ... valueN} } }
>
> Only the data points with indexes appearing in all lists should be included;
> the rest should be dropped.  Also, I want to include some derived values
> along with the original data values.
>
> Using the sample data above, let's say I want to include two derived values
> from the functions:
>
>     f1[list1Data_, list2Data_] := list1Data + list2Data
>     f2[list2Data_, list3Data_] := list2Data + list3Data
>
> The result would be:
>
>     combinedList = { { "A", {1, 5,  9, 6, 14} },
>                      { "B", {2, 6, 10, 8, 16} } }
>
> I have a solution that works fine on "small" data sets. However, it's
> impractically slow on the "large" data sets I really need to run it on (over
> 100k elements in each list).
>
> Here's what I'm doing now:
>
>
>     (* This part executes pretty quickly *)
>
>     indexesToUse =
>         Intersection[First /@ list1, First /@ list2, First /@ list3];
>
>     valueAtIndex[index_, list_] :=
>         Cases[list, {index, _}, 1, 1] // First // Last;
>
>     dataAtIndex[index_] := Block[
>         {v1, v2, v3, vf1, vf2},
>
>         v1 = valueAtIndex[index, list1];
>         v2 = valueAtIndex[index, list2];
>         v3 = valueAtIndex[index, list3];
>
>         vf1 = f1[v1, v2];
>         vf2 = f2[v2, v3];
>
>         {v1, v2, v3, vf1, vf2}
>     ];
>
>     (* This is where it bogs down *)
>
>     combinedList =
>         Function[{index}, {index, dataAtIndex[index]}] /@ indexesToUse;
>
>
> This is all inside an enclosing Module[] along with some other code, and the
> actual code is a little more complex (e.g. more than three lists, more than
> two derived-value functions).  The derived-value functions themselves are
> mostly simple algebra; I doubt they're the source of the bottleneck, and in
> any case, I can't change them.  (I *can* change the way they're applied,
> though, if it makes a difference.)
>
> I *think* the bottleneck is probably in my repeated calls to Cases[] to find
> particular data points, but that's just a guess.
>
> Is there a more efficient way of doing this that would speed things up
> significantly?
>
> Thanks!
>
>
> Steve W. Brewer
>
>


-- 
DrMajorBob at yahoo.com

