Re: a tricky List manipulation problem
- To: mathgroup at smc.vnet.net
- Subject: [mg16320] Re: [mg16211] a tricky List manipulation problem
- From: Carl Woll <carlw at fermi.phys.washington.edu>
- Date: Sun, 7 Mar 1999 01:05:44 -0500
- Sender: owner-wri-mathgroup at wolfram.com
How about the following approach.
merge[l1_, l2_] := Module[{common=Intersection[First/@l1,First/@l2]},
r1=Dispatch[Apply[Rule,l1,1];
r2=Dispatch[Apply[Rule,l2,1];
Transpose[{common,common/.r1,common/.r2}]]
For a large number of rules, Dispatch speeds up the application of the
rules. On my machine, with the following test lists:
n=10000;
l1=Union[Table[{i[Random[Integer,n]],y[Random[]]},{n}],
SameTest->(#1[[1]]===#2[[1]]&)];
l2=Union[Table[{i[Random[Integer,n]],z[Random[]]},{n}],
SameTest->(#1[[1]]===#2[[1]]&)];
merge[l1,l2] produced an answer in 2 seconds, and the time appeared to
scale linearly with the size of the data sets.
Carl Woll
Dept of Physics
U of Washington
On Fri, 5 Mar 1999 hanssen at zeiss.de wrote:
> Hi, MathGroup
>
> suppose I have two lists of lists which are structured like this:
>
> l1={{s1,x1},....,{si,xi}},
> l2={{t1,y1},....,{tk,yk}}.
>
> where the lengths of the two lists may be different. The first
> entries in the lists serve als "Tags", the second are "values".
> Suppose, no tag appears twice in l1 or twice in l2.
>
> I am looking for the list of elements {sh,xh,yh} of elements
> with the following conditions:
>
> {sh,xh} is element of l1 AND
> {sh,yh] is element of l2 (i.e. combine elements with same "tag")
>
> Without loss of generality suppose, both lists are sorted by their tags.
>
> A procedural approach to this problem would be to climb
> down the ladder starting with {i,j}={1,1}, comparing pairs l1[[i,1]] and l2[[j,1]]
> and taking found tag matches to a result list. This requires some caution,
> since tags from l1 may be missing in l2 and vice versa.
>
> My question is, if there is a better pattern matching or set
> manipulation approach to this problem, since the lists l1 and l2
> might become quite large. Timing for this is the main concern.
> I am looking for a solution which avoids any loop construct.
>
> Best regards
>
> Dipl.-Math. Adalbert Hanszen
>