MathGroup Archive: October 2005 [00871]

[Date Index] [Thread Index] [Author Index]

Re: aggregation of related elements in a list

To: mathgroup at smc.vnet.net
Subject: [mg61730] Re: aggregation of related elements in a list
From: Maxim <ab_def at prontomail.com>
Date: Thu, 27 Oct 2005 05:02:11 -0400 (EDT)
References: <djcgcs$6du$1@smc.vnet.net> <djfnjr$b1i$1@smc.vnet.net>
Sender: owner-wri-mathgroup at wolfram.com

On Sun, 23 Oct 2005 10:11:07 +0000 (UTC), Carl K. Woll  
<carlw at u.washington.edu> wrote:

> leigh pascoe wrote:
>> Dear Mathgroup,
>>
>> I would like to construct a function to transform a list of integers
>> into a list of lists of related members. Suppose we have the finite list
>>
>> {x1,x2,x3,..........xN}
>>
>> and a test function ( relatedQ[x_,y_,n_] ) that decides whether two
>> elements are related. The actual test may vary, but returns True or
>> False for any two elements as a function of n (n is an independent
>> variable not the number of elements in the list}.
>>
>> I would like to process this list to aggregate those members that are
>> related either directly to each other or to a common element. For
>> example if x1 is unrelated to any other element in the list, x2 is
>> related to x5 but not x7 and x5 is related to x7 etc. would produce the
>> list of lists
>>
>> {{x1},{x2,x5,x7),.......}
>>
>> To take a specific example, assume we have the list
>>
>> list={1,2,6,7,8,9,11,16}
>>
>> and that application of the relatedQ, test to all possible pairs yields
>> the list of related pairs
>>
>> prlist={{1,2},{1,6},{2,6},{2,7},{6,7},{6,11},{7,8},{7,11},{11,16}}
>>
>> Using the list and the list of related pairs we can deduce the desired
>> list of lists as
>>
>> {{1,2,6,7,8,11,16},{9}}
>>
>
> Here's one suboptimal method. I give steps tailored for your example. It
> shouldn't be too difficult encapsulate these steps into a nice function.
> Here are the steps. Construct a transition matrix:
>
> sp1 = SparseArray[Thread[prlist -> 1], {16, 16}];
> sp2 = SparseArray[{#, #} -> 1 & /@ list, {16, 16}];
>
> In[13]:=
> t = sp1 + sp2 + Transpose[sp1]
> Out[13]=
> SparseArray[<24>, {16, 16}]
>
> Multiply t by itself (and take it's sign) until the result doesn't  
> change:
>
> In[14]:=
> FixedPoint[Sign[#.#]&,t]
> Out[14]=
> SparseArray[<50>, {16, 16}]
>
> The different rows of the fixed point are the different sets you are
> interested in:
>
> In[15]:=
> Union[Normal[%14]]
> Out[15]=
> {{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
>
> {0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
>
> {1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1}}
>
> If you prefer a list of integer sets:
>
> In[16]:=
> Flatten[Position[#,1]]&/@%15
> Out[16]=
> {{}, {9}, {1, 2, 6, 7, 8, 11, 16}}
>
>> If 6 were not in the original list we would get instead
>>
>> {{1,2},{7,8,11,16},{9}}
>>
>
> Since your original pair list had the pair {2,7}, I think the result  
> should
> still be {{1,2,7,8,11,16},{9}}. At any rate, to use the transition matrix
> approach, you would need to delete all pairs that had elements not in  
> your
> list, and then look for the fixed point. In your example with 6 deleted  
> from
> list, you would also need to delete all 6s from prlist:
>
> newprlist = DeleteCases[prlist,{6,_}|{_,6}]
>
> and then proceed as before using newprlist instead of prlist. This method
> does extra work based on how large your final integer sets are. However,
> matrix multiplication of sparse matrices is extremely quick, so I think  
> it
> ought to be at least competitive with other approaches.
>
> Carl Woll
> Wolfram Research
>

If I understand the problem correctly, this amounts to finding connected  
components in a graph -- there was a similar question asked on MathGroup a  
while ago.

In[1]:=
<<discretemath`
agg[L_, $Lpair_] := Module[
   {Lpair = $Lpair, LLind},
   Lpair = Lpair /. Thread[L -> Range@ Length@ L];
   LLind = ConnectedComponents[
     FromUnorderedPairs[Lpair, CircularEmbedding[Length@ L]]];
   Extract[L, List /@ #]& /@ LLind
]

In[3]:=
agg[{1, 2, 6, 7, 8, 9, 11, 16},
     {{1, 2}, {1, 6}, {2, 6}, {2, 7}, {6, 7},
      {6, 11}, {7, 8}, {7, 11}, {11, 16}}]

Out[3]=
{{1, 2, 6, 7, 8, 11, 16}, {9}}

In[4]:=
agg2[L_, Lpair_] := Module[
   {n = Max@ L, sp1, sp2, t},
   sp1 = SparseArray[Thread[Lpair -> 1], {n, n}];
   sp2 = SparseArray[{#, #} -> 1& /@ L, {n, n}];
   t = sp1 + sp2 + Transpose@ sp1;
   Flatten@ Position[#, 1]& /@ Union@ Normal@
       FixedPoint[Sign[#.#]&, t, SameTest -> Equal] //
     DeleteCases[#, {}]&
]

In[5]:=
n = 500;
pairs = Array[Random[Integer, {1, n}]&, {n, 2}];
(ans = agg[Range@ n, pairs];) // Timing
(ans2 = agg2[Range@ n, pairs];) // Timing
Sort@ ans === Sort@ ans2

Out[7]= {0.062 Second, Null}

Out[8]= {2.469 Second, Null}

Out[9]= True

The matrix multiplication method slows down significantly on the last  
iterations of FixedPoint. For this choice of pairs there will typically be  
one large connected subgraph with about 400 vertices, so the sparse array  
will ultimately contain more than 400^2 non-zero elements and Dot won't be  
as fast.

A subtle point is that SameQ checks only the structural equivalence of  
SparseArray expressions. A sparse array contains information about the  
positions of the non-zero elements, but those positions can be stored out  
of order -- it can be {{2}, {1}} as well as {{1}, {2}}, and then SameQ  
will give False. Here's an example:

agg2[{1, 2, 3, 4}, {{1, 4}, {4, 3}, {3, 1}}]

If we use FixedPoint with SameTest -> Automatic, it will loop infinitely.

Maxim Rytin
m.r at inbox.ru

Follow-Ups:
- Re: Re: aggregation of related elements in a list
  - From: "Carl K. Woll" <carlw@wolfram.com>

Prev by Date: Re: MLPutRealList vs. sequence of MLPutDouble

Next by Date: spectra

Previous by thread: Re: Re: aggregation of related elements in a list

Next by thread: Re: Re: aggregation of related elements in a list