       Re: Re: Re: Re: aggregation of related elements in a list

```danl at wolfram.com wrote:

SNIP>

>If I admit Floyd-Warshall was overkill and far too slow for this problem,
>can I play again?
>
>For set of m pairs giving connections between n elements, the code below
>has complexty O(n+m*log(m)). One can get rid of the log factor as the
>sorting is not strictly needed, but in practice it is not the bottleneck
>and actually tends to improve speed.
>
>aggregate[n_, pairs_] := Module[
>  {hh, aggs, kk, ll, mm, spairs = Sort[Map[Sort, pairs]], fm},
>  aggs = Map[hh, Range[n]];
>  Do[
>   {kk, mm} = spairs[[j]];
>   ll = First[hh[kk]];
>   fm = First[hh[mm]];
>   If[fm === mm,
>    hh[mm] = hh[ll],
>    If[ll < fm,
>     hh[mm] = hh[ll];
>     hh[First[hh[fm]]] = hh[ll],
>     hh[ll] = hh[mm]; ll = fm]];
>   , {j, Length[spairs]}];
>  Last[Reap[Do[ll = hh[j]; Sow[j, ll], {j, n}]]]
>  ]
>
>The idea is to use a "marker function" (hh) to record the current smallest
>element related to the one in question. Note that if I set hh tp hh,
>and later learn that hh should become hh because element 3 is found
>to be related to element 1, then the assignment hh=hh suffices to
>change it everywhere hh may be found. So there is no backtracing
>needed. We loop once over the pairs to create associations, and once over
>the list elements to put associated elements into buckets.
>
>n = 1000;
>pairs = Table[Random[Integer, {1, n}], {n}, {2}];
>
>In:= Sort[agg0[n,pairs]]===Sort[agg2[n,pairs]]===
>  Sort[aggregate[n,pairs]]
>Out= True
>
>We check the speed:
>
>n = 10000;
>pairs = Table[Random[Integer, {1, n}], {n}, {2}];
>
>In:= {Timing[agg0[n,pairs];],Timing[agg2[n,pairs];],
>  Timing[aggregate[n,pairs];]}
>Out= {{4.593 Second,Null},{1.297 Second,Null},
>  {0.297 Second,Null}}
>
>Now double the size:
>
>n = 20000;
>pairs = Table[Random[Integer, {1, n}], {n}, {2}];
>
>In:= {Timing[agg0[n,pairs];],Timing[agg2[n,pairs];],
>  Timing[aggregate[n,pairs];]}
>Out= {{28.609 Second,Null},{5.657 Second,Null},
>  {0.656 Second,Null}}
>
>
>Daniel Lichtblau
>Wolfram Research
>
>
>
I very much like this solution, which follows what one might do
manually, and is also efficient. It resembles the Flood-fill algorithm,
which is O(n^2), but by some Ma magic is only linear in n. Using the
existing function ConnectedComponents has the virtue of clarity from the
point of view of reading a program. In any case thanks again to all who
responded to my question.

Surely there will be no further improvements??

LP

```

• Prev by Date: pattern matching question
• Next by Date: Re: Exporting notebooks to xml (cnxml)
• Previous by thread: Re: Re: Re: aggregation of related elements in a list
• Next by thread: Re: Re: aggregation of related elements in a list