[Date Index]
[Thread Index]
[Author Index]
Re: Re: Re: Re: aggregation of related elements in a list
danl at wolfram.com wrote:
SNIP>
>If I admit Floyd-Warshall was overkill and far too slow for this problem,
>can I play again?
>
>For set of m pairs giving connections between n elements, the code below
>has complexty O(n+m*log(m)). One can get rid of the log factor as the
>sorting is not strictly needed, but in practice it is not the bottleneck
>and actually tends to improve speed.
>
>aggregate[n_, pairs_] := Module[
> {hh, aggs, kk, ll, mm, spairs = Sort[Map[Sort, pairs]], fm},
> aggs = Map[hh, Range[n]];
> Do[
> {kk, mm} = spairs[[j]];
> ll = First[hh[kk]];
> fm = First[hh[mm]];
> If[fm === mm,
> hh[mm] = hh[ll],
> If[ll < fm,
> hh[mm] = hh[ll];
> hh[First[hh[fm]]] = hh[ll],
> hh[ll] = hh[mm]; ll = fm]];
> , {j, Length[spairs]}];
> Last[Reap[Do[ll = hh[j]; Sow[j, ll], {j, n}]]]
> ]
>
>The idea is to use a "marker function" (hh) to record the current smallest
>element related to the one in question. Note that if I set hh[7] tp hh[3],
>and later learn that hh[3] should become hh[1] because element 3 is found
>to be related to element 1, then the assignment hh[3]=hh[1] suffices to
>change it everywhere hh[3] may be found. So there is no backtracing
>needed. We loop once over the pairs to create associations, and once over
>the list elements to put associated elements into buckets.
>
>n = 1000;
>pairs = Table[Random[Integer, {1, n}], {n}, {2}];
>
>In[65]:= Sort[agg0[n,pairs]]===Sort[agg2[n,pairs]]===
> Sort[aggregate[n,pairs]]
>Out[65]= True
>
>We check the speed:
>
>n = 10000;
>pairs = Table[Random[Integer, {1, n}], {n}, {2}];
>
>In[59]:= {Timing[agg0[n,pairs];],Timing[agg2[n,pairs];],
> Timing[aggregate[n,pairs];]}
>Out[59]= {{4.593 Second,Null},{1.297 Second,Null},
> {0.297 Second,Null}}
>
>Now double the size:
>
>n = 20000;
>pairs = Table[Random[Integer, {1, n}], {n}, {2}];
>
>In[62]:= {Timing[agg0[n,pairs];],Timing[agg2[n,pairs];],
> Timing[aggregate[n,pairs];]}
>Out[62]= {{28.609 Second,Null},{5.657 Second,Null},
> {0.656 Second,Null}}
>
>
>Daniel Lichtblau
>Wolfram Research
>
>
>
I very much like this solution, which follows what one might do
manually, and is also efficient. It resembles the Flood-fill algorithm,
which is O(n^2), but by some Ma magic is only linear in n. Using the
existing function ConnectedComponents has the virtue of clarity from the
point of view of reading a program. In any case thanks again to all who
responded to my question.
Surely there will be no further improvements??
LP
Prev by Date:
**pattern matching question**
Next by Date:
**Re: Exporting notebooks to xml (cnxml)**
Previous by thread:
**Re: Re: Re: aggregation of related elements in a list**
Next by thread:
**Re: Re: aggregation of related elements in a list**
| |