Re: corrected RE: Re: Computing sets of equivalences

*To*: mathgroup at smc.vnet.net*Subject*: [mg46482] Re: corrected RE: [mg46459] Re: Computing sets of equivalences*From*: "Carl K. Woll" <carl at woll2woll.com>*Date*: Fri, 20 Feb 2004 00:29:26 -0500 (EST)*References*: <8EB1C3634596D6118952006008F711CD02B60144@dassne01.darmstadt.dsh.de>*Sender*: owner-wri-mathgroup at wolfram.com

Hartmut and other interested readers, > Carl, > > I managed to repair my proposal. The problem was to unify classes if they > are linked later. I do that now "repeating the trick": > > classesFrom3[equis_] := Module[{nex, ext, class}, > nex[_] := True; > ext[_] := {}; > If[nex[#1], > If[nex[#2], nex[#1] = nex[#2] = class[Min[#1, #2]], , > nex[#1] = nex[#2]], , > If[nex[#2], nex[#2] = nex[#1], , > Evaluate[nex[Max[#1, #2]]] = nex[Min[#1, #2]]]] & @@@ equis; > nex[_] =.; > class[i_] = i; > Scan[(ext[nex[#]] = {ext[nex[#]], #}) &, > DownValues[nex][[All, 1, 1, 1]]]; > ext[_] =.; > Flatten /@ DownValues[ext][[All, 2]] > ] > > Of course that costs me something, the advantage compared to yours now has > now diminshed to the cubic root of 2. > Our algorithms are very similar of course, so it's not surprising that they both scale as O(n). Upon further investigation, it seems your method of handling the pointers from the members to the correct class (nex) was faster than mine. However, I believe that processing the equivalence sets on the go, rather than doing a scan as you do after the pointers are created, is a bit faster. The net effect is that your algorithm was a bit faster. > > > Here is also a method to produce test data: > > In[337]:= > cycs = Union[Prepend[Table[Random[Integer, {1, 30}], {5}], 0]] > > Out[337]= {0, 13, 20, 24, 25, 30} > > In[341]:= > rngs = DeleteCases[{1, 0} + # & /@ Partition[cycs, 2, 1], {x_, x_}] > Out[341]= {{1, 13}, {14, 20}, {21, 24}, {26, 30}} > > In[342]:= > prob = Join @@ (Join[Partition[Range[#1, #2], 2, 1], > Block[{rnd}, > rnd := Random[Integer, {#1, #2}]; {Take[ > NestWhileList[rnd &, rnd, (#1 === #2 &), 2], -2]}]] & > @@@ > rngs); > Note that your method of producing data creates duplicate equivalences. > In[343]:= eqs = prob[[RandomPermutation[Length[prob]]]]; > > In[344]:= classesFrom3[eqs] > Out[344]= > {{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13}, > {14, 15, 16, 17, 18, 19, 20}, > {21, 22, 23, 24}, {26, 27, 28, 29, 30}} > > In[345]:= Range @@@ rngs > Out[345]= > {{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13}, > {14, 15, 16, 17, 18, 19, 20}, > {21, 22, 23, 24}, {26, 27, 28, 29, 30}} > > > > Your program gets recursion problems with that data, perhaps you may arrage > this with some preordering (but you'll loose O[n] then). > As I stated in my original post, my algorithm did not handle duplicate equivalences, which are contained in your test data. However, as I also stated, it is simple to fix my algorithm to be able to handle this case, with an imperceptible difference in timing. > > Both programs from Mariusz give wrong answers. Also that from Jens-Peer. The > program from Ingolf Dahl gives the same answer as mine, in a different > ordering. But of course it is not competitive with the performance. It is > however a nice example, to produce a (possible slow but) provable correct > solution first! > I believe that Mariusz' second program gives correct answers as long as the Union of all the integers yields a set consisting of all integers from 1 to the highest number. Your test data is missing 25. Perhaps Mariusz never intended to find equivalences where some integers may be missing, and so he would be perfectly satisfied with his approach. At any rate, inspired by your challenge, I've come up with a much quicker algorithm. addequiv[a_, a_] := 1 addequiv[ptr[a_], ptr[b_]] := (ptr[a] = ptr[b] = class[a]; equivset[class[a]] = {a, b}) addequiv[ptr[a_], b_class] := (ptr[a] = b; equivset[b] = {equivset[b], a}) addequiv[a_class, ptr[b_]] := (ptr[b] = a; equivset[a] = {equivset[a], b}) addequiv[a_class, b_class] := (equivset[a] = {equivset[a], equivset[b]}; equivset[b] =.; b = a) getequivs[eq_] := Block[{ptr, class, equivset}, Apply[addequiv, Map[ptr, eq, {2}], {1}]; Flatten /@ DownValues[equivset][[All, 2]]] The above approach handles the pointer processing much more quickly because it uses Mathematica's built in pattern matcher to do the If statements. I believe that creating the equivalence sets on the go is also quicker. One could modify this algorithm to process the DownValues of ptr as you did the DownValues of nex, but I believe that would be slower. The net effect is that the above approach seems to be 40% (or more) faster than yours. Note that the addequiv[a_,a_]:=1 definition allows the program to skip duplicate equivalences. It's been a while since there's been an interesting (at least to me) algorithmic challenge like this for us to wrestle with. Carl > > Hartmut > > > > > > >-----Original Message----- > >From: Wolf, Hartmut To: mathgroup at smc.vnet.net > >Sent: Thursday, February 19, 2004 4:24 PM > >To: 'Carl K. Woll' > >Cc: 'Mariusz Jankowski' > >Subject: [mg46482] Re: RE: [mg46459] Re: Computing sets of equivalences > > > > > >Yes, Carl, > > > >just saw it, re-reading my explanations, it's wrong! As are > >both solution from Marisz, not yours. > > > >Sorry, Hartmut > > > > > >>-----Original Message----- > >>From: Wolf, Hartmut To: mathgroup at smc.vnet.net > >>Sent: Thursday, February 19, 2004 4:07 PM > >>To: 'Carl K. Woll'; mathgroup at smc.vnet.net > >>Cc: 'Mariusz Jankowski' > >>Subject: [mg46482] RE: [mg46459] Re: Computing sets of equivalences > >> > >> > >>Dear Carl, > >> > >>miserably I missed the challenge yesterday. I tried today > >>without looking at the solutions, and came up with something > >>which appears to be similiar to yours (no wonder I learnt that > >>from you!). > >> > >>It is robust against ordering and repetitions (and is faster > >>than yours by a square root of 2). In fact it is 2 solutions, > >>the first one is this > >> > >> > >>classesFrom[equis_] := > >> Module[{objs = Union[Flatten[equis]], nex, ext, classes}, > >>nex[_] := True; > >> ext[_] := {}; > >> classes = > >> Union[If[nex[#1], > >> If[nex[#2], nex[#1] = nex[#2] = #1, , nex[#1] = > >>nex[#2]], , > >> If[nex[#2], nex[#2] = nex[#1], , nex[#1]]] & @@@ equis]; > >> Scan[(ext[nex[#]] = {ext[nex[#]], #}) &, objs]; > >> Flatten[ext[#]] & /@ classes > >> ] > >> > >>and derived from that, as obj and classes can be replaced by > >>DownValues, this one: > >> > >>classesFrom2[equis_] := Module[{nex, ext}, > >> nex[_] := True; > >> ext[_] := {}; > >> If[nex[#1], If[nex[#2], nex[#1] = nex[#2] = #1, , nex[#1] > >>= nex[#2]], , > >> If[nex[#2], nex[#2] = nex[#1], ,]] & @@@ equis; > >> nex[_] =.; > >> Scan[(ext[nex[#]] = {ext[nex[#]], #}) &, > >> DownValues[nex][[All, 1, 1, 1]]]; > >> ext[_] =.; > >> Flatten /@ DownValues[ext][[All, 2]] > >> ] > >> > >>This avoids Union, but is only marginally faster, such O[n] vs > >>O[n log n] seems not to be a practical issue here. > >> > >>The algorithmic idea is quite simple: if for an associated > >>pair both entries have not been encountered, a new equivalence > >>class is created (which gets its name from the first object) > >>and both objects are linked to that class by definitions for > >>nex[obj]. If only one object has not been encountered, then > >>it is linked to the class the other obj refers to. If both > >>have been encountered we simply pass on. > >> > >>Done that, we collect the classes passing through all objects > >>(represented by the DownValues of nex) and finally display the > >>classes (represented by the DownValues of ext) forgetting their names. > >> > >>yours, > >>Hartmut > >> > >>--- > >>P.S. I didn't look into the books, but this seems to be a > >>classical problem; perhaps it's somewhere in the AddOns. > >> > >> > >> > >[...] > > > >