Re: "set" data structure in Mathematica? (speeding up graph traversal function)
- To: mathgroup at smc.vnet.net
- Subject: [mg117603] Re: "set" data structure in Mathematica? (speeding up graph traversal function)
- From: DrMajorBob <btreat1 at austin.rr.com>
- Date: Thu, 24 Mar 2011 06:28:55 -0500 (EST)
I'm always hitting "Send" too fast! The last solution (my favorite) should have been: Clear[next, path, p] d = Dispatch@graph; nodes = Union@Flatten[graph /. Rule -> List]; next[i_Integer] := next[i] = ReplaceList[i, d] path[i_Integer] := path@p@i path[p[a___, last_]] := Module[{c = Complement[next@last, {a, last}]}, If[c == {}, p[a, last], path@p[a, last, #] & /@ c]] (paths = (path /@ nodes); Length@Flatten@paths) // Timing {1.60494, 47165} That could also start with nodes = Union@graph[[All, 1]]; In that case, two "singleton" paths are eliminated. Bobby On Wed, 23 Mar 2011 15:06:39 -0500, DrMajorBob <btreat1 at austin.rr.com> wrote: >> Another change was the use of Cases in place of ReplaceList, which is >> mostly responsible for the great difference in timing you get. > > No, it isn't. "next" is computed exactly ONCE for each node, so this is > all the time involved: > > Clear[next] > nodes = Union@Flatten[graph /. Rule -> List]; > next[i_Integer] := next[i] = Cases[graph, Rule[i, j_] :> j] > next /@ nodes; // Timing > > {0.001473, Null} > > Here's the same thing with ReplaceList: > > Clear[next] > nodes = Union@Flatten[graph /. Rule -> List]; > next[i_Integer] := next[i] = ReplaceList[i, graph] > next /@ nodes; // Timing > > {0.002368, Null} > > And here it is with Dispatch, in two different ways: > > Clear[next] > nodes = Union@Flatten[graph /. Rule -> List]; > next[i_Integer] := next[i] = ReplaceList[i, Dispatch@graph] > next /@ nodes; // Timing > > {0.001981, Null} > > Clear[next] > d = Dispatch@graph; > nodes = Union@Flatten[graph /. Rule -> List]; > next[i_Integer] := next[i] = ReplaceList[i, d] > next /@ nodes; // Timing > > {0.000291, Null} > > The last is a lot faster RELATIVELY, but not in absolute terms -- unless > your plan is to eliminate "next" entirely: > > Clear[path, p] > d = Dispatch@graph; > path[i_Integer] := path@p@i > path[p[a___, last_]] := > Module[{c = Complement[ReplaceList[last, d], {a, last}]}, > If[c == {}, > p[a, last], > path@p[a, last, #] & /@ c > ] > ] > (paths = (path /@ nodes); Length@Flatten@paths) // Timing > > {1.7552, 47165} > > But that's no better than this: > > Clear[next, path, p] > nodes = Union@Flatten[graph /. Rule -> List]; > next[i_Integer] := next[i] = Cases[graph, Rule[i, j_] :> j] > path[i_Integer] := path@p@i > path[p[a___, last_]] := > Module[{c = Complement[next@last, {a, last}]}, > If[c == {}, > p[a, last], > path@p[a, last, #] & /@ c > ] > ] > ((paths = Flatten[path /@ nodes] /. p -> List) // Length) // Timing > > {1.74741, 47165} > > or this: > > Clear[next, path, p] > d = Dispatch@graph; > nodes = Union@Flatten[graph /. Rule -> List]; > next[i_Integer] := next[i] = ReplaceList[i, d] > path[i_Integer] := path@p@i > path[p[a___, last_]] := > Module[{c = Complement[ReplaceList[last, d], {a, last}]}, > If[c == {}, p[a, last], path@p[a, last, #] & /@ c]] > (paths = (path /@ nodes); Length@Flatten@paths) // Timing > > {1.74914, 47165} > > I like that solution best. > > Bobby > > On Wed, 23 Mar 2011 13:58:37 -0500, Szabolcs Horv=E1t <szhorvat at gmail.com> > wrote: > >> Thank you for reading the lengthy post and replying, Bobby! >> >> Essentially what you are doing is replacing several calls to MemberQ >> by a single call to Complement. This indeed works well! >> >> Another change was the use of Cases in place of ReplaceList, which is >> mostly responsible for the great difference in timing you get. >> However, using Dispatch[graph] instead of just graph in ReplaceList >> appears to be faster than Cases. This can be combined with your idea >> to use Complement to get another speedup. >> >> Cheers, >> Szabolcs >> >> 2011/3/22 DrMajorBob <btreat1 at austin.rr.com>: >>> Here's an improvement, I think. >>> >>> For the test graph you listed below, I get >>> >>> Clear[next, path, p, listPath] >>> nodes = Union@Flatten[graph /. Rule -> List]; >>> next[i_Integer] := next[i] = Cases[graph, Rule[i, j_] :> j] >>> path[i_Integer] := path@p@i >>> path[p[a___, last_]] := >>> Module[{c = Complement[next@last, {a, last}]}, >>> If[c == {}, >>> p[a, last], >>> Flatten[path@p[a, last, #] & /@ c] >>> ] >>> ] >>> (paths = (path /@ nodes); Length@Flatten[pathways]) // Timing >>> >>> {1.73881, 405496} >>> >>> Omitting Flatten and using p -> List can get a structure like yours >>> (mostly): >>> >>> Clear[next, path, p, listPath] >>> nodes = Union@Flatten[graph /. Rule -> List]; >>> next[i_Integer] := next[i] = Cases[graph, Rule[i, j_] :> j] >>> path[i_Integer] := path@p@i >>> path[p[a___, last_]] := >>> Module[{c = Complement[next@last, {a, last}]}, >>> If[c == {}, >>> p[a, last], >>> Flatten[path@p[a, last, #] & /@ c] >>> ] >>> ] >>> ((paths = Flatten[path /@ nodes] /. p -> List) // Length) // Timing >>> >>> {1.85056, 47165} >>> >>> Compare the above with >>> >>> Clear[traverse] >>> traverse[path_][node_] := >>> With[{l = ReplaceList[node, graph], p = Append[path, node]}, >>> If[l === {}, {p}, >>> If[MemberQ[path, node], {}, Join @@ (traverse[p] /@ l)]]] >>> pathways = traverse[{}] /@ nodes; // Timing >>> >>> {8.09828, Null} >>> >>> paths == Cases[pathways, {__Integer}, Infinity] >>> >>> True >>> >>> Bobby >>> >>> On Tue, 22 Mar 2011 05:06:19 -0500, Szabolcs Horv=E1t >>> <szhorvat at gmail.com> >>> wrote: >>> >>>> Dear MathGroup, >>>> >>>> I am trying to speed up a function. >>>> >>>> I'm looking for an efficient set-like data structure, to be used as >>>> follows: >>>> >>>> I have a recursive function f, which works like this (omitting the >>>> details): >>>> >>>> f[set_][value_] := >>>> if value in set then >>>> return nothing >>>> otherwise >>>> newSet = insert value into set >>>> return f[newSet] /@ (list of computed values) >>>> >>>> In other words, the function calculates a list of values, then calls >>>> itself recursively with each of them. These values are collected >>>> into a >>>> set during the recursive calls, and one of the stopping conditions is >>>> that a value gets repeated. >>>> >>>> Note: of course there's another stopping condition as well, otherwise >>>> the function wouldn't return anything, but that is irrelevant here. >>>> >>>> Also note: each branch of the recursion will have its own collection >>>> (set) of values. >>>> >>>> The trivial solution would be just using a List of values, inserting >>>> new >>>> values with Append, and testing whether they're already in the list >>>> with >>>> MemberQ. I found that the function spends most of its time in >>>> MemberQ, >>>> so I am looking for a more efficient solution than a plain list that >>>> needs to be iterated over to find elements in it. >>>> >>>> I tried using newSet = Union[set, {value}] instead of append, and >>>> testing whether Length[newSet] === Length[set], but this turned out to >>>> be even slower. >>>> >>>> I wonder if there's a better solution. >>>> >>>> One way would be using "function definitions", i.e. "inserting" >>>> 'value' >>>> into 'set' as set[value] = True, and testing membership simply as >>>> TrueQ[set[value]], but this object can't easily be passed around in a >>>> recursive function because it is essentially a global variable. Note >>>> again that the recursion has several branches as the function is >>>> mapped >>>> over a whole list of values. >>>> >>>> Does anyone have an idea how to solve this problem? >>>> >>>> -- Szabolcs >>>> >>>> P.S. The problem I'm trying to solve is to find (actually just count) >>>> ALL acyclic paths in a directed graph, starting from a given node. >>>> Starting from a node, I descend along the connections until there's no >>>> way to go, or until I encounter a node that I have already visited. >>>> >>>> Perhaps it's better if I post the function ... >>>> >>>> traverse[graph_][path_][node_] := >>>> With[ >>>> {l = ReplaceList[node, graph], >>>> p = Append[path, node]}, >>>> If[l === {}, >>>> {p}, >>>> If[ >>>> MemberQ[path, node], >>>> {}, >>>> Join @@ (traverse[graph][p] /@ l) >>>> ] >>>> ] >>>> ] >>>> >>>> graph is a list of rules representing a simple directed graph, path is >>>> the list of already visited nodes (initially {}), node is the starting >>>> node. It's generally a good idea to pass the graph as a dispatch >>>> table >>>> (Dispatch[]), but then MemberQ[] takes even longer to evaluate than >>>> ReplaceList[]. >>>> >>>> If there are cycles in the graph, there will be a large number of >>>> paths, >>>> and the function will be slow. >>>> >>>> P.P.S >>>> >>>> Here's a small graph with some loops to test with: >>>> >>>> {1 -> 2, 3 -> 4, 5 -> 1, 5 -> 2, 6 -> 2, 6 -> 5, 7 -> 2, 7 -> 6, >>>> 8 -> 1, 8 -> 2, 8 -> 5, 8 -> 6, 9 -> 2, 9 -> 5, 9 -> 6, 9 -> 7, >>>> 10 -> 2, 10 -> 6, 10 -> 7, 11 -> 6, 11 -> 7, 12 -> 6, 12 -> 7, >>>> 13 -> 6, 13 -> 7, 14 -> 2, 14 -> 5, 14 -> 6, 14 -> 7, 15 -> 3, >>>> 15 -> 6, 15 -> 7, 15 -> 8, 16 -> 3, 16 -> 6, 16 -> 7, 16 -> 13, >>>> 17 -> 3, 17 -> 6, 17 -> 7, 17 -> 9, 17 -> 11, 17 -> 13, 18 -> 3, >>>> 18 -> 6, 18 -> 7, 18 -> 9, 18 -> 11, 18 -> 12, 18 -> 13, 18 -> 14, >>>> 18 -> 15, 18 -> 19, 18 -> 20, 19 -> 6, 19 -> 7, 19 -> 9, 19 -> 10, >>>> 19 -> 11, 19 -> 12, 19 -> 13, 19 -> 14, 19 -> 16, 19 -> 17, 19 -> 18, >>>> 19 -> 20, 20 -> 3, 20 -> 7, 20 -> 9, 20 -> 10, 20 -> 11, 20 -> 12, >>>> 20 -> 14, 20 -> 16, 20 -> 18, 20 -> 19, 21 -> 3, 21 -> 9, 21 -> 10, >>>> 21 -> 11, 21 -> 12, 21 -> 13, 21 -> 14, 21 -> 15, 21 -> 16, 21 -> 17, >>>> 21 -> 19, 21 -> 20, 22 -> 3, 22 -> 7, 22 -> 9, 22 -> 10, 22 -> 11, >>>> 22 -> 12, 22 -> 13, 22 -> 14, 22 -> 16, 22 -> 19, 22 -> 20, 23 -> 3, >>>> 23 -> 7, 23 -> 9, 23 -> 10, 23 -> 14, 23 -> 16, 23 -> 17, 23 -> 18, >>>> 23 -> 19, 23 -> 20, 24 -> 7, 24 -> 9, 24 -> 10, 24 -> 11, 24 -> 14, >>>> 24 -> 15, 24 -> 16, 24 -> 17, 24 -> 18, 24 -> 20, 25 -> 3, 25 -> 9, >>>> 25 -> 10, 25 -> 11, 25 -> 14, 25 -> 15, 25 -> 16, 25 -> 18, 25 -> 19, >>>> 25 -> 20, 26 -> 7, 26 -> 9, 26 -> 10, 26 -> 11, 26 -> 13, 26 -> 14, >>>> 26 -> 15, 26 -> 16, 26 -> 19, 26 -> 20, 27 -> 3, 27 -> 6, 27 -> 7, >>>> 27 -> 9, 27 -> 10, 27 -> 11, 27 -> 12, 27 -> 13, 27 -> 14, 27 -> 15, >>>> 27 -> 16, 27 -> 17, 27 -> 18, 27 -> 19, 27 -> 20, 27 -> 22, 27 -> 28, >>>> 28 -> 3, 28 -> 9, 28 -> 10, 28 -> 11, 28 -> 12, 28 -> 13, 28 -> 14, >>>> 28 -> 15, 28 -> 16, 28 -> 17, 28 -> 18, 28 -> 19, 28 -> 20, 28 -> 22, >>>> 28 -> 27, 28 -> 29, 29 -> 3, 29 -> 6, 29 -> 7, 29 -> 8, 29 -> 9, >>>> 29 -> 10, 29 -> 11, 29 -> 12, 29 -> 13, 29 -> 14, 29 -> 15, 29 -> 16, >>>> 29 -> 17, 29 -> 18, 29 -> 19, 29 -> 20, 29 -> 21, 29 -> 22, >>>> 29 -> 23, 29 -> 24, 29 -> 25, 29 -> 26, 29 -> 27, 29 -> 28} >>>> >>> >>> >>> -- >>> DrMajorBob at yahoo.com >>> > > -- DrMajorBob at yahoo.com