MathGroup Archive 2011

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: "set" data structure in Mathematica? (speeding up graph traversal function)

  • To: mathgroup at smc.vnet.net
  • Subject: [mg117603] Re: "set" data structure in Mathematica? (speeding up graph traversal function)
  • From: DrMajorBob <btreat1 at austin.rr.com>
  • Date: Thu, 24 Mar 2011 06:28:55 -0500 (EST)

I'm always hitting "Send" too fast! The last solution (my favorite) should 
have been:

Clear[next, path, p]
d = Dispatch@graph;
nodes = Union@Flatten[graph /. Rule -> List];
next[i_Integer] := next[i] = ReplaceList[i, d]
path[i_Integer] := path@p@i
path[p[a___, last_]] :=
  Module[{c = Complement[next@last, {a, last}]},
   If[c == {}, p[a, last], path@p[a, last, #] & /@ c]]
(paths = (path /@ nodes); Length@Flatten@paths) // Timing

{1.60494, 47165}

That could also start with

nodes = Union@graph[[All, 1]];

In that case, two "singleton" paths are eliminated.

Bobby

On Wed, 23 Mar 2011 15:06:39 -0500, DrMajorBob <btreat1 at austin.rr.com> 
wrote:

>> Another change was the use of Cases in place of ReplaceList, which is
>> mostly responsible for the great difference in timing you get.
>
> No, it isn't. "next" is computed exactly ONCE for each node, so this is 
> all the time involved:
>
> Clear[next]
> nodes = Union@Flatten[graph /. Rule -> List];
> next[i_Integer] := next[i] = Cases[graph, Rule[i, j_] :> j]
> next /@ nodes; // Timing
>
> {0.001473, Null}
>
> Here's the same thing with ReplaceList:
>
> Clear[next]
> nodes = Union@Flatten[graph /. Rule -> List];
> next[i_Integer] := next[i] = ReplaceList[i, graph]
> next /@ nodes; // Timing
>
> {0.002368, Null}
>
> And here it is with Dispatch, in two different ways:
>
> Clear[next]
> nodes = Union@Flatten[graph /. Rule -> List];
> next[i_Integer] := next[i] = ReplaceList[i, Dispatch@graph]
> next /@ nodes; // Timing
>
> {0.001981, Null}
>
> Clear[next]
> d = Dispatch@graph;
> nodes = Union@Flatten[graph /. Rule -> List];
> next[i_Integer] := next[i] = ReplaceList[i, d]
> next /@ nodes; // Timing
>
> {0.000291, Null}
>
> The last is a lot faster RELATIVELY, but not in absolute terms -- unless 
> your plan is to eliminate "next" entirely:
>
> Clear[path, p]
> d = Dispatch@graph;
> path[i_Integer] := path@p@i
> path[p[a___, last_]] :=
>   Module[{c = Complement[ReplaceList[last, d], {a, last}]},
>    If[c == {},
>     p[a, last],
>     path@p[a, last, #] & /@ c
>     ]
>    ]
> (paths = (path /@ nodes); Length@Flatten@paths) // Timing
>
> {1.7552, 47165}
>
> But that's no better than this:
>
> Clear[next, path, p]
> nodes = Union@Flatten[graph /. Rule -> List];
> next[i_Integer] := next[i] = Cases[graph, Rule[i, j_] :> j]
> path[i_Integer] := path@p@i
> path[p[a___, last_]] :=
>   Module[{c = Complement[next@last, {a, last}]},
>    If[c == {},
>     p[a, last],
>     path@p[a, last, #] & /@ c
>     ]
>    ]
> ((paths = Flatten[path /@ nodes] /. p -> List) // Length) // Timing
>
> {1.74741, 47165}
>
> or this:
>
> Clear[next, path, p]
> d = Dispatch@graph;
> nodes = Union@Flatten[graph /. Rule -> List];
> next[i_Integer] := next[i] = ReplaceList[i, d]
> path[i_Integer] := path@p@i
> path[p[a___, last_]] :=
>   Module[{c = Complement[ReplaceList[last, d], {a, last}]},
>    If[c == {}, p[a, last], path@p[a, last, #] & /@ c]]
> (paths = (path /@ nodes); Length@Flatten@paths) // Timing
>
> {1.74914, 47165}
>
> I like that solution best.
>
> Bobby
>
> On Wed, 23 Mar 2011 13:58:37 -0500, Szabolcs Horv=E1t <szhorvat at gmail.com> 
> wrote:
>
>> Thank you for reading the lengthy post and replying, Bobby!
>>
>> Essentially what you are doing is replacing several calls to MemberQ
>> by a single call to Complement.  This indeed works well!
>>
>> Another change was the use of Cases in place of ReplaceList, which is
>> mostly responsible for the great difference in timing you get.
>> However, using Dispatch[graph] instead of just graph in ReplaceList
>> appears to be faster than Cases.  This can be combined with your idea
>> to use Complement to get another speedup.
>>
>> Cheers,
>> Szabolcs
>>
>> 2011/3/22 DrMajorBob <btreat1 at austin.rr.com>:
>>> Here's an improvement, I think.
>>>
>>> For the test graph you listed below, I get
>>>
>>> Clear[next, path, p, listPath]
>>> nodes = Union@Flatten[graph /. Rule -> List];
>>> next[i_Integer] := next[i] = Cases[graph, Rule[i, j_] :> j]
>>> path[i_Integer] := path@p@i
>>> path[p[a___, last_]] :=
>>>  Module[{c = Complement[next@last, {a, last}]},
>>>  If[c == {},
>>>   p[a, last],
>>>   Flatten[path@p[a, last, #] & /@ c]
>>>   ]
>>>  ]
>>> (paths = (path /@ nodes); Length@Flatten[pathways]) // Timing
>>>
>>> {1.73881, 405496}
>>>
>>> Omitting Flatten and using p -> List can get a structure like yours
>>> (mostly):
>>>
>>> Clear[next, path, p, listPath]
>>> nodes = Union@Flatten[graph /. Rule -> List];
>>> next[i_Integer] := next[i] = Cases[graph, Rule[i, j_] :> j]
>>> path[i_Integer] := path@p@i
>>> path[p[a___, last_]] :=
>>>  Module[{c = Complement[next@last, {a, last}]},
>>>  If[c == {},
>>>   p[a, last],
>>>   Flatten[path@p[a, last, #] & /@ c]
>>>   ]
>>>  ]
>>> ((paths = Flatten[path /@ nodes] /. p -> List) // Length) // Timing
>>>
>>> {1.85056, 47165}
>>>
>>> Compare the above with
>>>
>>> Clear[traverse]
>>> traverse[path_][node_] :=
>>>  With[{l = ReplaceList[node, graph], p = Append[path, node]},
>>>  If[l === {}, {p},
>>>   If[MemberQ[path, node], {}, Join @@ (traverse[p] /@ l)]]]
>>> pathways = traverse[{}] /@ nodes; // Timing
>>>
>>> {8.09828, Null}
>>>
>>> paths == Cases[pathways, {__Integer}, Infinity]
>>>
>>> True
>>>
>>> Bobby
>>>
>>> On Tue, 22 Mar 2011 05:06:19 -0500, Szabolcs Horv=E1t 
>>> <szhorvat at gmail.com>
>>> wrote:
>>>
>>>> Dear MathGroup,
>>>>
>>>> I am trying to speed up a function.
>>>>
>>>> I'm looking for an efficient set-like data structure, to be used as
>>>> follows:
>>>>
>>>> I have a recursive function f, which works like this (omitting the
>>>> details):
>>>>
>>>> f[set_][value_] :=
>>>>   if value in set then
>>>>     return nothing
>>>>   otherwise
>>>>     newSet = insert value into set
>>>>     return f[newSet] /@ (list of computed values)
>>>>
>>>> In other words, the function calculates a list of values, then calls
>>>> itself recursively with each of them.  These values are collected 
>>>> into a
>>>> set during the recursive calls, and one of the stopping conditions is
>>>> that a value gets repeated.
>>>>
>>>> Note: of course there's another stopping condition as well, otherwise
>>>> the function wouldn't return anything, but that is irrelevant here.
>>>>
>>>> Also note: each branch of the recursion will have its own collection
>>>> (set) of values.
>>>>
>>>> The trivial solution would be just using a List of values, inserting 
>>>> new
>>>> values with Append, and testing whether they're already in the list 
>>>> with
>>>> MemberQ.  I found that the function spends most of its time in 
>>>> MemberQ,
>>>> so I am looking for a more efficient solution than a plain list that
>>>> needs to be iterated over to find elements in it.
>>>>
>>>> I tried using newSet = Union[set, {value}] instead of append, and
>>>> testing whether Length[newSet] === Length[set], but this turned out to
>>>> be even slower.
>>>>
>>>> I wonder if there's a better solution.
>>>>
>>>> One way would be using "function definitions", i.e. "inserting" 
>>>> 'value'
>>>> into 'set' as set[value] = True, and testing membership simply as
>>>> TrueQ[set[value]], but this object can't easily be passed around in a
>>>> recursive function because it is essentially a global variable.  Note
>>>> again that the recursion has several branches as the function is 
>>>> mapped
>>>> over a whole list of values.
>>>>
>>>> Does anyone have an idea how to solve this problem?
>>>>
>>>>  -- Szabolcs
>>>>
>>>> P.S.  The problem I'm trying to solve is to find (actually just count)
>>>> ALL acyclic paths in a directed graph, starting from a given node.
>>>> Starting from a node, I descend along the connections until there's no
>>>> way to go, or until I encounter a node that I have already visited.
>>>>
>>>> Perhaps it's better if I post the function ...
>>>>
>>>> traverse[graph_][path_][node_] :=
>>>>  With[
>>>>   {l = ReplaceList[node, graph],
>>>>    p = Append[path, node]},
>>>>   If[l === {},
>>>>    {p},
>>>>    If[
>>>>     MemberQ[path, node],
>>>>     {},
>>>>     Join @@ (traverse[graph][p] /@ l)
>>>>     ]
>>>>    ]
>>>>   ]
>>>>
>>>> graph is a list of rules representing a simple directed graph, path is
>>>> the list of already visited nodes (initially {}), node is the starting
>>>> node.  It's generally a good idea to pass the graph as a dispatch 
>>>> table
>>>> (Dispatch[]), but then MemberQ[] takes even longer to evaluate than
>>>> ReplaceList[].
>>>>
>>>> If there are cycles in the graph, there will be a large number of 
>>>> paths,
>>>> and the function will be slow.
>>>>
>>>> P.P.S
>>>>
>>>> Here's a small graph with some loops to test with:
>>>>
>>>> {1 -> 2, 3 -> 4, 5 -> 1, 5 -> 2, 6 -> 2, 6 -> 5, 7 -> 2, 7 -> 6,
>>>>  8 -> 1, 8 -> 2, 8 -> 5, 8 -> 6, 9 -> 2, 9 -> 5, 9 -> 6, 9 -> 7,
>>>>  10 -> 2, 10 -> 6, 10 -> 7, 11 -> 6, 11 -> 7, 12 -> 6, 12 -> 7,
>>>>  13 -> 6, 13 -> 7, 14 -> 2, 14 -> 5, 14 -> 6, 14 -> 7, 15 -> 3,
>>>>  15 -> 6, 15 -> 7, 15 -> 8, 16 -> 3, 16 -> 6, 16 -> 7, 16 -> 13,
>>>>  17 -> 3, 17 -> 6, 17 -> 7, 17 -> 9, 17 -> 11, 17 -> 13, 18 -> 3,
>>>>  18 -> 6, 18 -> 7, 18 -> 9, 18 -> 11, 18 -> 12, 18 -> 13, 18 -> 14,
>>>>  18 -> 15, 18 -> 19, 18 -> 20, 19 -> 6, 19 -> 7, 19 -> 9, 19 -> 10,
>>>>  19 -> 11, 19 -> 12, 19 -> 13, 19 -> 14, 19 -> 16, 19 -> 17, 19 -> 18,
>>>>   19 -> 20, 20 -> 3, 20 -> 7, 20 -> 9, 20 -> 10, 20 -> 11, 20 -> 12,
>>>>  20 -> 14, 20 -> 16, 20 -> 18, 20 -> 19, 21 -> 3, 21 -> 9, 21 -> 10,
>>>>  21 -> 11, 21 -> 12, 21 -> 13, 21 -> 14, 21 -> 15, 21 -> 16, 21 -> 17,
>>>>   21 -> 19, 21 -> 20, 22 -> 3, 22 -> 7, 22 -> 9, 22 -> 10, 22 -> 11,
>>>>  22 -> 12, 22 -> 13, 22 -> 14, 22 -> 16, 22 -> 19, 22 -> 20, 23 -> 3,
>>>>  23 -> 7, 23 -> 9, 23 -> 10, 23 -> 14, 23 -> 16, 23 -> 17, 23 -> 18,
>>>>  23 -> 19, 23 -> 20, 24 -> 7, 24 -> 9, 24 -> 10, 24 -> 11, 24 -> 14,
>>>>  24 -> 15, 24 -> 16, 24 -> 17, 24 -> 18, 24 -> 20, 25 -> 3, 25 -> 9,
>>>>  25 -> 10, 25 -> 11, 25 -> 14, 25 -> 15, 25 -> 16, 25 -> 18, 25 -> 19,
>>>>   25 -> 20, 26 -> 7, 26 -> 9, 26 -> 10, 26 -> 11, 26 -> 13, 26 -> 14,
>>>>  26 -> 15, 26 -> 16, 26 -> 19, 26 -> 20, 27 -> 3, 27 -> 6, 27 -> 7,
>>>>  27 -> 9, 27 -> 10, 27 -> 11, 27 -> 12, 27 -> 13, 27 -> 14, 27 -> 15,
>>>>  27 -> 16, 27 -> 17, 27 -> 18, 27 -> 19, 27 -> 20, 27 -> 22, 27 -> 28,
>>>>   28 -> 3, 28 -> 9, 28 -> 10, 28 -> 11, 28 -> 12, 28 -> 13, 28 -> 14,
>>>>  28 -> 15, 28 -> 16, 28 -> 17, 28 -> 18, 28 -> 19, 28 -> 20, 28 -> 22,
>>>>   28 -> 27, 28 -> 29, 29 -> 3, 29 -> 6, 29 -> 7, 29 -> 8, 29 -> 9,
>>>>  29 -> 10, 29 -> 11, 29 -> 12, 29 -> 13, 29 -> 14, 29 -> 15, 29 -> 16,
>>>>   29 -> 17, 29 -> 18, 29 -> 19, 29 -> 20, 29 -> 21, 29 -> 22,
>>>>  29 -> 23, 29 -> 24, 29 -> 25, 29 -> 26, 29 -> 27, 29 -> 28}
>>>>
>>>
>>>
>>> --
>>> DrMajorBob at yahoo.com
>>>
>
>


--
DrMajorBob at yahoo.com


  • Prev by Date: Re: "set" data structure in Mathematica? (speeding up graph traversal function)
  • Next by Date: create reusable graph style?
  • Previous by thread: Re: "set" data structure in Mathematica? (speeding up graph traversal function)
  • Next by thread: Re: "set" data structure in Mathematica? (speeding up graph traversal function)