MathGroup Archive 2011

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: "set" data structure in Mathematica? (speeding up

  • To: mathgroup at smc.vnet.net
  • Subject: [mg117573] Re: "set" data structure in Mathematica? (speeding up
  • From: Daniel Lichtblau <danl at wolfram.com>
  • Date: Wed, 23 Mar 2011 02:54:03 -0500 (EST)

Szabolcs Horv=E1t wrote:
> Dear MathGroup,
>
> I am trying to speed up a function.
>
> I'm looking for an efficient set-like data structure, to be used as follows:
>
> I have a recursive function f, which works like this (omitting the details):
>
> f[set_][value_] :=
>    if value in set then
>      return nothing
>    otherwise
>      newSet = insert value into set
>      return f[newSet] /@ (list of computed values)
>
> In other words, the function calculates a list of values, then calls
> itself recursively with each of them.  These values are collected into a
> set during the recursive calls, and one of the stopping conditions is
> that a value gets repeated.

This part is unclear to me. Where do the "list of computed values" come
from?

Anyway, the approach below might be reasonable.

membership[{elems__}] := Do[
    isMemberOf[{elems}][elems[[j]]] = True,
    {j,Length[elems]}]

f[{elems___}[val_] :=
   If[isMemberOf[{elems}][val],
     Null,
     With[{new={elems,val}},
       membership[new];
       Map[f[new],(list of computed values)]]
       ]

Something else to test might be whether MatchQ[val,Alternatives@@list]
might be faster than MemberQ[list,val]. Yet another idea would be to
:nest" the elements. That way the list rebuilding in that With will be
faster, and you can recode membership to walk down a nested list instead
of a flat one.

I've not given thought to the underlying problem of finding all acyclic
paths. But offhand I'd suspect some form of node caching, as you
suggest, would make sense.

Daniel Lichtblau
Wolfram Research



> Note: of course there's another stopping condition as well, otherwise
> the function wouldn't return anything, but that is irrelevant here.
>
> Also note: each branch of the recursion will have its own collection
> (set) of values.
>
> The trivial solution would be just using a List of values, inserting new
> values with Append, and testing whether they're already in the list with
> MemberQ.  I found that the function spends most of its time in MemberQ,
> so I am looking for a more efficient solution than a plain list that
> needs to be iterated over to find elements in it.
>
> I tried using newSet = Union[set, {value}] instead of append, and
> testing whether Length[newSet] === Length[set], but this turned out to
> be even slower.
>
> I wonder if there's a better solution.
>
> One way would be using "function definitions", i.e. "inserting" 'value'
> into 'set' as set[value] = True, and testing membership simply as
> TrueQ[set[value]], but this object can't easily be passed around in a
> recursive function because it is essentially a global variable.  Note
> again that the recursion has several branches as the function is mapped
> over a whole list of values.
>
> Does anyone have an idea how to solve this problem?
>
>   -- Szabolcs
>
> P.S.  The problem I'm trying to solve is to find (actually just count)
> ALL acyclic paths in a directed graph, starting from a given node.
> Starting from a node, I descend along the connections until there's no
> way to go, or until I encounter a node that I have already visited.
>
> Perhaps it's better if I post the function ...
>
> traverse[graph_][path_][node_] :=
>   With[
>    {l = ReplaceList[node, graph],
>     p = Append[path, node]},
>    If[l === {},
>     {p},
>     If[
>      MemberQ[path, node],
>      {},
>      Join @@ (traverse[graph][p] /@ l)
>      ]
>     ]
>    ]
>
> graph is a list of rules representing a simple directed graph, path is
> the list of already visited nodes (initially {}), node is the starting
> node.  It's generally a good idea to pass the graph as a dispatch table
> (Dispatch[]), but then MemberQ[] takes even longer to evaluate than
> ReplaceList[].
>
> If there are cycles in the graph, there will be a large number of paths,
> and the function will be slow.
>
> P.P.S
>
> Here's a small graph with some loops to test with:
>
> {1 -> 2, 3 -> 4, 5 -> 1, 5 -> 2, 6 -> 2, 6 -> 5, 7 -> 2, 7 -> 6,
>   8 -> 1, 8 -> 2, 8 -> 5, 8 -> 6, 9 -> 2, 9 -> 5, 9 -> 6, 9 -> 7,
>   10 -> 2, 10 -> 6, 10 -> 7, 11 -> 6, 11 -> 7, 12 -> 6, 12 -> 7,
>   13 -> 6, 13 -> 7, 14 -> 2, 14 -> 5, 14 -> 6, 14 -> 7, 15 -> 3,
>   15 -> 6, 15 -> 7, 15 -> 8, 16 -> 3, 16 -> 6, 16 -> 7, 16 -> 13,
>   17 -> 3, 17 -> 6, 17 -> 7, 17 -> 9, 17 -> 11, 17 -> 13, 18 -> 3,
>   18 -> 6, 18 -> 7, 18 -> 9, 18 -> 11, 18 -> 12, 18 -> 13, 18 -> 14,
>   18 -> 15, 18 -> 19, 18 -> 20, 19 -> 6, 19 -> 7, 19 -> 9, 19 -> 10,
>   19 -> 11, 19 -> 12, 19 -> 13, 19 -> 14, 19 -> 16, 19 -> 17, 19 -> 18,
>    19 -> 20, 20 -> 3, 20 -> 7, 20 -> 9, 20 -> 10, 20 -> 11, 20 -> 12,
>   20 -> 14, 20 -> 16, 20 -> 18, 20 -> 19, 21 -> 3, 21 -> 9, 21 -> 10,
>   21 -> 11, 21 -> 12, 21 -> 13, 21 -> 14, 21 -> 15, 21 -> 16, 21 -> 17,
>    21 -> 19, 21 -> 20, 22 -> 3, 22 -> 7, 22 -> 9, 22 -> 10, 22 -> 11,
>   22 -> 12, 22 -> 13, 22 -> 14, 22 -> 16, 22 -> 19, 22 -> 20, 23 -> 3,
>   23 -> 7, 23 -> 9, 23 -> 10, 23 -> 14, 23 -> 16, 23 -> 17, 23 -> 18,
>   23 -> 19, 23 -> 20, 24 -> 7, 24 -> 9, 24 -> 10, 24 -> 11, 24 -> 14,
>   24 -> 15, 24 -> 16, 24 -> 17, 24 -> 18, 24 -> 20, 25 -> 3, 25 -> 9,
>   25 -> 10, 25 -> 11, 25 -> 14, 25 -> 15, 25 -> 16, 25 -> 18, 25 -> 19,
>    25 -> 20, 26 -> 7, 26 -> 9, 26 -> 10, 26 -> 11, 26 -> 13, 26 -> 14,
>   26 -> 15, 26 -> 16, 26 -> 19, 26 -> 20, 27 -> 3, 27 -> 6, 27 -> 7,
>   27 -> 9, 27 -> 10, 27 -> 11, 27 -> 12, 27 -> 13, 27 -> 14, 27 -> 15,
>   27 -> 16, 27 -> 17, 27 -> 18, 27 -> 19, 27 -> 20, 27 -> 22, 27 -> 28,
>    28 -> 3, 28 -> 9, 28 -> 10, 28 -> 11, 28 -> 12, 28 -> 13, 28 -> 14,
>   28 -> 15, 28 -> 16, 28 -> 17, 28 -> 18, 28 -> 19, 28 -> 20, 28 -> 22,
>    28 -> 27, 28 -> 29, 29 -> 3, 29 -> 6, 29 -> 7, 29 -> 8, 29 -> 9,
>   29 -> 10, 29 -> 11, 29 -> 12, 29 -> 13, 29 -> 14, 29 -> 15, 29 -> 16,
>    29 -> 17, 29 -> 18, 29 -> 19, 29 -> 20, 29 -> 21, 29 -> 22,
>   29 -> 23, 29 -> 24, 29 -> 25, 29 -> 26, 29 -> 27, 29 -> 28}
>



  • Prev by Date: Re: Using Nearest on a group of points
  • Next by Date: Problem with DateListPlot Aspect Ratio
  • Previous by thread: Need help to solve FFT and IFFT by mathematica
  • Next by thread: Re: "set" data structure in Mathematica? (speeding up