Re: "set" data structure in Mathematica? (speeding up
- To: mathgroup at smc.vnet.net
- Subject: [mg117634] Re: "set" data structure in Mathematica? (speeding up
- From: David Bevan <david.bevan at pb.com>
- Date: Tue, 29 Mar 2011 06:50:47 -0500 (EST)
Your algorithm seems to be counting the number of distinct acyclic paths from a specified node to any sink node (maximal paths below).
The following seems to be faster than any solution proposed so far.
gReachableFrom[g,s] gives a list of the nodes that can be reached along a path from s.
gCanReach[g,t] gives a list of the nodes from which there is a path to t.
ClearAll[gNodes,gSinkNodes,gNext,gPrev,gReachableFrom,gCanReach]
gNodes[g_]:=Union@Flatten[g/.Rule->List]
gSinkNodes[g_]:=Complement[gNodes[g],Union[First/@g]]
gNext[g_,s_]:=gNext[g,s]=ReplaceList[s,g]
gPrev[g_,t_]:=gPrev[g,t]=ReplaceList[t,Reverse[g,2]]
gReachableFrom[g_,s_]:=gReachableFrom[g,s]=FixedPoint[Union@@Join[{#},(gNext[g,#]&)/@#]&,{s}]
gCanReach[g_,t_]:=gCanReach[g,t]=FixedPoint[Union@@Join[{#},(gPrev[g,#]&)/@#]&,{t}]
gPaths[g,s,t,x] gives a list of the paths from s to t excluding nodes in the list x.
gPaths2 'normalises' x before calling gPaths3. This is the key to the performance.
ClearAll[gMaximalPaths,gPaths,gPaths2,gPaths3]
gMaximalPaths[g_,s_]:=Flatten[(gPaths[g,s,#,{}]&)/@gSinkNodes[g],1]
gPaths[g_,s_,t_,x_]:=gPaths[g,s,t,x]=If[MemberQ[gReachableFrom[g,s],t],gPaths2[g,s,t,x],{}]
gPaths2[g_,s_,t_,x_]:=gPaths2[g,s,t,x]=gPaths3[g,s,t,Intersection[x,gReachableFrom[g,s],gCanReach[g,t]]]
gPaths3[g_,s_,s_,x_]:=gPaths3[g,s,s,x]={{s}}
gPaths3[g_,s_,t_,x_]:=gPaths3[g,s,t,x]=(Prepend[#,s]&)/@Flatten[(gPaths[g,#,t,Prepend[x,s]]&)/@Complement[gNext[g,s],x],1]
gMaximalPaths[gtest,#]&/@gNodes[gtest]//Flatten[#,1]&//Length//Timing
{0.328,47165}
This compares with about 2 seconds for DrMajorBob's algorithms on my PC.
Counting without listing all the paths is a little faster.
ClearAll[gNumMaximalPaths,gNumPaths,gNumPaths2,gNumPaths3]
gNumMaximalPaths[g_,s_]:=Total[(gNumPaths[g,s,#,{}]&)/@gSinkNodes[g]]
gNumPaths[g_,s_,t_,x_]:=gNumPaths[g,s,t,x]=If[MemberQ[gReachableFrom[g,s],t],gNumPaths2[g,s,t,x],0]
gNumPaths2[g_,s_,t_,x_]:=gNumPaths2[g,s,t,x]=gNumPaths3[g,s,t,Intersection[x,gReachableFrom[g,s],gCanReach[g,t]]]
gNumPaths3[g_,s_,s_,x_]:=gNumPaths3[g,s,s,x]=1
gNumPaths3[g_,s_,t_,x_]:=gNumPaths3[g,s,t,x]=Total[(gNumPaths[g,#,t,Prepend[x,s]]&)/@Complement[gNext[g,s],x]]
Total[gNumMaximalPaths[gtest,#]&/@gNodes[gtest]]//Timing
{0.203,47165}
Since all the intermeidate results are saved, a second run is much faster.
David %^>