MathGroup Archive 2011

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: "set" data structure in Mathematica? (speeding up

  • To: mathgroup at smc.vnet.net
  • Subject: [mg117706] Re: "set" data structure in Mathematica? (speeding up
  • From: DrMajorBob <btreat1 at austin.rr.com>
  • Date: Wed, 30 Mar 2011 04:10:47 -0500 (EST)

That solution is remarkable, 3 times faster than my 2x improvement on Ray  
Koopman's code!

Below is another 60% speed-up, mostly from avoiding pattern matching on  
the argument "g", which doesn't change.

I also eliminated some "memoization" that wasn't saving time:

ClearAll[gNodes, gSinkNodes, gNext, gPrev, gReachableFrom, gCanReach, \
gBetween, gMaximalPaths, gPaths, gPaths2, gPaths3]
g = graph; (* you called this gtest *)
gReverse = Reverse[g, 2];
gNodes = Union@Flatten[g /. Rule -> List];
gSinkNodes = Complement[gNodes, Union[First /@ g]];

gNext[s_] := gNext[s] = ReplaceList[s, g]
gPrev[t_] := gPrev[t] = ReplaceList[t, gReverse]
gReachableFrom[s_] :=
  gReachableFrom[s] =
   FixedPoint[Union @@ Join[{#}, gNext /@ #] &, {s}]
gCanReach[t_] :=
  gCanReach[t] = FixedPoint[Union @@ Join[{#}, gPrev /@ #] &, {t}]
gCanReach[t_, x_] :=
  gCanReach[t, x] =
   FixedPoint[Complement[Union @@ Join[{#}, gPrev /@ #], x] &, {t}]
gBetween[s_, t_] :=
  gBetween[s, t] = Intersection[gReachableFrom[s], gCanReach[t, {s}]]
gNext[s_, t_] := Intersection[gNext[s], gBetween[s, t]]
gMaximalPaths[s_] := Flatten[(gPaths[s, #, {}] &) /@ gSinkNodes, 1]
gPaths[s_, t_, x_] /; MemberQ[gReachableFrom[s], t] := gPaths2[s, t, x]
gPaths[s_, t_, x_] := {}
gPaths2[s_, t_, x_] := gPaths3[s, t, Intersection[x, gBetween[s, t]]]
gPaths3[s_, s_, x_] := {{s}}
gPaths3[s_, t_, x_] :=
  gPaths3[s, t, x] = (Prepend[#, s] &) /@
    Flatten[(gPaths2[#, t, Prepend[x, s]] &) /@
      Complement[gNext[s, t], x], 1]

Length /@ (gtest = gMaximalPaths /@ gNodes) // Timing

{0.097408, {1, 1, 1, 1, 2, 3, 4, 7, 10, 8, 7, 7, 7, 10, 15, 15, 32,
   394, 355, 396, 863, 820, 1225, 891, 1211, 827, 11269, 13353, 15430}}

Your code's timing was 0.15977 at this machine.

Bobby

On Tue, 29 Mar 2011 06:50:58 -0500, David Bevan <david.bevan at pb.com> wrote:

> Here's an improvement:
>
> gCanReach[g,t,x] gives a list of the nodes from which there is a path to  
> t avoiding nodes in list x.
> gBetween[g,s,t] gives a list of the nodes that can be reached along a  
> path from s to t which does not revisit s.
> gNext[g,s,t] gives a list of the nodes one step from s towards t.
>
> By using these, we avoid searching down "blind alleys".
>
>
> ClearAll[gNodes,gSinkNodes,gNext,gPrev,gReachableFrom,gCanReach,gBetween]
>
> gNodes[g_]:=gNodes[g]=Union@Flatten[g/.Rule->List]
>
> gSinkNodes[g_]:=Complement[gNodes[g],Union[First/@g]]
>
> gNext[g_,s_]:=gNext[g,s]=ReplaceList[s,g]
>
> gPrev[g_,t_]:=gPrev[g,t]=ReplaceList[t,Reverse[g,2]]
>
> gReachableFrom[g_,s_]:=gReachableFrom[g,s]=FixedPoint[Union@@Join[{#},(gNext[g,#]&)/@#]&,{s}]
>
> gCanReach[g_,t_]:=gCanReach[g,t]=FixedPoint[Union@@Join[{#},(gPrev[g,#]&)/@#]&,{t}]
>
> gCanReach[g_,t_,x_]:=gCanReach[g,t,x]=FixedPoint[Complement[Union@@Join[{#},(gPrev[g,#]&)/@#],x]&,{t}]
>
> gBetween[g_,s_,t_]:=gBetween[g,s,t]=Intersection[gReachableFrom[g,s],gCanReach[g,t,{s}]]
>
> gNext[g_,s_,t_]:=gNext[g,s,t]=Intersection[gNext[g,s],gBetween[g,s,t]]
>
>
>
> ClearAll[gMaximalPaths,gPaths,gPaths2,gPaths3]
>
> gMaximalPaths[g_,s_]:=Flatten[(gPaths[g,s,#,{}]&)/@gSinkNodes[g],1]
>
> gPaths[g_,s_,t_,x_]:=gPaths[g,s,t,x]=If[MemberQ[gReachableFrom[g,s],t],gPaths2[g,s,t,x],{}]
>
> gPaths2[g_,s_,t_,x_]:=gPaths2[g,s,t,x]=gPaths3[g,s,t,Intersection[x,gBetween[g,s,t]]]
>
> gPaths3[g_,s_,s_,x_]:=gPaths3[g,s,s,x]={{s}}
>
> gPaths3[g_,s_,t_,x_]:=gPaths3[g,s,t,x]=(Prepend[#,s]&)/@Flatten[(gPaths2[g,#,t,Prepend[x,s]]&)/@Complement[gNext[g,s,t],x],1]
>
>
>
> gMaximalPaths[gtest,#]&/@gNodes[gtest]//Flatten[#,1]&//Length//Timing
>
> {0.203,47165}
>
>
>
> ClearAll[gNumMaximalPaths,gNumPaths,gNumPaths2,gNumPaths3]
>
> gNumMaximalPaths[g_,s_]:=Total[(gNumPaths[g,s,#,{}]&)/@gSinkNodes[g]]
>
> gNumPaths[g_,s_,t_,x_]:=gNumPaths[g,s,t,x]=If[MemberQ[gReachableFrom[g,s],t],gNumPaths2[g,s,t,x],0]
>
> gNumPaths2[g_,s_,t_,x_]:=gNumPaths2[g,s,t,x]=gNumPaths3[g,s,t,Intersection[x,gBetween[g,s,t]]]
>
> gNumPaths3[g_,s_,s_,x_]:=gNumPaths3[g,s,s,x]=1
>
> gNumPaths3[g_,s_,t_,x_]:=gNumPaths3[g,s,t,x]=Total[(gNumPaths2[g,#,t,Prepend[x,s]]&)/@Complement[gNext[g,s,t],x]]
>
>
>
> Total[gNumMaximalPaths[gtest,#]&/@gNodes[gtest]]//Timing
>
> {0.093,47165}
>
>
> This is now about 100 times faster than your original algorithm for the  
> sample data you provided.
>
> David %^>
>
>
> ________________________________
> From: David Bevan
> Sent: 25 March 2011 21:35
> To: mathgroup at smc.vnet.net; szhorvat at gmail.com
> Subject: [mg117635] RE: [mg117576] Re: "set" data structure in  
> Mathematica? (speeding up
>
>
>
>
> Your algorithm seems to be counting the number of distinct acyclic paths  
> from a specified node to any sink node (maximal paths below).
>
>
>
> The following seems to be faster than any solution proposed so far.
>
>
>
> gReachableFrom[g,s] gives a list of the nodes that can be reached along  
> a path from s.
>
> gCanReach[g,t] gives a list of the nodes from which there is a path to t.
>
>
>
> ClearAll[gNodes,gSinkNodes,gNext,gPrev,gReachableFrom,gCanReach]
>
> gNodes[g_]:=Union@Flatten[g/.Rule->List]
>
> gSinkNodes[g_]:=Complement[gNodes[g],Union[First/@g]]
>
> gNext[g_,s_]:=gNext[g,s]=ReplaceList[s,g]
>
> gPrev[g_,t_]:=gPrev[g,t]=ReplaceList[t,Reverse[g,2]]
>
> gReachableFrom[g_,s_]:=gReachableFrom[g,s]=FixedPoint[Union@@Join[{#},(gNext[g,#]&)/@#]&,{s}]
>
> gCanReach[g_,t_]:=gCanReach[g,t]=FixedPoint[Union@@Join[{#},(gPrev[g,#]&)/@#]&,{t}]
>
>
>
> gPaths[g,s,t,x] gives a list of the paths from s to t excluding nodes in  
> the list x.
>
> gPaths2 'normalises' x before calling gPaths3. This is the key to the  
> performance.
>
>
>
> ClearAll[gMaximalPaths,gPaths,gPaths2,gPaths3]
>
> gMaximalPaths[g_,s_]:=Flatten[(gPaths[g,s,#,{}]&)/@gSinkNodes[g],1]
>
> gPaths[g_,s_,t_,x_]:=gPaths[g,s,t,x]=If[MemberQ[gReachableFrom[g,s],t],gPaths2[g,s,t,x],{}]
>
> gPaths2[g_,s_,t_,x_]:=gPaths2[g,s,t,x]=gPaths3[g,s,t,Intersection[x,gReachableFrom[g,s],gCanReach[g,t]]]
>
> gPaths3[g_,s_,s_,x_]:=gPaths3[g,s,s,x]={{s}}
>
> gPaths3[g_,s_,t_,x_]:=gPaths3[g,s,t,x]=(Prepend[#,s]&)/@Flatten[(gPaths[g,#,t,Prepend[x,s]]&)/@Complement[gNext[g,s],x],1]
>
>
>
> gMaximalPaths[gtest,#]&/@gNodes[gtest]//Flatten[#,1]&//Length//Timing
>
> {0.328,47165}
>
>
>
> This compares with about 2 seconds for DrMajorBob's algorithms on my PC.
>
>
>
> Counting without listing all the paths is a little faster.
>
>
>
> ClearAll[gNumMaximalPaths,gNumPaths,gNumPaths2,gNumPaths3]
>
> gNumMaximalPaths[g_,s_]:=Total[(gNumPaths[g,s,#,{}]&)/@gSinkNodes[g]]
>
> gNumPaths[g_,s_,t_,x_]:=gNumPaths[g,s,t,x]=If[MemberQ[gReachableFrom[g,s],t],gNumPaths2[g,s,t,x],0]
>
> gNumPaths2[g_,s_,t_,x_]:=gNumPaths2[g,s,t,x]=gNumPaths3[g,s,t,Intersection[x,gReachableFrom[g,s],gCanReach[g,t]]]
>
> gNumPaths3[g_,s_,s_,x_]:=gNumPaths3[g,s,s,x]=1
>
> gNumPaths3[g_,s_,t_,x_]:=gNumPaths3[g,s,t,x]=Total[(gNumPaths[g,#,t,Prepend[x,s]]&)/@Complement[gNext[g,s],x]]
>
>
>
> Total[gNumMaximalPaths[gtest,#]&/@gNodes[gtest]]//Timing
>
> {0.203,47165}
>
>
>
> Since all the intermeidate results are saved, a second run is much  
> faster.
>
>
>
> David %^>


-- 
DrMajorBob at yahoo.com


  • Prev by Date: Re: Why Mathematica does not issue a warning when the calculations
  • Next by Date: Re: Importing into Mathematica from URL (PubMed)
  • Previous by thread: Re: "set" data structure in Mathematica? (speeding up
  • Next by thread: Problem with DateListPlot Aspect Ratio