Re: functions: compiled vs. uncompiled version
- To: mathgroup at smc.vnet.net
- Subject: [mg94145] Re: [mg94089] functions: compiled vs. uncompiled version
- From: Diego Guadagnoli <diego.guadagnoli at ph.tum.de>
- Date: Fri, 5 Dec 2008 07:04:52 -0500 (EST)
- Organization: Technische Universitaet Muenchen
- References: <200812041217.HAA27808@smc.vnet.net> <49382DA4.3070207@wolfram.com>
Dear Daniel,

thank you for your prompt and helpful answer. It seems indeed that, to improve speed, the main trick is to replace indexed symbols (such as v[j]) with vector objects.

I compared my initial VUUSc, say VUUSc0, with

- VUUSc1: here the sum1,2,3,4 variables are local to a Module, which in turn requires the use of With. However, no use of vectors for yuRos and v.
- VUUSc2: your version, i.e. like VUUSc1, but with vectors.
- the previous two cases with Block instead of Module

VUUSc2 is the fastest. In particular:

* I noticed that the use of Block instead of Module worsens speed by a factor ~ 2 (!).
* Use of vectors amounts to a gain of another factor of ~3.

This leads to VUUSc2 Timing being roughly 0.15 of the initial VUUSc0.

Finally, the difference in the way to implement the sum in func, with Table instead of a Do, is noticeable only for VUUSc2. It amounts here to an additional 18% improvement, hence not really crucial.

> Finally we fix func so that it declares VUUSc rather than VUUS as its
> complex evaluated function (took me quite a while to see this was one of
> the problems).

Btw, sorry for the typo in my func definition.

I'll rewrite the code based on the above guidelines. I'd have an additional comment though: the slow part of my code all has the structure VUUS/fun, exemplified in my initial email. I wonder whether a program like MathCode would straightforwardly transform such functions into FORTRAN subroutines.

Thanks again,
D

On Thursday 04 December 2008 20:21:08 Daniel Lichtblau wrote:
> Diego Guadagnoli wrote:
> > Hi All,
> >
> > I am performing mathematica calculations involving
> > many nested sums of the kind
> > FUN = Sum[term[i,j,k], {i,6},{j,6},{k,2} ] or similar,
> > where term[__] returns a complex number.
> >
> > Since I have many of those sums, Timing is really long.
> > Therefore I thought to implement both term[__] and FUN
> > as compiled functions.
> > I noticed however that in both
> > cases Timing is not improved, actually it is worse in the
> > compiled version.
> >
> > An example of the code is reported below as plain text.
> > There are a "Needed input" and a "Functions" part. In "Functions",
> > an example of term[__] is provided by the VUUS function, which is
> > implemented in uncompiled (VUUS[i_, j_, k_]) and compiled form (VUUSc).
> > This function is then called in the repeated sum "fun" (uncompiled) or
> > respectively "func" (compiled).
> >
> > As you can see, the Timing in func is actually worse than in fun.
> >
> > Any suggestion for improving my code without translating it in FORTRAN
> > would be very appreciated.
> >
> > Cheers,
> > D
> >
> >
> > %%%%%Please copy the content below to a mathematica notebook
> >
> > (*NEEDED INPUT*)
> >
> > BR[i_] := v[1] ZR[[1, i]] - v[2] ZR[[2, i]];
> >
> > {g1, sW, v[1], v[2], yuRos[1], yuRos[2], yuRos[3]} =
> >   Table[Random[], {i, 7}];
> >
> > AuRos = Table[RandomComplex[], {i, 3}, {j, 3}];
> >
> > ZR = Table[Random[], {i, 2}, {j, 2}];
> >
> > ZU = Table[RandomComplex[], {i, 6}, {j, 6}];
> >
> > \[Mu]Ros = 300;
> >
> >
> >
> > (*FUNCTIONS*)
> >
> > VUUS[i_, j_,
> >   k_] := -(g1^2/3)*BR[k] (KroneckerDelta[i, j] + (3 - 8 sW^2)/(4 sW^2)
> >   Sum[Conjugate[ZU[[I, i]]] ZU[[I, j]], {I, 1, 3}]) -
> >   Sum[v[2] (yuRos[I])^2
> >   ZR[[2, k]] (Conjugate[ZU[[I, i]]] ZU[[I, j]] +
> >   Conjugate[ZU[[I + 3, i]]] ZU[[I + 3, j]]), {I, 1, 3}] +
> >   Sum[1/Sqrt[2]
> >   ZR[[2, k]] (Conjugate[AuRos[[I, J]]] Conjugate[ZU[[I, i]]]
> >   ZU[[J + 3, j]] +
> >   AuRos[[I, J]] ZU[[I, j]] Conjugate[ZU[[J + 3, i]]]), {I, 1,
> >   3}, {J, 1, 3}] +
> >   Sum[1/Sqrt[2] yuRos[I]
> >   ZR[[1, k]] (Conjugate[\[Mu]Ros] ZU[[I, j]]
> >   Conjugate[ZU[[I + 3, i]]] + \[Mu]Ros Conjugate[ZU[[I, i]]]
> >   ZU[[I + 3, j]]), {I, 1, 3}];
> >
> > VUUSc = Compile[{{i, _Integer}, {j, _Integer}, {k, _Integer}},
> >   sum1 = 0. + 0. I;
> >   Do[sum1 = sum1 + Conjugate[ZU[[ii, i]]] ZU[[ii, j]], {ii, 1, 3}];
> >   sum2 = 0. + 0. I;
> >   Do[sum2 =
> >   sum2 + v[2] (yuRos[ii])^2
> >   ZR[[2, k]] (Conjugate[ZU[[ii, i]]] ZU[[ii, j]] +
> >   Conjugate[ZU[[ii + 3, i]]] ZU[[ii + 3, j]]), {ii, 1, 3}];
> >   sum3 = 0. + 0. I;
> >   Do[sum3 =
> >   sum3 + 1/Sqrt[2]
> >   ZR[[2, k]] (Conjugate[AuRos[[ii, J]]] Conjugate[ZU[[ii, i]]]
> >   ZU[[J + 3, j]] +
> >   AuRos[[ii, J]] ZU[[ii, j]] Conjugate[ZU[[J + 3, i]]]), {ii,
> >   1, 3}, {J, 1, 3}];
> >   sum4 = 0. + 0. I;
> >   Do[sum4 =
> >   sum4 + 1/Sqrt[2] yuRos[ii]
> >   ZR[[1, k]] (Conjugate[\[Mu]Ros] ZU[[ii, j]]
> >   Conjugate[ZU[[ii + 3, i]]] + \[Mu]Ros Conjugate[
> >   ZU[[ii, i]]] ZU[[ii + 3, j]]), {ii, 1, 3}];
> >   -(g1^2/3)
> >   BR[k] (KroneckerDelta[i, j] + (3 - 8 sW^2)/(4 sW^2) sum1) -
> >   sum2 + sum3 + sum4,
> >   {{BR[_], _Real}, {ZU, _Complex,
> >   6}, {v[_], _Real}, {yuRos[_], _Real}, {ZR, _Real,
> >   2}, {AuRos, _Complex, 3}, {\[Mu]Ros, _Complex}}];
> >
> > fun = Compile[{{k, _Integer}},
> >   sum1 = 0. + 0. I;
> >   Do[sum1 = sum1 + VUUS[l, m, k], {l, 1, 6}, {m, 1, 6}];
> >   -sum1, {{VUUS[__], _Complex}}
> >   ];
> >
> > func = Compile[{{k, _Integer}},
> >   sum1 = 0. + 0. I;
> >   Do[sum1 = sum1 + VUUSc[l, m, k], {l, 1, 6}, {m, 1, 6}];
> >   -sum1, {{VUUS[__], _Complex}}
> >   ];
> >
> > VUUS[1, 1, 1] // Timing
> >
> > VUUSc[1, 1, 1] // Timing
> >
> > fun[1] // Timing
> >
> > func[1] // Timing
>
> Making this fast is indeed a bit tricky. First thing to realize is if
> VUUSc[[4]] shows function evaluations, you'll have trouble.
>
> I changed slightly your definitions so that I could use vectors rather
> than indexed symbols (things like vvec[[j]] rather than v[j]). I'm not
> sure this was really necessary. Anyway, here is what I use.
>
> BR[i_] := v[1] ZR[[1, i]] - v[2] ZR[[2, i]];
> {g1, sW, v[1], v[2], yuRos[1], yuRos[2], yuRos[3]} =
>   Table[Random[], {i, 7}];
> yuRosvec = {yuRos[1], yuRos[2], yuRos[3]};
> vvec = {v[1], v[2]};
> AuRos = Table[RandomComplex[], {i, 3}, {j, 3}];
> ZR = Table[Random[], {i, 2}, {j, 2}];
> ZU = Table[RandomComplex[], {i, 6}, {j, 6}];
> \[Mu]Ros = 300;
>
> In order to get a good compiled version, we now insert the actual
> arrays into the Compile. This can be done using With, as below.
>
> VUUSc = With[{ZU = ZU, ZR = ZR, yuRosvec = yuRosvec, AuRos = AuRos,
>   g1 = g1, sW = sW, vvec = vvec, \[Mu]Ros = \[Mu]Ros},
>   Compile[{{i, _Integer}, {j, _Integer}, {k, _Integer}},
>   Module[{sum1, sum2, sum3, sum4},
>   sum1 = 0. + 0. I;
>   Do[sum1 = sum1 + Conjugate[ZU[[ii, i]]] ZU[[ii, j]], {ii, 1,
>   3}];
>   sum2 = 0. + 0. I;
>   Do[sum2 =
>   sum2 + Evaluate[
>   vvec[[2]]] (yuRosvec[[ii]])^2 ZR[[2,
>   k]] (Conjugate[ZU[[ii, i]]] ZU[[ii, j]] +
>   Conjugate[ZU[[ii + 3, i]]] ZU[[ii + 3, j]]), {ii, 1, 3}];
>   sum3 = 0. + 0. I;
>   Do[sum3 =
>   sum3 + 1/
>   Sqrt[2] ZR[[2,
>   k]] (Conjugate[AuRos[[ii, J]]] Conjugate[
>   ZU[[ii, i]]] ZU[[J + 3, j]] +
>   AuRos[[ii, J]] ZU[[ii, j]] Conjugate[ZU[[J + 3, i]]]), {ii,
>   1, 3}, {J, 1, 3}];
>   sum4 = 0. + 0. I;
>   Do[sum4 =
>   sum4 + 1/
>   Sqrt[2] yuRosvec[[ii]] ZR[[1,
>   k]] (Conjugate[\[Mu]Ros] ZU[[ii, j]] Conjugate[
>   ZU[[ii + 3, i]]] + \[Mu]Ros Conjugate[
>   ZU[[ii, i]]] ZU[[ii + 3, j]]), {ii, 1, 3}];
>   -(g1^2/3) Evaluate[
>   BR[k] ] (If[i == j, 1, 0] + (3 - 8 sW^2)/(4 sW^2) sum1) -
>   sum2 + sum3 + sum4], {{BR[_], _Real}}]];
>
> Finally we fix func so that it declares VUUSc rather than VUUS as its
> complex evaluated function (took me quite a while to see this was one of
> the problems).
>
> func = Compile[{{k, _Integer}},
>   -Total[
>   Flatten[Table[
>   VUUSc[l, m, k], {l, 1, 6}, {m, 1,
>   6}]]], {{VUUSc[__], _Complex}}];
>
> Now compare results in speed.
>
> In[223]:= fun[1] // Timing
> Out[223]= {0.013998, -4438.04 + 2.13163*10^-14 I}
>
> In[224]:= func[1] // Timing
> Out[224]= {0.002, -4438.04 + 2.13163*10^-14 I}
>
> In[229]:= Do[fun[1], {100}] // Timing
> Out[229]= {1.10383, Null}
>
> In[230]:= Do[func[1], {100}] // Timing
> Out[230]= {0.155976, Null}
>
>
> Daniel Lichtblau
> Wolfram Research

--
####################################
Diego Guadagnoli
http://users.physik.tu-muenchen.de/guadagno/
####################################
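
For readers following the thread, here is a minimal sketch of the two points in Daniel's reply: injecting the actual arrays into Compile with With, and checking that the compiled code does not fall back to the main evaluator. The names zu and partialSum are made up for illustration and do not appear in the original code. CompilePrint from the CompiledFunctionTools` package is used here instead of reading VUUSc[[4]] by hand; that assumes Mathematica 8 or later, which is newer than the version discussed in this thread.

```mathematica
(* Assumption: Mathematica 8 or later, where this package is available *)
Needs["CompiledFunctionTools`"]

(* Toy complex matrix standing in for ZU; zu is a hypothetical name *)
zu = Table[RandomComplex[], {i, 6}, {j, 6}];

(* With substitutes the numeric array for the symbol zu before Compile
   sees the body, so the compiled code holds the array as a constant.
   partialSum is a hypothetical name, modelled on the sum1 loop in VUUSc. *)
partialSum = With[{zu = zu},
   Compile[{{i, _Integer}, {j, _Integer}},
    Module[{s = 0. + 0. I},
     Do[s = s + Conjugate[zu[[ii, i]]] zu[[ii, j]], {ii, 1, 3}];
     s]]];

(* CompilePrint returns the compiled instructions as a string.
   Any occurrence of "MainEvaluate" marks a call back to the main evaluator. *)
listing = CompilePrint[partialSum];
StringFreeQ[listing, "MainEvaluate"]
```

If the last line does not give True, the listing itself shows which subexpression was left uncompiled, and that is usually where the time goes.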