Re: compile speed

*To*: mathgroup at smc.vnet.net*Subject*: [mg74530] Re: compile speed*From*: "Boson" <sandro.romani at gmail.com>*Date*: Sat, 24 Mar 2007 05:21:15 -0500 (EST)*References*: <etqnm3$nf$1@smc.vnet.net><ett7ba$hll$1@smc.vnet.net>

On Mar 22, 7:22 am, "Ray Koopman" <koop... at sfu.ca> wrote: > On Mar 21, 12:42 am, "Boson" <sandro.rom... at gmail.com> wrote: > > > > > dear mathematica users, > > > i've written a simple function that works on a pair of binary > > matrices: > > (mathematica 5.2 linux, on a 32 bit platform) > > > tab[nx_, ny_, frac_] := Table[If[Random[] < frac, 1, 0], {nx}, {ny}] > > > nx = 25; ny = 50; frac1 = 0.1; frac2 = 0.5; > > p1 = 0.4; p2 = 0.2; > > tabrect = tab[nx, ny, frac1]; > > tabsq = tab[ny, ny, frac2]; > > > testnocomp[mat1_, mat2_, n1_, n2_, pp1_, pp2_] := Module[{tmp, sum, > > val}, > > tmp = mat2; Do[sum = mat1[[k,j]] + mat2[[k,i]]; > > val = Which[sum == 2, If[Random[] < pp1, 1, tmp[[i,j]]], sum == > > 1, > > If[Random[] < pp2, 0, tmp[[i,j]]], sum == 0, tmp[[i,j]]]; > > tmp[[i,j]] = val, > > {k, n1}, {i, n2}, {j, n2}]; tmp]; > > > Timing[resnc = testnocomp[tabrect, tabsq, nx, ny, p1, p2]; ] > > > the result of the timing is > > {0.7558840000000013*Second, Null} > > > since i need high values of nx,ny (~5000) and the loop scales as > > nx*ny^2, > > i tried to implement a compiled version of the previous function: > > > test := Compile[{{mat1, _Integer, 2}, {mat2, _Integer, 2}, {n1, > > _Integer}, {n2, _Integer}, > > {pp1, _Real}, {pp2, _Real}}, Module[{tmp, sum, val}, > > tmp = mat2; Do[sum = mat1[[k,j]] + mat2[[k,i]]; > > val = Which[sum == 2, If[Random[] < pp1, 1, tmp[[i,j]]], sum > > == 1, > > If[Random[] < pp2, 0, tmp[[i,j]]], sum == 0, tmp[[i,j]]]; > > tmp[[i,j]] = val, > > {k, n1}, {i, n2}, {j, n2}]; tmp], {{Random[_], _Real}}]; > > > Timing[res = test[tabrect, tabsq, nx, ny, p1, p2]; ] > > > results are a disaster: > > {14.814747999999996*Second, Null} > > > i'm sure this is related to my poor mathematica programming > > experience.. > > > could you suggest me a faster version to solve this problem? > > > regards, > > sandro > > First, whenever you Compile a function that is at all complicated > (and perhaps always), you should look at its InputForm: > > InputForm[test] > > CompiledFunction[{{_Integer, 2}, {_Integer, 2}, _Integer, > _Integer, _Real, _Real}, {{2, 2, 0}, {2, 2, 1}, {2, 0, 0}, > {2, 0, 1}, {3, 0, 0}, {3, 0, 1}, {2, 2, 2}}, > {0, 10, 3, 0, 3}, {{1, 5}, {12, 1, 2}, {9, 0, 2}, {9, 1, 3}, > {9, 1, 4}, {4, 0, 5}, {79, 5, 2, 11}, {4, 0, 6}, > {79, 6, 3, -2}, {4, 0, 7}, {79, 7, 4, -2}, > {64, 0, 0, 5, 0, 7, 0, 8}, {64, 1, 0, 5, 0, 6, 0, 9}, > {24, 8, 9, 8}, {21, Function[{mat1, mat2, n1, n2, pp1, > pp2}, Which[sum == 2, If[Random[] < pp1, 1, tmp[[i,j]]], > sum == 1, If[Random[] < pp2, 0, tmp[[i,j]]], sum == 0, > tmp[[i,j]]]], {sum, 2, 0, 8, Module}, > {i, 2, 0, 6, Block}, {tmp, 2, 2, 2, Module}, > {j, 2, 0, 7, Block}, {k, 2, 0, 5, Block}, 2, 2, 0, 2, 2, > 1, 2, 0, 0, 2, 0, 1, 3, 0, 0, 3, 0, 1, 3, 0, 2}, > {21, Function[{mat1, mat2, n1, n2, pp1, pp2}, > tmp[[i,j]] = val], {val, 3, 0, 2, Module}, > {i, 2, 0, 6, Block}, {tmp, 2, 2, 2, Module}, > {j, 2, 0, 7, Block}, {k, 2, 0, 5, Block}, 2, 2, 0, 2, 2, > 1, 2, 0, 0, 2, 0, 1, 3, 0, 0, 3, 0, 1, 6, 0, 17}, > {42, -6}, {2}}, Function[{mat1, mat2, n1, n2, pp1, pp2}, > Module[{tmp, sum, val}, tmp = mat2; > Do[sum = mat1[[k,j]] + mat2[[k,i]]; > val = Which[sum == 2, If[Random[] < pp1, 1, > tmp[[i,j]]], sum == 1, If[Random[] < pp2, 0, > tmp[[i,j]]], sum == 0, tmp[[i,j]]]; > tmp[[i,j]] = val, {k, n1}, {i, n2}, {j, n2}]; tmp]], > Evaluate] > > If there is ordinary Mathematica code in the middle of the compiled > code (other than the block at the end, which is always there), then > the compiled code will probably be slower than the uncompiled code. > > Here are three successively faster versions of the uncompiled code. > All three put the k-loop inside the i- and j-loops and give the same > result, which necessarily differs from yours because changing the > loops changes the random numbers. However, the new versions are > statistically equivalent to yours. > > tab[nx_, ny_, frac_] := Table[If[Random[] < frac, 1, 0], {nx}, {ny}] > > nx = 25; ny = 50; frac1 = 0.1; frac2 = 0.5; > p1 = 0.4; p2 = 0.2; > tabrect = tab[nx, ny, frac1]; > tabsq = tab[ny, ny, frac2]; > > testnocomp[mat1_, mat2_, n1_, n2_, pp1_, pp2_] := Module[ > {tmp, sum, val}, > tmp = mat2; Do[sum = mat1[[k,j]] + mat2[[k,i]]; > val = Which[sum == 2, If[Random[] < pp1, 1, tmp[[i,j]]], > sum == 1, If[Random[] < pp2, 0, tmp[[i,j]]], > sum == 0, tmp[[i,j]]]; > tmp[[i,j]] = val, > {k, n1}, {i, n2}, {j, n2}]; tmp]; > > Timing[resnc = testnocomp[tabrect, tabsq, nx, ny, p1, p2]; ] > {1.26 Second,Null} > > testnocomp2[mat1_, mat2_, n1_, n2_, pp1_, pp2_] := Module[{tmp, sum}, > Table[sum = mat1[[All,j]] + mat2[[Range@n1,i]]; tmp = mat2[[i,j]]; > Do[Which[sum[[k]] == 2, If[Random[] < pp1, tmp = 1], > sum[[k]] == 1, If[Random[] < pp2, tmp = 0]],{k, n1}]; > tmp, {i, n2}, {j, n2}]]; > > SeedRandom[1]; > Timing[resnc2 = testnocomp2[tabrect, tabsq, nx, ny, p1, p2]; ] > {0.67 Second,Null} > > testnocomp3[mat1_, mat2_, n1_, n2_, pp1_, pp2_] := Module[{tmp}, > Table[tmp = mat2[[i,j]]; > Scan[Which[# == 2, If[Random[] < pp1, tmp = 1], > # == 1, If[Random[] < pp2, tmp = 0]]&, > mat1[[All,j]] + mat2[[Range@n1,i]]]; > tmp, {i, n2}, {j, n2}]]; > > SeedRandom[1]; > Timing[resnc3 = testnocomp3[tabrect, tabsq, nx, ny, p1, p2]; ] > {0.57 Second,Null} > > testnocomp4[mat1_, mat2_, n1_, n2_, pp1_, pp2_] := > Table[Fold[Which[#2 == 2, If[Random[] < pp1, 1, #1], > #2 == 1, If[Random[] < pp2, 0, #1], > True, #1]&, > mat2[[i,j]], mat1[[All,j]] + mat2[[Range@n1,i]]], > {i, n2}, {j, n2}]; > > SeedRandom[1]; > Timing[resnc4 = testnocomp4[tabrect, tabsq, nx, ny, p1, p2]; ] > {0.09 Second,Null} > > resnc2 === resnc3 === resnc4 > True thanks everybody! the fastest among suggestions is only a factor of 10 slower than a compiled C version. i'd say this is not a bad result at all, but i'll keep experimenting. i'm studying the code you all posted (functional programming and #& are really powerful, but i still need to absorb them) thanks again, Sandro