Re: surprising timings for multiplication of diagonal matrix and full matrix
- To: mathgroup at smc.vnet.net
- Subject: [mg124963] Re: surprising timings for multiplication of diagonal matrix and full matrix
- From: Peter Pein <petsie at dordos.net>
- Date: Tue, 14 Feb 2012 06:40:45 -0500 (EST)
On 13.02.2012 09:45, Eric Michielssen wrote:
> Hi,
>
> I'd like to right multiply an n x n diagonal matrix (specified by a vector)
> by an n x n full matrix.
>
> I tried 3 ways:
>
> A. Brute force
>
> B. Using my own routine, in which I was hoping to avoid the (supposed) n^3
> cost of A:
>
> diagtimesmat[diag_, mat_] := MapThread[Times, {diag, mat}];
>
> C. The compiled version of B:
>
> cdiagtimesmat =
>   Compile[{{diag, _Real, 1}, {mat, _Real, 2}},
>    Module[{}, MapThread[Times, {diag, mat}]],
>    CompilationTarget -> "C"];
>
> I get weird timings. For n = 250:
>
> n = 250; nj = 100;
> asd = Table[RandomReal[], {n}, {n}];
> bsd = Table[RandomReal[], {n}];
> x = Timing[Do[ed = DiagonalMatrix[bsd].asd, {nj}]][[1]]
> y = Timing[Do[ed = diagtimesmat[bsd, asd], {nj}]][[1]]
> z = Timing[Do[ed = cdiagtimesmat[bsd, asd], {nj}]][[1]]
>
> Out[415]= 0.359
> Out[416]= 6.037
> Out[417]= 0.141
>
> My own uncompiled routine is super slow! There were warnings about arrays
> being unpacked that I did not copy.
>
> For n = 500:
>
> In[418]:= n = 500; nj = 100;
> asd = Table[RandomReal[], {n}, {n}];
> bsd = Table[RandomReal[], {n}];
> x = Timing[Do[ed = DiagonalMatrix[bsd].asd, {nj}]][[1]]
> y = Timing[Do[ed = diagtimesmat[bsd, asd], {nj}]][[1]]
> z = Timing[Do[ed = cdiagtimesmat[bsd, asd], {nj}]][[1]]
>
> Out[421]= 2.777
> Out[422]= 1.31
> Out[423]= 1.014
>
> This is more reasonable, but it remains a bit surprising that the routine
> that only touches n^2 numbers is only about twice as fast as the one that
> supposedly touches n^3. Also, the compiled routine still does not achieve
> 100 MFlops on my laptop.
>
> How can this behavior be explained? What is the fastest way of doing this?
> And how about multiplying a full matrix by a diagonal one?
>
> Thanks!
> Eric
>
> Eric Michielssen
> Radiation Laboratory
> Department of Electrical Engineering and Computer Science
> University of Michigan
> 1301 Beal Avenue
> Ann Arbor, MI 48109
> Ph: (734) 647 1793 -- Fax: (734) 647 2106

Hi Eric,

surprisingly, on my machine the uncompiled

  tabmult[diag_, mat_] := Table[diag[[i]] mat[[i]], {i, Length[diag]}];

is as fast as your compiled code for n = 250 and even faster for n = 500.
Compiling this version slows it down, and using ParallelTable instead of
Table leads to a timing disaster (with 6 cores).

CUDADot[DiagonalMatrix@bsd, asd] is also too slow, due to the overhead of
writing/reading the data to/from GPU memory (GeForce GTX 560 Ti, 1 GB).

Peter
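For reference, a minimal sketch of the row/column-scaling idiom (using Eric's
asd and bsd, but otherwise an illustration rather than code from the thread):
Times is Listable, so multiplying the length-n vector by the n x n matrix
scales the rows directly, stays on packed arrays, and only touches n^2
numbers; right multiplication by a diagonal matrix is the same trick on the
transpose.

  (* DiagonalMatrix[bsd].asd scales row i of asd by bsd[[i]];
     listable Times does this without ever forming the diagonal matrix *)
  left = bsd asd;

  (* asd.DiagonalMatrix[bsd] scales column j of asd by bsd[[j]];
     transpose, scale the rows, and transpose back *)
  right = Transpose[bsd Transpose[asd]];

  (* quick consistency check against the brute-force products,
     and confirmation that the results stay packed *)
  Max[Abs[left - DiagonalMatrix[bsd].asd]]
  Max[Abs[right - asd.DiagonalMatrix[bsd]]]
  Developer`PackedArrayQ /@ {left, right}

Both checks should return numbers at machine-precision rounding level, and
the packing test should return {True, True}; exact timings will of course
depend on the machine.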