MathGroup Archive 2012

Re: surprising timings for multiplication of diagonalmatrix and full matrix

  • To: mathgroup at smc.vnet.net
  • Subject: [mg124963] Re: surprising timings for multiplication of diagonalmatrix and full matrix
  • From: Peter Pein <petsie at dordos.net>
  • Date: Tue, 14 Feb 2012 06:40:45 -0500 (EST)
  • Delivered-to: l-mathgroup@mail-archive0.wolfram.com

On 13.02.2012 09:45, Eric Michielssen wrote:
> Hi,
>
> I'd like to right multiply an n x n diagonal matrix (specified by a vector)
> by an n x n full matrix.
>
> I tried 3 ways
>
> A. Brute force
>
> B. Using my own routine in which I was hoping to avoid the (supposedly) n^3
> cost of A.
>
> diagtimesmat[diag_, mat_] := MapThread[Times, {diag, mat}];
>
> C. The compiled version of B.
>
> cdiagtimesmat =
>    Compile[{{diag, _Real, 1}, {mat, _Real, 2}},
>     Module[{}, MapThread[Times, {diag, mat}]],
>     CompilationTarget ->  "C"];
>
> I get weird timings.  For n=250:
>
> n=250;nj=100;
> asd=Table[RandomReal[],{n},{n}];
> bsd=Table[RandomReal[],{n}];
> x=Timing[Do[ed=DiagonalMatrix[bsd].asd,{nj}]][[1]]
> y=Timing[Do[ed=diagtimesmat[bsd,asd],{nj}]][[1]]
> z=Timing[Do[ed=cdiagtimesmat[bsd,asd],{nj}]][[1]]
>
> Out[415]= 0.359
> Out[416]= 6.037
> Out[417]= 0.141
>
> My own uncompiled routine is superslow!!!  There are warnings about arrays
> being unpacked that I did not copy.
>
> For n=500:
>
> In[418]:= n = 500; nj = 100;
> asd = Table[RandomReal[], {n}, {n}];
> bsd = Table[RandomReal[], {n}];
> x = Timing[Do[ed = DiagonalMatrix[bsd].asd, {nj}]][[1]]
> y = Timing[Do[ed = diagtimesmat[bsd, asd], {nj}]][[1]]
> z = Timing[Do[ed = cdiagtimesmat[bsd, asd], {nj}]][[1]]
>
> Out[421]= 2.777
> Out[422]= 1.31
> Out[423]= 1.014
>
> This is more reasonable. It remains a bit surprising that the routine that
> only touches n^2 numbers is only twice as fast as the one that supposedly
> touches n^3 ones.  Also, the compiled routine still does not achieve 100
> MFlops on my laptop.
>
> How can this behavior be explained?  What is the fastest way of doing this?
> And how about multiplying a full matrix by a diagonal one?
>
> Thanks!
> Eric
>
> Eric Michielssen
> Radiation Laboratory
> Department of Electrical Engineering and Computer Science
> University of Michigan
> 1301 Beal Avenue
> Ann Arbor, MI 48109
> Ph: (734) 647 1793 -- Fax: (734) 647 2106
>
>
>

Hi Eric,

Surprisingly, on my machine the uncompiled

tabmult[diag_, mat_] := Table[diag[[i]] mat[[i]], {i, Length[diag]}];

is as fast as your compiled code for n=250 and even faster for n=500.
Compiling it slows it down, and using ParallelTable leads to a timing
disaster (with 6 cores).
CUDADot[DiagonalMatrix@bsd, asd] is too slow because of the overhead of
copying data to and from GPU memory (GeForce GTX 560 Ti, 1 GB).

Peter
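
A minimal sketch of one more variant that may be worth timing, together with
the "full matrix times diagonal" case asked about above. It relies only on
Times being Listable, so a vector times a matrix threads over the rows. The
names rowscale and colscale are hypothetical helpers, not from the posts, and
RandomReal[1, {n, n}] is used here only as an assumption to get packed test
data. No timings are claimed.

```mathematica
(* rowscale and colscale are hypothetical names used for illustration only. *)
n = 250;
asd = RandomReal[1, {n, n}];   (* n x n test matrix *)
bsd = RandomReal[1, n];        (* vector holding the diagonal entries *)

(* DiagonalMatrix[d].m scales row i of m by d[[i]].
   Times is Listable, so d m threads d over the rows of m. *)
rowscale[diag_, mat_] := diag mat;

(* m.DiagonalMatrix[d] scales column j of m by d[[j]].
   Transposing turns columns into rows, so the same threading applies. *)
colscale[mat_, diag_] := Transpose[diag Transpose[mat]];

(* Both differences should be zero up to rounding. *)
Max[Abs[rowscale[bsd, asd] - DiagonalMatrix[bsd].asd]]
Max[Abs[colscale[asd, bsd] - asd.DiagonalMatrix[bsd]]]

(* Check whether the result stayed a packed array. *)
Developer`PackedArrayQ[rowscale[bsd, asd]]
```

Wrapping either call in Timing[Do[..., {nj}]], as in the original test, gives
numbers directly comparable with x, y and z above.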
