
MathGroup Archive 2012

Re: Speed of Mathematica on AMD machines

  • To: mathgroup at smc.vnet.net
  • Subject: [mg126443] Re: Speed of Mathematica on AMD machines
  • From: "Oleksandr Rasputinov" <oleksandr_rasputinov at ymail.com>
  • Date: Fri, 11 May 2012 00:12:08 -0400 (EDT)
  • Delivered-to: l-mathgroup@mail-archive0.wolfram.com
  • References: <jog00v$g1d$1@smc.vnet.net>

On Thu, 10 May 2012 09:59:11 +0100, <einschlag at gmail.com> wrote:

> We have recently bought an iBuyPower gaming PC for our research group:
>
> AMD FX 8 core, 3.6 GHz, 16 GB RAM
>
> MathematicaMark8 Benchmark 0.86 is not bad, considering the price ~$800  
> of this PC but I was expecting much more.
>
> Apparently Intel's MKL library used by Mathematica is not optimized for  
> AMD processors.
>
> A test program calculating exponentials of large matrices takes 13 s on  
> the AMD PC and only 8 s on my Mac Pro (Mathematica benchmark 0.7) that  
> has 8 Intel Xeon cores at 2.4 GHz. And on my Lenovo laptop the program  
> runs 9 s. I blame it on the MKL inadequacy for AMD.
>
> TestProgram := Module[{},
>   NN = 1000;
>   AMatr = Table[RandomReal[], {i, 1, NN}, {j, 1, NN}];
>   NExec = 10;
>   For[i = 1, i < NExec, i++,
>    MatrixExp[AMatr];
>    ];
>   ]
>
> Execution by iBuyPower PC (AMD FX 8 core, Linux Ubuntu 64 bit)
>
> TestProgram // AbsoluteTiming
>
> {13.230105, Null}
>
> Execution by Mac Pro (Intel Xeon 2 x 4 core)
>
> TestProgram // AbsoluteTiming
>
> {8.126944, Null}
>
> Execution by Lenovo laptop (Intel i7-QM2060, Windows 7 64 bit)
>
> TestProgram // AbsoluteTiming
>
> {9.4275392, Null}
>
>
> On the other hand, a program compiling in C from Mathematica's help runs  
> very fast on the AMD PC:
>
> TestProgram2 := Module[{},
>   c = Compile[ {{x, _Real}, {n, _Integer}},
>     	Module[ {sum, inc}, sum = 1.0; inc = 1.0;
>      Do[inc = inc*x/i; sum = sum + inc, {i, n}]; sum],
>     CompilationTarget -> "C"];
>   c[1.6, 10000000];
>   ]
>
> Execution by iBuyPower PC (AMD FX 8 core, Linux Ubuntu 64 bit, GCC  
> compiler)
>
> TestProgram2 // AbsoluteTiming
>
> {0.114427, Null}
>
> Execution by Mac Pro (Intel Xeon 2 x 4 core, GCC compiler)
>
> TestProgram2 // AbsoluteTiming
>
> {0.212875, Null}
>
> Execution by Lenovo laptop (Intel i7-QM2060, Windows 7 64 bit, Microsoft  
> Visual C++)
>
> TestProgram2 // AbsoluteTiming
>
> {0.3540203, Null}
>
> It seems the second test program is not using MKL and thus AMD becomes  
> very efficient.
>
> I will continue testing.
>
> Is there any way to improve Mathematica's performance on AMD machines?
>
> Dmitry
>

In the past, Intel was known to engage in anticompetitive practices
against AMD, and quite rightly was subject to legal penalties for this.
(Specifically, they encouraged large computer manufacturers such as Dell
to take up exclusive supply contracts by means of large discounts and
availability guarantees.) Since that judgment there has been a lot of
general hysteria that Intel may still be discriminating against AMD
performance-wise in their library and compiler products. This culminated
in legal threats, which in turn led to the large disclaimers posted all
over Intel's products stating that they are not meant for anything other
than Intel processors.

Suspicion and disclaimers are one thing, but actual performance is  
another. As you may be aware, AMD offers their own math library, ACML.  
What most people who level this criticism of MKL are not aware of,  
however, is that MKL actually performs better than ACML, *even on AMD  
processors*. So, even if it is not optimized as thoroughly as it might be  
for AMD processors (which is more than likely the case; Intel does not  
have an infinite development budget and there is no financial incentive  
for them to go to great lengths optimizing for other manufacturers'  
processors, which have performance characteristics very different to their  
own), MKL is still better than the alternatives.

Now, how then to explain the poor performance you observe? Unfortunately,
the latest generation of AMD processors is simply not very good (the
Bulldozer processors are actually worse than the previous-generation
Phenom II processors in many applications), whereas Intel's products have
been making dramatic gains lately despite AMD's reduced competitiveness.
The end result is that a Bulldozer core is "worth" about half a Sandy
Bridge core, clock for clock, especially in floating-point workloads,
since a single FP unit is shared between two of what AMD calls cores.
(Indeed, many have said that AMD's "8 core" processors are more correctly
described as having 4 cores due to the amount of shared apparatus, but for
marketing reasons AMD is obviously not buying that argument.)

In regard to your results from TestProgram2: sorry to say, these are
invalid because the time taken to compile to C completely overwhelms the
actual runtime, and you include both in the measurement. You also use
AbsoluteTiming, which is not appropriate for single-threaded code with
short runtimes executing inside the Mathematica kernel. A more valid test
is:

c = Compile[{{x, _Real}, {n, _Integer}},
   Module[{sum, inc},
    sum = 1.0; inc = 1.0;
    Do[inc = inc*x/i; sum = sum + inc, {i, n}];
    sum],
   CompilationTarget -> "C"
  ];

Do[c[1.6, 10000000], {10}] // Timing

which on my computer (Intel Core 2, 3.2GHz) takes about 0.65 seconds, i.e.  
65 ms for a single evaluation of c[1.6, 10000000].
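
To make the distinction concrete, the compilation step and the run time
can also be measured separately. What follows is only a minimal sketch,
assuming a working C compiler is available to Mathematica; the names
compileTime, runs, runTime and perCall are introduced here for
illustration and are not part of the original test.

```mathematica
(* Time the compilation to C on its own. Wall-clock time is used here
   because an external compiler process does most of the work. *)
{compileTime, c} = AbsoluteTiming[
   Compile[{{x, _Real}, {n, _Integer}},
    Module[{sum, inc},
     sum = 1.0; inc = 1.0;
     Do[inc = inc*x/i; sum = sum + inc, {i, n}];
     sum],
    CompilationTarget -> "C"]];

(* Time only the evaluation of the already-compiled function. *)
runs = 10;  (* illustrative repeat count *)
runTime = First[Timing[Do[c[1.6, 10000000], {runs}]]];
perCall = runTime/runs;

{compileTime, runTime, perCall}
```

Comparing compileTime with perCall shows how much of a combined
measurement would be compiler overhead rather than execution of the
compiled code.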

Your matrix exponential test would also be better posed as:

NN = 1000;
mat = RandomReal[{0, 1}, {NN, NN}];
Do[MatrixExp[mat], {10}] // AbsoluteTiming

(I get 9.5 seconds.)
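
One reason to prefer Do here is that the iteration count is explicit. The
For loop in the quoted test starts at i = 1 and uses the condition
i < NExec, so it runs NExec - 1 = 9 times rather than 10, which slightly
understates the time for ten exponentials. A small sketch showing the
difference (the counters forCount and doCount are introduced here purely
for illustration):

```mathematica
NExec = 10;

(* Same loop header as in the quoted test; count the iterations. *)
forCount = 0;
For[i = 1, i < NExec, i++, forCount++];

(* Do with an explicit repeat count. *)
doCount = 0;
Do[doCount++, {NExec}];

{forCount, doCount}  (* {9, 10} *)
```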

However, I would be reluctant to draw any firm conclusion from these tests
if I were you. It is far better to look at published benchmarks for real
applications, for instance:

http://techreport.com/articles.x/21813/15

or

http://www.anandtech.com/show/4955/the-bulldozer-review-amd-fx8150-tested/7

which both show that Bulldozer performance is a very mixed bag in general.  
While there are a few applications in which it can match or only just  
outperform Intel's offerings, for the most part it falls behind them  
considerably.

Best,

O. R.


