MathGroup Archive: November 2010 [00517]

Re: Mathematica 8
To: mathgroup at smc.vnet.net
Subject: [mg114024] Re: Mathematica 8
From: Syd Geraghty <sydgeraghty at me.com>
Date: Sat, 20 Nov 2010 18:26:55 -0500 (EST)
Hi Bill,

On Nov 20, 2010, at 3:11 AM, Bill Rowe wrote:

> On 11/19/10 at 5:07 AM, sydgeraghty at me.com (Syd Geraghty) wrote:
>
>> So let's look for some data to support the buying decision! Using
>> MathematicaBenchmark8 with my current machine:
>
>> Machine Name:          sydsmacbookpro
>> System:                MacOS X V 10.6.5 Snow Leopard (64-bit)
>> Date:                  November 15, 2010
>> Mathematica Version:   8.0.0
>> Benchmark Result:      0.41
>
> I see a bit better with my MacBook Pro, an overall score of 0.60.

Do you have 4 GB RAM? I only have 2 GB.
>
>> The interesting fact is that if I set up Mathematica 8 to use both of my
>> MacBook's cores I get a 61% MathematicaBenchmark8 improvement (an
>> impressive result).
>
> Could you elaborate a bit on this? I thought to make use of more
> than one core I needed to use commands like Parallelize. Are you
> modifying the benchmarking code?

No ... I just selected the option to launch parallel kernels at startup in the Mathematica 8 Preferences.

Then I just ran MathematicaMark8 with no code modifications to see what happened.

One of my points is that I am sure WRI could produce some really great performance improvement numbers for Mathematica 8 if it optimized the hardware and software configuration and compared the result with an equally optimized Mathematica 7 setup.


>
>> So the disappointing first comparison from my current system (best
>> result of 0.66 using Mathematica 8 versus the best MathematicaBenchmark8
>> result of 1.0) apparently limits my available upgrade in performance
>> to 52%.
>
> Why do you conclude this? I assume you are making this estimate
> from the benchmark data included with the current benchmarking
> code. If so, it seems to me that is merely the best Wolfram
> happened to test. Not necessarily the best that can be done with
> Apple hardware.

Yes, I agree completely. But my point below is that this is the best information on comparative performance I can get from MathematicaMark8 (or online) without embarking on my own benchmarking programme.
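As a rough check of the arithmetic behind the 52% figure, here is a minimal Python sketch. The variable names are illustrative only, and it assumes the 0.66 score is simply the 0.41 baseline with the 61% gain applied, which the thread implies but does not state outright.

```python
# Scores quoted in this thread (benchmark results relative to the reference system).
single_kernel = 0.41   # baseline result reported for the MacBook Pro
parallel_gain = 0.61   # improvement reported with parallel kernels launched at startup
best_listed = 1.00     # best result in the comparison data, as quoted above

# Assumption: the 0.66 "best result" is the baseline scaled by the 61% gain.
with_parallel = single_kernel * (1 + parallel_gain)

# Headroom relative to the rounded score of 0.66, as used in the post.
headroom = best_listed / round(with_parallel, 2) - 1

print(f"score with parallel kernels: {with_parallel:.2f}")
print(f"headroom to the best listed score: {headroom:.0%}")
```

Under these assumptions the figures are consistent with each other: the headroom is measured against the best listed result only, not against the best achievable hardware, which is the limitation Bill points out.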

I believe it should be a priority for WRI to do a more complete technical marketing job, particularly for Mathematica 8, which has some significant system performance enhancements. The GPU support, multi-core and multi-threading support, and the ability to output C code are all potentially radical performance improvements. But how to realize all this potential is the interesting question for me.

Now there is a white paper covering CUDA support available at:

 http://www.wolfram.com/products/mathematica-cuda-free-white-paper.en.html

However, this is only a small part of the story of how a user can optimize Mathematica 8 performance with today's supported hardware.

Also I believe it would be very helpful for WRI to publish a technical roadmap for the future of Mathematica and its hardware integration to guide the user community.

Up till now the roadmap has apparently been in the heads of a few very smart people at WRI. So far that has produced astonishing results. But there is clearly a lot WRI could do to address hardware integration, performance, and system coherency issues.

>> I hope someone at WRI will recognize the importance of totally
>> upgrading benchmarking to take into account support for GPUs (and
>> address the CUDA vs OpenCL issues) and parallelism (multi-core,
>> multi-thread) support. Without a serious benchmark upgrade I fear
>> the general discussion will not lead to actionable information.
>
> I think the issue isn't just the benchmark. Having benchmark
> code that addresses CUDA etc, doesn't really help you get more
> performance without understanding how to configure things
> optimally. If parallel computing happens transparently to the
> user then there isn't an issue. But this doesn't seem to be the
> case now.

I totally agree:

This reply posted to a recent WRI blog entry ( http://blog.wolfram.com/2010/09/20/tapping-into-the-power-of-gpu-in-mathematica/ ) expresses some of the issues well:

	"So Jan and Lubos hint at the issue that is of more interest to most of us. It's nice to have some sort of support for GPUs in Mathematica, but this is somewhat specialized, and what most of us care about much more is much better utilization of the parallelism in existing machines for something other than dense machine precision linear algebra and FFTs.

Which raises the real question. Obviously, at a 1000 mile level, Mathematica provides a perfect environment for programming GPUs: express what you want via Table or Map or some similar such primitive, and have the system run each sub-calculation in parallel on one of the GPU's execution engines.
Of course it's not quite that trivial, because there is the problem of exactly what each of those sub-calculations looks like, and (right now) they usually boil down, eventually, to pattern matching inside the Mathematica Kernel, which has not (so far?) been parallelized.

So the real issue of interest to us is: to get this to work, did you simply slap on a band-aid (somewhat like the ParallelMap/ParallelTable/etc), or did you do the low level work to make sure that most reasonable Mathematica code will be dramatically sped up?

AND if that is the case, does this mean we will see this same technology applied to the rest of Mathematica??

Even apart from parallelizing the kernel (admittedly a hard problem) there is a lot of low lying fruit that, even in 7, was not being picked.

Obvious examples include

- using SSE/AVX when there will be a lot of parallel computation (eg in filling in Table or in numeric integration, or in various plots (esp 3D and density type plots))

- likewise at a higher level using multiple CPUs for numeric integration and plots, and anything that involves searching over a large space (NMinimize, numeric integration) and to generate large quantities of randomness and then random numbers, etc etc

- at the very least, give us a version of ParallelMap etc that all see a SINGLE Mathematica kernel so we don't have to go through this nonsense of exporting the relevant definitions to the other kernels (not to mention the wasted memory).

Obviously changing things for parallelism means that various items will break. Things broke frequently with early upgrades to Mathematica and, you know what, it was worth it. If the user wants to write weird code that involves the update of a global by every successive call to a function being plotted, or whatever, give them the means to do so, but the bulk of us writing normal code shouldn't have to be held back by such weird edge cases.

So there you are: that's what would REALLY excite us. Not being told that we can write to CUDA for some set of problems of interest to but a small fraction of us, but being told that Mathematica 8 is the Parallelism Edition, and that for pretty much anything it will run at a factor of N faster, where N is your number of CPUs, with, for many purposes, a further factor of 2 faster from use of SSE (with a factor of 4 coming with AVX in SandyBridge)."

Posted by Maynard Handley, September 21, 2010 at 10:37 pm

>
>> The problem is there is no documentation I have found so far that gives
>> me a comparison of what that means vs going nVidia and CUDA.
>
> Yes. This is more of an issue since it is unclear as to how to
> set up a given system for optimal performance.
>
>

Totally agree.

We could do with a white paper from WRI on "how to set up a given system for optimal performance".

Perhaps Mathematica 8 will tell us if I just do an Inline Free-form input query on the subject :-)

Cheers .... Syd

Syd Geraghty B.Sc, M.Sc.

sydgeraghty at mac.com

Mathematica 8.0 for Mac OS X x86 (64-bit) (November 6, 2010)
MacOS X V 10.6.5 Snow Leopard
MacBook Pro 2.33 GHz Intel Core 2 Duo  2GB RAM