Re: Parallel Kit Question: ParallelDot is much more slow than Dot
- To: mathgroup at smc.vnet.net
- Subject: [mg40397] Re: Parallel Kit Question: ParallelDot is much more slow than Dot
- From: Jens-Peer Kuska <kuska at informatik.uni-leipzig.de>
- Date: Fri, 4 Apr 2003 01:21:18 -0500 (EST)
- Organization: Universitaet Leipzig
- References: <b6gn64$ckk$1@smc.vnet.net>
- Reply-to: kuska at informatik.uni-leipzig.de
- Sender: owner-wri-mathgroup at wolfram.com
Hi,

parallel commands are usually slower than serial ones, because you have the overhead of inter-process communication. For your parallel dot command you have to transfer two doubles (16 bytes) to do a multiplication and an addition at double speed. I don't know your CPU type and speed, but I would expect that your CPU can do an addition and a multiplication in 2-4 CPU cycles, and this needs much less time than the transfer of 16 bytes via MathLink. Since the time for sending the data is much larger than the speed gained by parallel execution, your parallel command must be much slower than the serial version.

In general it is very difficult to design an algorithm that executes faster in parallel than on a serial machine. The critical questions are "how fast can the communication be" and "how can I reduce the data exchange", because communication via shared memory, pipes or TCP/IP is typically 100-10000 times slower than data exchange on the system bus on the mainboard.

To get a speed gain, you have to increase the number of operations that every CPU does on the data. It must be much larger than a simple addition/multiplication. Say 5000-10000 CPU cycles per byte; then, and only then, you will see a speed gain.

Don't expect that any of your parallel commands is faster than the serial execution. It is a fairy-tale that parallel computing is faster than serial execution. The exceptions to this rule need special algorithms and very careful programming.

Regards
  Jens

Denis Areshkin wrote:
>
> Dear All,
>
> I appreciate any response on the following problem:
>
> It appears that the ParallelDot function, which is a
> part of Parallel Kit, works two orders of magnitude
> slower on my two processor machine than Dot. Below
> are the commands used to evaluate the dot product in
> parallel. I am not sure that the command
>
> ExportEnvironment[TestMtrx];
>
> is necessary in that case (though it doesn't do any
> harm). It takes 6 seconds to evaluate
>
> Dot[TestMtrx,TestMtrx]
>
> and about half an hour to evaluate
>
> ParallelDot[TestMtrx,TestMtrx]
>
> Needs["Parallel`Parallel`"]
> Needs["Parallel`Commands`"]
> ProcIDTable = Table[LaunchSlave["localhost",$mathkernel],{2}];
> TestMtrx = Table[Random[Real,{-1,1}],{1000},{1000}];
> ExportEnvironment[TestMtrx];
> TestDot = ParallelDot[TestMtrx,TestMtrx]
>
> I have plenty of RAM (4GB), thus this problem can't
> be caused by kernels competing for memory space...
>
> Thank you for your help
>
> --
> Denis Areshkin
> (919) 513-2424 (office)
> (919) 835-1650 (home)
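
The break-even argument in the reply above (work per transferred byte versus the cost of moving that byte) can be put in numbers with a very rough cost model. This is only a sketch under stated assumptions: the function names serialCycles, parallelCycles and speedup are made up for illustration and are not part of the Parallel Kit, the model ignores latency and any overlap of communication with computation, and the figure of 5000 cycles per transferred byte is an assumed value, not a measurement of MathLink.

```mathematica
(* Rough cost model; all names and numbers are illustrative assumptions,
   not measurements and not Parallel Kit functions. *)

(* Serial time in CPU cycles: n items, c cycles of work per item. *)
serialCycles[n_, c_] := n c

(* Parallel time: the work is split over p kernels, but b bytes have to
   be shipped at an assumed cost of t cycles per byte. *)
parallelCycles[n_, c_, p_, b_, t_] := n c/p + b t

(* A value above 1 means the parallel version wins. *)
speedup[n_, c_, p_, b_, t_] :=
  serialCycles[n, c]/parallelCycles[n, c, p, b, t]

(* Case 1: one multiply-add (about 3 cycles) per 16 bytes transferred,
   10^6 items, 2 kernels. *)
N[speedup[10^6, 3, 2, 16*10^6, 5000]]

(* Case 2: same data volume, but 10^6 cycles of work per 16-byte item. *)
N[speedup[10^6, 10^6, 2, 16*10^6, 5000]]
```

Under these assumptions the first case comes out many orders of magnitude below 1, because the transfer term b t dominates everything else, while the second case gives a speedup of about 1.7 on two kernels. With p kernels the model breaks even when the work per transferred byte exceeds t p/(p - 1) cycles, which for p = 2 and the assumed t = 5000 is 10000 cycles per byte.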