disappointing CUDA speed
- To: mathgroup at smc.vnet.net
- Subject: [mg114143] disappointing CUDA speed
- From: Gianluca Gorni <gianluca.gorni at fastwebnet.it>
- Date: Thu, 25 Nov 2010 05:56:41 -0500 (EST)
Hi,
I have a one-year-old Apple MacBook Pro. I installed
cudadriver_3.1.17_macos and then tried the first
examples in the documentation:
Needs["CUDALink`"]
CUDAQ[]
True
randM = RandomReal[1, {3000, 3000}];
AbsoluteTiming[randM.randM;]
{2.688389,Null}
AbsoluteTiming[CUDADot[randM, randM];]
{7.328353,Null}
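To put the two timings above in perspective, they can be converted into rough throughput figures. This is only a sketch: it assumes the conventional estimate of 2 n^3 floating-point operations for a dense n x n matrix product, and the names n, flops, cpuTime, gpuTime, cpuGFlops and gpuGFlops are introduced here purely for illustration. The timings are copied from the runs above.

```mathematica
(* Assumption: a dense n x n product costs about 2 n^3 floating-point operations *)
n = 3000;
flops = 2.*n^3;

cpuTime = 2.688389;  (* seconds, from AbsoluteTiming[randM.randM;] above *)
gpuTime = 7.328353;  (* seconds, from AbsoluteTiming[CUDADot[randM, randM];] above *)

(* Approximate sustained throughput in GFLOP/s *)
cpuGFlops = flops/cpuTime/10^9
gpuGFlops = flops/gpuTime/10^9

(* How many times longer the CUDADot call took than Dot *)
gpuTime/cpuTime
```

Under that assumption the figures come to roughly 20 GFLOP/s for Dot and roughly 7.4 GFLOP/s for CUDADot, so the CUDADot call took about 2.7 times as long. The CUDADot figure covers the whole call as timed, not only the arithmetic on the GPU.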
Quite a letdown.
Did I do something wrong?
Gianluca
For reference, here is what CUDAInformation[] reports on this machine:
CUDAInformation[]
{1 -> {"Name" -> "GeForce 9400M", "Clock Rate" -> 1100000,
"Compute Capabilities" -> 1.1`, "GPU Overlap" -> 0,
"Maximum Block Dimensions" -> {512, 512, 64},
"Maximum Grid Dimensions" -> {65535, 65535, 1},
"Maximum Threads Per Block" -> 512,
"Maximum Shared Memory Per Block" -> 16384,
"Total Constant Memory" -> 65536, "Warp Size" -> 32,
"Maximum Pitch" -> 2147483647,
"Maximum Registers Per Block" -> 8192, "Texture Alignment" -> 256,
"Multiprocessor Count" -> 2,
"Core Count" -> 16,
"Execution Timeout" -> 1, "Integrated" -> False,
"Can Map Host Memory" -> False, "Compute Mode" -> "Default",
"Texture1D Width" -> 8192, "Texture2D Width" -> 65536,
"Texture2D Height" -> 32768, "Texture3D Width" -> 2048,
"Texture3D Height" -> 2048, "Texture3D Depth" -> 2048,
"Texture2D Array Width" -> 8192, "Texture2D Array Height" -> 8192,
"Texture2D Array Slices" -> 512, "Surface Alignment" -> 256,
"Concurrent Kernels" -> False, "ECC Enabled" -> False,
"Total Memory" -> 265945088}}