Re: Data binning with CUDA
- To: mathgroup at smc.vnet.net
- Subject: [mg123003] Re: Data binning with CUDA
- From: "Oleksandr Rasputinov" <oleksandr_rasputinov at hmamail.com>
- Date: Sun, 20 Nov 2011 05:36:07 -0500 (EST)
- Delivered-to: l-mathgroup@mail-archive0.wolfram.com
- References: <ja8506$ig4$1@smc.vnet.net>
On Sat, 19 Nov 2011 11:47:18 -0000, psycho_dad <s.nesseris at gmail.com> wrote:

> Hi all,
>
> Is it possible to bin data (actually I am only interested in the number
> of points in each bin) by using only the CUDA functions available in
> Mathematica (CUDAMap etc)? The reason for this is that CUDAFunctionLoad
> for some reason is not working, so I have to use only what's built-in.
>
> Important note: The data I want to apply this on are on the order of
> 10^6, they are in the range [0,1] and I want to find the number of
> points in bins of size 0.01.
>
> [...]
>
> So, what I find is that in terms of speed (unsurprisingly) BinCounts is
> faster than HistogramList which is faster than MyCudaBin.
>
> Obviously, my implementation sucks and that's why I ask for help!
>
> Any help is appreciated!!!
>
> Cheers,
> Savvas

I don't have CUDA-capable hardware, so I can't suggest an alternative
implementation. However, I would strongly suspect that for only 10^6 points
a CPU-based approach will always be faster than a CUDA one, no matter how
well optimised the code might be. The situation will likely be different for
10^7 points, or when many lists of 10^6 points are processed simultaneously.
For inputs as small as the one you describe, though, the processing time
will be dominated by latency rather than throughput, and copying small
arrays to the graphics card and back several times is inherently a fairly
high-latency operation.
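
For reference, the CPU baseline can be timed with a minimal sketch like the one below. It assumes uniformly distributed test data standing in for the real data set, and the names data, time and counts are only illustrative. Note that BinCounts uses half-open bins, so a value exactly equal to 1 would fall outside the last bin.

```mathematica
(* Hypothetical test data: 10^6 uniform reals in the range [0, 1] *)
data = RandomReal[1, 10^6];

(* Count the points in each bin of width 0.01 and time the call *)
{time, counts} = AbsoluteTiming[BinCounts[data, {0, 1, 0.01}]];

Length[counts]  (* number of bins *)
Total[counts]   (* number of points that fell inside the bins *)
time            (* elapsed seconds; depends on the machine *)
```

Any CUDA-based version would have to beat this figure with the time for transfers to and from the card included, which is where I would expect it to lose at this input size.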