Re: Data binning with CUDA

*To*: mathgroup at smc.vnet.net
*Subject*: [mg123003] Re: Data binning with CUDA
*From*: "Oleksandr Rasputinov" <oleksandr_rasputinov at hmamail.com>
*Date*: Sun, 20 Nov 2011 05:36:07 -0500 (EST)
*Delivered-to*: l-mathgroup@mail-archive0.wolfram.com
*References*: <ja8506$ig4$1@smc.vnet.net>

On Sat, 19 Nov 2011 11:47:18 -0000, psycho_dad <s.nesseris at gmail.com> wrote:

> Hi all,
>
> Is it possible to bin data (actually I am only interested in the number
> of points in each bin) by using only the CUDA functions available in
> Mathematica (CUDAMap etc)? The reason for this is that CUDAFunctionLoad
> for some reason is not working, so I have to use only what's built-in.
>
> Important note: The data I want to apply this on are on the order of
> 10^6, they are in the range [0,1] and I want to find the number of
> points in bins of size 0.01.
>
> [...]
>
> So, what I find is that in terms of speed (unsurprisingly) BinCounts is
> faster than HistogramList which is faster than MyCudaBin.
>
> Obviously, my implementation sucks and that's why I ask for help!
>
> Any help is appreciated!!!
>
> Cheers,
> Savvas

I don't have CUDA-capable hardware, so I can't suggest an alternative implementation. However, I would strongly suspect that for only 10^6 points a CPU-based approach will always be faster than a CUDA one, no matter how optimised the code might be.

The situation will likely be different for 10^7 points, or with simultaneous processing of many lists of 10^6 points. For inputs as small as the one you describe, though, the processing time will be dominated by latency rather than throughput, and copying small arrays to the graphics card and back several times is inherently a fairly high-latency operation.
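
To illustrate how little work the counting step involves per point, here is a minimal sketch of a single-pass CPU bin count. It is written in plain Python rather than Mathematica, and it is neither the code from the original post nor how BinCounts is implemented. The function name `bin_counts` and its parameters are hypothetical, and folding a value equal to the upper limit into the last bin is an assumption made for this sketch.

```python
import random
import time


def bin_counts(data, lo=0.0, hi=1.0, nbins=100):
    """Count how many values fall in each of nbins equal-width bins on [lo, hi].

    Hypothetical helper for illustration only. Values outside [lo, hi] are
    ignored; a value equal to hi is counted in the last bin (an assumption).
    """
    counts = [0] * nbins
    scale = nbins / (hi - lo)
    for x in data:
        if lo <= x <= hi:
            # Clamp the index so that x == hi lands in the last bin.
            i = min(int((x - lo) * scale), nbins - 1)
            counts[i] += 1
    return counts


if __name__ == "__main__":
    random.seed(0)
    # 10^6 uniform points in [0, 1), the input size from the question.
    data = [random.random() for _ in range(10**6)]
    start = time.perf_counter()
    counts = bin_counts(data)
    elapsed = time.perf_counter() - start
    print(len(counts), sum(counts), f"{elapsed:.3f} s")
```

Each point costs one multiplication, one truncation and one increment, so the whole job is a single sequential pass over the array. With so little arithmetic per element, any fixed cost of moving the data to another device and back has very little computation to be amortised against.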