Re: Fast calculation of pair correlation function

*To*: mathgroup at smc.vnet.net*Subject*: [mg98640] Re: [mg98626] Fast calculation of pair correlation function*From*: Leonid Shifrin <lshifr at gmail.com>*Date*: Wed, 15 Apr 2009 04:58:39 -0400 (EDT)*References*: <200904141019.GAA07830@smc.vnet.net>

Hi Marcus, It seems that there is no way around computing all pair distances, so essentially we need a good old doubly nested loop. The code below may however be faster than the naively implemented nested loop - anyway, the best I could figure: In[1] = Clear[pairDistanceDistribution]; pairDistanceDistribution[ iter : {start_?NumericQ, end_?NumericQ, delta_?NumericQ}, points : {{_?NumericQ, _?NumericQ, _?NumericQ} ..}] := With[{len = Length[points]}, 2/(len*(len - 1))* N[Total[ MapIndexed[ BinCounts[ If[#2 === {len}, {}, Sqrt@Total[(Transpose[Drop[points, First[#2]]] - #)^2] ], iter] &, points]]]]; What is being done here is that for every point, distances to all points after it are computed at once and immediately binned, so that they don't take space. All the binned results are then summed together to form the final distribution. You give this function your parameters (the distance interval and the delta), plus your list of generated 3D points. It returns the accordingly binned distribution. Here are 2000 random points (each of the 3 coordinates is random in the interval (-10,10)): In[2] = positions = RandomReal[{-10, 10}, {2000, 3}]; It takes about 8 sec. on my not too fast 5-years old P IV laptop: In[3] = (distr = pairDistanceDistribution[{0, 40, 1.}, positions]) // Timing Out[3] = {8.252, {0.000504752, 0.00324262, 0.00828264, 0.0149035, 0.0221901, 0.0304032, 0.0385338, 0.046089, 0.0538494, 0.0599565, 0.0653697, 0.0693417, 0.0716418, 0.0727229, 0.0724892, 0.0691691, 0.0650805, 0.0583087, 0.0500475, 0.0406423, 0.0302661, 0.0215503, 0.0144322, 0.00927564, 0.00569185, 0.0032051, 0.00163432, 0.000715358, 0.000301651, 0.000121061, 0.0000325163, 3.50175*10^-6, 1.50075*10^-6, 0., 0., 0., 0., 0., 0., 0.}} The distribution is normalized to 1: In[4] = Total[distr] Out[4] = 1. Internal compilation probably will not help all that much, since we vectorized the problem anyway. The complexity is of course quadratic with the size of your sample, but there seems to be no way around it. For a sample of 10000 points, my machine will then do it in roughly 3-4 minutes, and the modern fast desktop will probably be 3-4 times faster. Finally, it seems that your problem can be parallelized quite well, you may try parallel constructs in Mathematica to further speed it up, if you have a multi-core machine. Regards, Leonid On Tue, Apr 14, 2009 at 3:19 AM, markus <markusg.phys at googlemail.com> wrote: > Hi, > > I have a quite large list (length ~ 10,000) of coordinates (positions > of particles from a simulation), e.g. data={{x1,y1,z1}, > {x2,y2,z2},...}, and I am looking for fast way to calculate the "pair > correlation" or "radial distribution function", which tells me, for > each value of r, how many particles are separated by a distance r (+- > epsilon). > Usually, this can be calculated by something like: > > Do[Do[histogr[[ Ceiling[Norm[ data[[i]]-data[[j]] ] / delta] ]], {j, i > +1, Length[data]}], {i, 1, Length[data]-1}] > > where histogr is just a list whose index corresponds to the distance > r. > > Unfortunately I have found that for lists of length >1,000, this way > of calculating is very slow (it can take ~minutes), compared to the > equivalent C code, which proceeds in a few seconds. I have also tried > to "Compile" the function, but the speed does not increase... > > Does anybody know some fast way of calculating the pair distribution > function in Mathematica? > >

**References**:**Fast calculation of pair correlation function***From:*markus <markusg.phys@googlemail.com>