MathGroup Archive 2009

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Fast calculation of pair correlation function

  • To: mathgroup at
  • Subject: [mg98636] Re: [mg98626] Fast calculation of pair correlation function
  • From: Daniel Lichtblau <danl at>
  • Date: Wed, 15 Apr 2009 04:57:55 -0400 (EDT)
  • References: <>

markus wrote:
> Hi,
> I have a quite large list (length ~ 10,000) of coordinates (positions
> of particles from a simulation), e.g. data={{x1,y1,z1},
> {x2,y2,z2},...}, and I am looking for fast way to calculate the "pair
> correlation" or "radial distribution function", which tells me, for
> each value of r, how many particles are separated by a distance r (+-
> epsilon).
> Usually, this can be calculated by something like:
> Do[Do[histogr[[ Ceiling[Norm[ data[[i]]-data[[j]] ] / delta] ]], {j, i
> +1, Length[data]}], {i, 1, Length[data]-1}]
> where histogr is just a list whose index corresponds to the distance
> r.
> Unfortunately I have found that for lists of length >1,000, this way
> of calculating is very slow (it can take ~minutes), compared to the
> equivalent C code, which proceeds in a few seconds. I have also tried
> to "Compile" the function, but the speed does not increase...
> Does anybody know some fast way of calculating the pair distribution
> function in Mathematica?

I am a bit unclear as to what you ultimately want. maybe a histogram, 
say, with respect to some fixed epsilon value? Or maybe you need to 
evaluate the RDF for various different {r,eps} pairs?

For lists up to 5K or so in size (larger, on 64 bit machines), it would 
make sense to preprocess to get the full list of (sorted) distances. The 
code below seems to be reasonably efficient for finding all distances 
between pairs and sorting.

distancesC = Compile[{{ll,_Real,2}},
   {j,Length[ll]}, {k,Length[ll]}]]]]

Example on a set of 3000 points.

dim = 3;
coords = RandomReal[{-20,20}, {3000,dim}];

In[4]:= Timing[norms = distancesC[coords];]
Out[4]= {11.8582, Null}

For the situation where you may be interested in many distances and 
different epsiolons, I'd recommend a divide-and-conquer search. The code 
below will do this. Note we divide by two because we used each point 
pair twice. Alternatives would be to use Union in distancesC but this 
can cause trouble if you allow multiple "points" at the same coordinates 
(I'm guessing you do not, in which case modify the code as appropriate).

radialDistributionC = Compile[
  {{dlist,_Real,1}, {r,_Real}, {eps,_Real}},
    hi = len;
    If [dlist[[lo]]>top || dlist[[hi]]<bot, Throw[0]];
    mid = Ceiling[(hi+lo)/2.];
    While[dlist[[lo]]<bot && mid>lo,
	 If [dlist[[mid]]<bot,
	  lo = mid; mid = Ceiling[(hi+lo)/2.],
	  mid = Floor[(mid+lo)/2.];
    mid = Ceiling[(hi+lo)/2.];
    While[dlist[[hi]]>top && mid<hi,
	 If [dlist[[mid]]>top,
	  hi = mid; mid = Floor[(hi+lo)/2.],
	  mid = Ceiling[(mid+hi)/2.];

Now we'll find out how many pairs are separated by distance 11.2 +- 0.05.

In[7]:= Timing[radialDistributionC[norms,11.2,.05]]
Out[7]= {0., 6861}

This might be fast enough for your purposes but you will need 
considerable memory to get it to work for sets of 10^4 points. For that 
you might consider splitting into sublists, doing computations like 
those above on sublist pairs, and combining results. Alternatively, 
there might be approaches that use Nearest or other functionality that 
avoids creating an nxn matrix from the n points.

I did try this on a 64 bit machine, with a list of 10^4 points. It took 
around 134 seconds to compute the distance pairs, and a single query was 
as below:

In[6]:= Timing[radialDistributionC[norms,11.2,.05]]
Out[6]= {0.001, 78500}

Daniel Lichtblau
Wolfram Research

  • Prev by Date: Re: Re: Should I be using Mathematica at all?
  • Next by Date: Re: PlotLegends bug?
  • Previous by thread: Re: Fast calculation of pair correlation function
  • Next by thread: Re: Fast calculation of pair correlation function