MathGroup Archive 2009

Re: Speed up calculating the pair correlation function for

  • To: mathgroup at smc.vnet.net
  • Subject: [mg103356] Re: [mg103335] Speed up calculating the pair correlation function for
  • From: Leonid Shifrin <lshifr at gmail.com>
  • Date: Thu, 17 Sep 2009 06:20:15 -0400 (EDT)
  • References: <200909160946.FAA12977@smc.vnet.net>

Hi Szabolcs,

You can gain a two-fold speedup by vectorizing the problem:

pcfOneAltComp =
  Compile[{{points, _Real, 2}, {origin, _Real, 1},
    {dr, _Real}, {rmax, _Real}, {density, _Real}},
   Module[{hist},
    hist = BinCounts[
      Sqrt[Total[(origin - Transpose@points)^2]],
      {0, rmax, dr}];
    Transpose[
     {Range[0, rmax - dr, dr] + dr/2,
      hist/(Pi (dr^2 + 2 dr Range[0, rmax - dr, dr]) density)}]]];
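
The vectorization relies on the fact that subtracting a 2 x n matrix from a length-2 vector threads the vector over the two coordinate rows. A small check of this against the per-point mapping from the original pcfOne, using made-up sample values (pts, o, dMap and dVec are illustrative names, not part of the original code):

```mathematica
(* Illustrative sample data; pts and o are hypothetical names *)
pts = {{0., 0.}, {3., 4.}, {1., 1.}};
o = {1., 1.};

(* Per-point version, as in the original pcfOne *)
dMap = With[{v = # - o}, Sqrt[v.v]] & /@ pts;

(* Vectorized version: o - Transpose[pts] threads o over the x and y rows,
   and Total then sums the squared x and y differences for each point *)
dVec = Sqrt[Total[(o - Transpose[pts])^2]];

(* Compare the two distance lists *)
dMap == dVec
```

The sign of the difference is reversed between the two forms, which does not matter once it is squared.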


Compile helps very little here: it gives only a marginal improvement (a
few percent). I would also try ParallelMap when mapping over the origin
points.
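
A minimal sketch of what the ParallelMap suggestion might look like. It assumes parallel kernels are available and uses an uncompiled copy of the vectorized function so that the example stands alone. The names pcfOneVec, origins and result are illustrative, and the data set is scaled down from the one in the original question so that it runs quickly:

```mathematica
(* pcfOneVec is an illustrative name: the same vectorized computation without Compile *)
pcfOneVec[points_, origin_, dr_, rmax_, density_] :=
  Module[{r = Range[0, rmax - dr, dr], hist},
   hist = BinCounts[
     Sqrt[Total[(origin - Transpose[points])^2]], {0, rmax, dr}];
   Transpose[{r + dr/2, hist/(Pi (dr^2 + 2 dr r) density)}]];

(* Scaled-down version of the test from the original question *)
data = RandomReal[1, {5000, 2}];
origins = Nearest[data, {.5, .5}, 100];  (* origins is an illustrative name *)

(* Make the definitions available on the parallel kernels *)
DistributeDefinitions[pcfOneVec, data];

(* Map over the origin points in parallel, then average the per-origin results *)
result = Mean[
   ParallelMap[pcfOneVec[data, #, 0.05, 0.5, Length[data]] &, origins]];

ListPlot[result, PlotRange -> {0, 2}, Axes -> False, Frame -> True]
```

How much this helps depends on the number of kernels and on the cost of sending the data to each of them, so it is worth timing on the real data set.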

Regards,
Leonid



2009/9/16 Szabolcs Horvát <szhorvat at gmail.com>

> Hello,
>
> I would like to calculate the pair correlation function normalized to 1
> for some 2D point data.  I.e. I need to find the mean density of points
> at distance r from any point, normalized to 1.
>
> I am looking for advice on speeding this up.
>
> This is the current implementation I have:
>
> The pcfOne function calculates the mean density of 'points' at distance
> r from one single point ('origin'), up to 'rmax' in steps of 'dr'.
> 'density' is the average density of all points over the complete region
> (since the shape of the region is unknown to the function, this quantity
> is passed separately):
>
> pcfOne[points_, origin_, dr_, rmax_, density_] :=
>  Module[{hist},
>   hist = BinCounts[
>     With[{v = # - origin}, Sqrt[v.v]] & /@ points,
>     {0, rmax, dr}];
>   Transpose[
>    {Range[0, rmax - dr, dr] + dr/2,
>     hist/(Pi (dr^2 + 2 dr Range[0, rmax - dr, dr]) density)}
>   ]
>  ]
>
> Now we can select a subset of the points, calculate this function for
> all of them and average the results.  For randomly distributed points
> the result will be a constant function of value 1 (at least until we get
> too close to the edge of the region):
>
> data = RandomReal[1, {50000, 2}];
>
> ListPlot[
>   Mean[
>    pcfOne[data, #, 0.05, 0.5, Length[data]] & /@
>      Nearest[data, {.5, .5}, 1000]
>   ],
>
>   PlotRange -> {0, 2}, Axes -> False, Frame -> True
> ]
>
> This runs in 80 seconds on my machine.  I would like to use this
> function on datasets of up to 300,000 points and average over more than
> just 1000 points near the middle, say 10000.  That would take 60 times
> as long, ~80 minutes, which is way too much.
>
> Is it possible to speed this up significantly?
>
>
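
A note on the normalization that both pcfOne and pcfOneAltComp apply: each bin count is divided by the number of points expected in the corresponding annulus for a uniform distribution. The symbols g, N_k, r_k and \rho below are notation introduced here for illustration (N_k is the count in bin k, r_k = k dr its inner radius, \rho the value passed as 'density'):

```latex
A_k = \pi\left[(r_k + dr)^2 - r_k^2\right] = \pi\left(dr^2 + 2\, r_k\, dr\right),
\qquad
g\!\left(r_k + \tfrac{dr}{2}\right) = \frac{N_k}{\rho\, A_k}
```

This matches the expression hist/(Pi (dr^2 + 2 dr Range[0, rmax - dr, dr]) density) in the code, with each value reported at the bin midpoint.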


