Re: Speed up calculating the pair correlation function for
- To: mathgroup at smc.vnet.net
- Subject: [mg103356] Re: [mg103335] Speed up calculating the pair correlation function for
- From: Leonid Shifrin <lshifr at gmail.com>
- Date: Thu, 17 Sep 2009 06:20:15 -0400 (EDT)
- References: <200909160946.FAA12977@smc.vnet.net>
Hi Szabolcs,
You can gain a two-fold speedup by vectorizing the problem:
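pcfOneAltComp =
  Compile[{{points, _Real, 2}, {origin, _Real, 1}, {dr, _Real},
    {rmax, _Real}, {density, _Real}},
   Module[{hist},
    (* distances from origin to all points at once, then binned *)
    hist =
     BinCounts[
      Sqrt[Total[(origin - Transpose@points)^2]], {0, rmax, dr}];
    (* bin centers, and counts divided by the annulus area
       Pi ((r + dr)^2 - r^2) = Pi (dr^2 + 2 dr r) times density *)
    Transpose[{Range[0, rmax - dr, dr] + dr/2,
      hist/(Pi (dr^2 + 2 dr Range[0, rmax - dr, dr]) density)}]]];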
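As a rough illustration of what the vectorization does, the sketch below compares the per-point distance computation from the original pcfOne with the whole-array expression used above. The names pts, o, mapped and vectorized are made up for this check and are not part of either function.

```mathematica
(* Hypothetical test data: 5 random 2D points and one origin *)
pts = RandomReal[1, {5, 2}];
o = {0.5, 0.5};

(* Original approach: one distance per point via a mapped pure function *)
mapped = With[{v = # - o}, Sqrt[v.v]] & /@ pts;

(* Vectorized approach: transpose to {xs, ys}, subtract, square, sum, Sqrt *)
vectorized = Sqrt[Total[(o - Transpose[pts])^2]];

(* The two lists should agree up to rounding error *)
Max[Abs[mapped - vectorized]] < 10^-12
```

The sign of the difference is flipped between the two versions (o - p rather than p - o), which does not matter because the components are squared.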
In fact, Compile helps very little here: it gives only a marginal (few
percent) improvement. I would also try using ParallelMap when mapping over
the origin points.
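A minimal sketch of the ParallelMap idea, which I have not timed. It assumes an uncompiled variant of the function above, called pcfOneAlt here (a name introduced only for this example), and reuses the test data and parameters from your post. How much this gains will depend on the number of kernels and on the cost of sending data to them.

```mathematica
(* Hypothetical uncompiled variant of the vectorized function *)
pcfOneAlt[points_, origin_, dr_, rmax_, density_] :=
 Module[{r = Range[0, rmax - dr, dr], hist},
  hist = BinCounts[
    Sqrt[Total[(origin - Transpose[points])^2]], {0, rmax, dr}];
  Transpose[{r + dr/2, hist/(Pi (dr^2 + 2 dr r) density)}]]

(* Test data and origin points as in the original post *)
data = RandomReal[1, {50000, 2}];
origins = Nearest[data, {.5, .5}, 1000];

(* Make the definitions known to the parallel kernels *)
DistributeDefinitions[pcfOneAlt, data];

(* Each kernel handles a share of the origin points; Mean averages
   the resulting list of {r, g(r)} tables element by element *)
result = Mean[
   ParallelMap[pcfOneAlt[data, #, 0.05, 0.5, Length[data]] &,
    origins]];
```

The result can be passed to ListPlot exactly as in the original code.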
Regards,
Leonid
2009/9/16 Szabolcs Horvát <szhorvat at gmail.com>
> Hello,
>
> I would like to calculate the pair correlation function normalized to 1
> for some 2D point data. I.e. I need to find the mean density of points
> at distance r from any point, normalized to 1.
>
> I am looking for advice on speeding this up.
>
> This is the current implementation I have:
>
> The pcfOne function calculates the mean density of 'points' at distance
> r from one single point ('origin'), up to 'rmax' in steps of 'dr'.
> 'density' is the average density of all points over the complete region
> (since the shape of the region is unknown to the function, this quantity
> is passed separately):
>
> pcfOne[points_, origin_, dr_, rmax_, density_] :=
>  Module[{hist},
>   hist = BinCounts[
>     With[{v = # - origin}, Sqrt[v.v]] & /@ points,
>     {0, rmax, dr}];
>   Transpose[
>    {Range[0, rmax - dr, dr] + dr/2,
>     hist/(Pi (dr^2 + 2 dr Range[0, rmax - dr, dr]) density)}
>    ]
>   ]
>
> Now we can select a subset of the points, calculate this function for
> all of them and average the results. For randomly distributed points
> the result will be a constant function of value 1 (at least until we get
> too close to the edge of the region):
>
> data = RandomReal[1, {50000, 2}];
>
> ListPlot[
>  Mean[
>   pcfOne[data, #, 0.05, 0.5, Length[data]] & /@
>    Nearest[data, {.5, .5}, 1000]
>   ],
>
>  PlotRange -> {0, 2}, Axes -> False, Frame -> True
>  ]
>
> This runs in 80 seconds on my machine. I would like to use this
> function on datasets of up to 300,000 points and average over more than
> just 1000 points near the middle, say 10000. That would take 60 times
> as long, ~80 minutes, which is way too much.
>
> Is it possible to speed this up significantly?
>
>
- References:
- Speed up calculating the pair correlation function for 2D point data
- From: Szabolcs Horvát <szhorvat@gmail.com>