       Re: BinCounts to InterpolatingFunction

• To: mathgroup at smc.vnet.net
• Subject: [mg109484] Re: BinCounts to InterpolatingFunction
• From: "Kurt TeKolste" <tekolste at fastmail.net>
• Date: Thu, 29 Apr 2010 02:53:38 -0400 (EDT)

```If I understand this algorithm correctly, it feeds the counts for all of
the bins into Interpolation.  If this is correct,

One of the problems in dealing with multidimensional data is that it
takes quite large samples to fill in the huge multidimensional volume.
In other words, it is hard to make the bins fine enough in all dimensions
without having almost all of your bin counts be zero.

I suspect that the interpolation will not be very satisfying unless your
sample size is huge or you only need relatively coarse bins.  Note that
dividing each of four dimensions into 20 bins already gives 160,000 bins,
with an average probability of 1/160000 = 6.25x10^-6 that a randomly
chosen sample falls in any particular bin.  It takes a long time for the
Monte Carlo to look like a real distribution ...

I am not an expert in this area, but I would be tempted to use only the
bins with non-zero values.  I recall reading about some techniques for
dealing with this -- something about trying to sample where the density
is highest -- but do not recall the reference.  Also, if you start with
an a priori distribution rather than trying to construct the
distribution based solely on data, you have more tools available.

ekt

On Tue, 27 Apr 2010 08:48 -0400, "dh" <dh at metrohm.com> wrote:
> On 27.04.2010 10:06, Kevin J. McCann wrote:
> > I am using a Markov Chain Monte Carlo (MCMC) approach to evaluate a
> > multidimensional probability density function. The output is a large
> > number of multidimensional points {x1,x2,...,xn}. I can use BinCounts to
> > gather the points into a PDF (after appropriate normalization). I would
> > like to then define a function, p[X_], which is the multidimensional
> > interpolation of the BinCounts output, but I can't figure out how to
> > automate this for an arbitrary number of dimensions.
> >
> > Any ideas?
> >
> > For the 2d case I did the following:
> >
> > tbl = Partition[
> >      Flatten[Table[{xmin + i*\[CapitalDelta]x + \[CapitalDelta]x/2,
> >         ymin + j*\[CapitalDelta]y + \[CapitalDelta]y/2,
> >         counts[[i + 1,
> >          j + 1]]/(\[ScriptCapitalN] \[CapitalDelta]x \
> > \[CapitalDelta]y)}, {i, 0, nx - 1}, {j, 0, ny - 1}]], 3];
> >
> > f=Interpolation[tbl]
> >
> > But as you can see, this is not easily extended to higher dimensions.
> >
> > Kevin
> >
> Hi Kevin,
> if I understand correctly, your problem is the generation of a suitable
> grid of data points for "Interpolation".
> Assume you have a function bins[{i1,i2,..,in}] of n integer arguments.
> The arguments run from 0 to ni-1. The vector of ni is called
> bounds={n1,n2..nn}. We can now define the function "dataGrid" that
> creates a rectangular multidimensional structure for the input to
> Interpolation:
>
> dataGrid[bins_, bounds_] := Module[{iter},
>    iter = {x, 0, n - 1} /.
>      Table[{x -> Symbol["x" <> ToString[i]], n -> bounds[[i]]}, {i, 1,
>        Length[bounds]}];
>    Flatten[
>     Table[{iter[[All, 1 ]], bins[iter[[All, 1 ]]]},
>      Evaluate[Sequence @@ iter]]
>     , Length[bounds] - 1]
>    ]
>
> If we choose an example for bins:
> bins[v : {_ ..}] := Times @@ v;
> we can calculate an interpolation:
>
> bins[v : {_ ..}] := Times @@ v;
> Interpolation@dataGrid[bins, {4, 4, 4}]
>
> cheers, Daniel
>
> --
>
> Daniel Huber
> Metrohm Ltd.
> Oberdorfstr. 68
> CH-9100 Herisau
> Tel. +41 71 353 8585, Fax +41 71 353 8907
> E-Mail:<mailto:dh at metrohm.com>
> Internet:<http://www.metrohm.com>
>
>
>

```
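
A minimal sketch of how the 2D table in the quoted question might be extended to any number of dimensions, starting from the BinCounts output directly. The helper name binnedPDF and its argument layout are assumptions made for illustration and do not come from the thread. It uses the same convention as the 2D example: bin centers as coordinates, and each count divided by the number of samples times the bin volume.

```mathematica
(* Hypothetical helper, not from the thread.
   data  : list of n-dimensional points {{x1, x2, ..., xn}, ...}
   specs : one {min, max, width} bin specification per dimension *)
binnedPDF[data_?MatrixQ, specs : {{_, _, _} ..}] :=
 Module[{counts, centers, grid, density},
  (* nested array of counts, one array level per dimension *)
  counts = BinCounts[data, Sequence @@ specs];
  (* bin centers per axis: min + width*(k - 1/2), k = 1 .. number of bins *)
  centers = MapThread[
    #1[[1]] + #1[[3]] (Range[#2] - 1/2) &,
    {specs, Dimensions[counts]}];
  (* all combinations of centers, last coordinate varying fastest,
     which is the same order as Flatten[counts] *)
  grid = Tuples[centers];
  (* counts -> density: count/(number of samples * bin volume) *)
  density = Flatten[counts]/(Length[data] Times @@ specs[[All, 3]]);
  Interpolation[Transpose[{grid, density}]]
 ]

(* Usage with made-up test data: 10^5 points from a 3D standard normal *)
data = RandomVariate[NormalDistribution[], {10^5, 3}];
p = binnedPDF[data, ConstantArray[{-4, 4, 1/2}, 3]];
p[0., 0., 0.]

(* vector-argument form, as asked for in the question *)
pdf[X_List] := p @@ X
```

With the default interpolation order of 3, Interpolation needs at least four bins along every dimension. Points outside the bin ranges are ignored by BinCounts, so dividing by Length[data] only normalizes the density exactly when all points fall inside the ranges.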

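The bin-count arithmetic in the reply can be checked in a few lines. The sample size nSamples is a hypothetical value chosen for illustration. The empty-bin estimate assumes that samples are spread uniformly over the bins, which is the most favourable case; a peaked density leaves more bins empty.

```mathematica
nBins = 20^4                  (* 160000 bins: 4 dimensions, 20 bins each *)
N[1/nBins]                    (* 6.25*10^-6, chance a sample lands in a given bin *)

nSamples = 10^5;              (* hypothetical sample size, not from the thread *)
N[nSamples/nBins]             (* 0.625 expected counts per bin *)
(1 - 1./nBins)^nSamples       (* expected fraction of empty bins, a little over one half *)
```

Even with 10^5 samples, more than half of the bins would be expected to hold no points at all under these assumptions.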