Re: BinCounts to InterpolatingFunction

*To*: mathgroup at smc.vnet.net
*Subject*: [mg109484] Re: BinCounts to InterpolatingFunction
*From*: "Kurt TeKolste" <tekolste at fastmail.net>
*Date*: Thu, 29 Apr 2010 02:53:38 -0400 (EDT)

If I understand this algorithm correctly, it feeds the counts for all of the bins into Interpolation. If that is right, read on.

One of the problems in dealing with multidimensional data is that it takes quite large samples to fill in the huge multidimensional volume. In other words, it is hard to get bins that are fine enough in all dimensions without having almost all of your bin counts be zero. I suspect that the interpolation will not be very satisfying unless your sample size is huge or you only need relatively coarse bins.

Note that dividing each of four dimensions into 20 bins already gives 20^4 = 160,000 bins, with an average probability that a randomly chosen sample will be in any particular bin of 1/160000 = 6.25x10^-6. It takes a long time for the Monte Carlo to look like a real distribution ...

I am not an expert in this area, but I would be tempted to use only the bins with non-zero values. I recall reading about some techniques for dealing with this -- something about trying to sample where the density is highest -- but do not recall the reference.

Also, if you start with an a priori distribution rather than trying to construct the distribution based solely on data, you have more tools available.

ekt

On Tue, 27 Apr 2010 08:48 -0400, "dh" <dh at metrohm.com> wrote:
> On 27.04.2010 10:06, Kevin J. McCann wrote:
> > I am using a Markov Chain Monte Carlo (MCMC) approach to evaluate a
> > multidimensional probability density function. The output is a large
> > number of multidimensional points {x1,x2,...,xn}. I can use BinCounts to
> > gather the points into a PDF (after appropriate normalization). I would
> > like to then define a function, p[X_], which is the multidimensional
> > interpolation of the BinCounts output, but I can't figure out how to
> > automate this for an arbitrary number of dimensions.
> >
> > Any ideas?
> >
> > For the 2d case I did the following:
> >
> > tbl = Partition[
> >    Flatten[Table[{xmin + i*\[CapitalDelta]x + \[CapitalDelta]x/2,
> >       ymin + j*\[CapitalDelta]y + \[CapitalDelta]y/2,
> >       counts[[i + 1, j + 1]]/
> >        (\[ScriptCapitalN] \[CapitalDelta]x \[CapitalDelta]y)},
> >      {i, 0, nx - 1}, {j, 0, ny - 1}]], 3];
> >
> > f=Interpolation[tbl]
> >
> > But as you can see, this is not easily extended to higher dimensions.
> >
> > Kevin
> >
> Hi Kevin,
> if I understand correctly, your problem is the generation of a suitable
> grid of data points for "Interpolation".
> Assume you have a function bins[{i1,i2,..,in}] of n integer arguments.
> Each argument ik runs from 0 to nk-1. The vector of the nk is called
> bounds={n1,n2..nn}. We can now define the function "dataGrid" that
> creates a rectangular multidimensional structure for the input to
> Interpolation:
>
> dataGrid[bins_, bounds_] := Module[{iter},
>    iter = {x, 0, n - 1} /.
>      Table[{x -> Symbol["x" <> ToString[i]], n -> bounds[[i]]},
>       {i, 1, Length[bounds]}];
>    Flatten[
>     Table[{iter[[All, 1]], bins[iter[[All, 1]]]},
>      Evaluate[Sequence @@ iter]]
>     , Length[bounds] - 1]
>    ]
>
> If we choose an example for bins:
> bins[v : {_ ..}] := Times @@ v;
> we can calculate an interpolation:
>
> bins[v : {_ ..}] := Times @@ v;
> Interpolation@dataGrid[bins, {4, 4, 4}]
>
> cheers, Daniel
>
> --
>
> Daniel Huber
> Metrohm Ltd.
> Oberdorfstr. 68
> CH-9100 Herisau
> Tel. +41 71 353 8585, Fax +41 71 353 8907
> E-Mail:<mailto:dh at metrohm.com>
> Internet:<http://www.metrohm.com>
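
For the arbitrary-dimension case asked about in the quoted message, one possible sketch is to let BinCounts take a list of bin specifications and to build the bin centers from the dimensions of its output. The helper name binInterpolation and its argument layout are assumptions made for illustration; they do not come from the thread. The sketch assumes n >= 2 dimensions, at least four bins per dimension (the default InterpolationOrder is 3), and normalization by the total number of points, so any points that fall outside the binned range make the result integrate to less than 1. It keeps the zero-count bins, so it does nothing about the sparsity concern raised in the reply above.

```mathematica
(* Hypothetical helper, not from the thread.                              *)
(* pts   : list of n-dimensional points {{x1, ..., xn}, ...}              *)
(* specs : one {min, max, width} bin specification per dimension          *)
binInterpolation[pts_, specs_] :=
 Module[{counts, centers, grid, dens},
  (* n-dimensional array of bin counts *)
  counts = BinCounts[pts, Sequence @@ specs];
  (* bin centers per dimension: min + width*(k - 1/2), k = 1 .. number of bins *)
  centers = MapThread[#1[[1]] + #1[[3]] (Range[#2] - 1/2) &,
    {specs, Dimensions[counts]}];
  (* all grid points, in the same order as Flatten[counts] *)
  grid = Tuples[centers];
  (* counts -> density: divide by sample size and bin volume *)
  dens = N[Flatten[counts]/(Length[pts] Times @@ specs[[All, 3]])];
  Interpolation[MapThread[List, {grid, dens}]]
 ]

(* Usage sketch with made-up test data: a 3-d standard normal sample *)
pts = RandomVariate[
   MultinormalDistribution[{0, 0, 0}, IdentityMatrix[3]], 10^5];
p = binInterpolation[pts, ConstantArray[{-4, 4, 1/2}, 3]];
p[0.1, -0.2, 0.3]
```

Because Tuples and Flatten both vary the last index fastest, the grid points and densities line up without any explicit loop over dimensions, which is what makes the construction independent of n.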

**Follow-Ups**:
**Re: BinCounts to InterpolatingFunction**
*From:* DrMajorBob <btreat1@austin.rr.com>