       Re: question: fitting a distribution from quantiles

• To: mathgroup at smc.vnet.net
• Subject: [mg126471] Re: question: fitting a distribution from quantiles
• From: Darren Glosemeyer <darreng at wolfram.com>
• Date: Sat, 12 May 2012 04:57:38 -0400 (EDT)
• Delivered-to: l-mathgroup@mail-archive0.wolfram.com
• References: <201205110414.AAA23695@smc.vnet.net>

```On 5/10/2012 11:14 PM, László Sándor wrote:
> Hi all,
>
> I have a project (with Mathematica 8) where the first step would be to get the distribution describing my "data" which actually only have quantiles (or worse: frequencies for arbitrary bins). EstimatedDistribution[] looks promising, but I don't know how to feed in this kind of data. Please let me know if you know a fast way.
>
> Thanks!
>
>

There isn't enough information in your data for the types of estimation
done by EstimatedDistribution.

The type of information you have in your data would lend itself well to
a least squares fit to the cdf of the distribution. As an example, let's
take this data:

In:= data = BlockRandom[SeedRandom;

We can use Min and Max to see the range of values and then bin within
that range to construct cutoff and frequency data.

In:= {Min[data], Max[data]}

Out= {13.7834, 112.429}

Here, xvals are the cutoffs and counts are the bin frequencies.

In:= {xvals, counts} = HistogramList[data, {{0, 15, 20, 50, 100, 120}}]

Out= {{0, 15, 20, 50, 100, 120}, {1, 6, 55, 37, 1}}

We can get the accumulated probabilities as follows.

In:= probs = Accumulate[counts]/Length[data]

Out= {1/100, 7/100, 31/50, 99/100, 1}

The analogue of your quantile values would be the right endpoints,
Rest[xvals].

In:= quantiles = Rest[xvals]

Out= {15, 20, 50, 100, 120}

Now we can use the quantiles as the x values and the cdf values as the y
values for a least squares fitting to the CDF (parameters may need
starting values in general, but defaults worked fine in this case):

In:= FindFit[Transpose[{quantiles, probs}], CDF[GammaDistribution[a,
b], x], {a, b}, x]

Out= {a -> 5.24009, b -> 8.88512}

Given that we know the data don't extend to the right limit of a
gamma's support (a gamma variate can take any positive value), we may
want to adjust the cdf values a bit. The following shifts all the cdf
values down by 1/(2*numberOfDataPoints) in this particular case, so that
the last value sits slightly below 1:

In:= FindFit[Transpose[{quantiles, probs - 1/(2 Length[data])}],
CDF[GammaDistribution[a, b], x], {a, b}, x]

Out= {a -> 5.3696, b -> 8.73319}

Darren Glosemeyer
Wolfram Research

```
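
When the starting point is a set of quantiles rather than binned counts, the HistogramList step can be skipped and the same least squares fit to the cdf applies. The following is only a sketch under stated assumptions: the probability levels ps, the GammaDistribution[5, 8] used to manufacture example quantiles, the starting values 2 and 10, and the names qs, fit and fitted are all invented for illustration and do not come from the post above.

```mathematica
(* Hypothetical probability levels; replace with the levels of your own quantiles *)
ps = {0.1, 0.25, 0.5, 0.75, 0.9};

(* Stand-in quantile values, generated from an assumed gamma distribution
   purely so that the example is self-contained *)
qs = Quantile[GammaDistribution[5, 8], ps];

(* Least squares fit of the gamma cdf to the {quantile, probability} pairs.
   Starting values are given in FindFit's {{parameter, start}, ...} form;
   2 and 10 are arbitrary guesses *)
fit = FindFit[Transpose[{qs, ps}],
   CDF[GammaDistribution[a, b], x], {{a, 2}, {b, 10}}, x];

(* Distribution with the fitted parameters substituted in *)
fitted = GammaDistribution[a, b] /. fit;

(* Rows of {target probability, fitted cdf at the matching quantile};
   the two entries in each row should be close if the fit is good *)
Transpose[{ps, CDF[fitted, #] & /@ qs}]
```

Comparing the two columns of the last result is a quick check on the fit. If they disagree badly, different starting values or a different distribution family may be worth trying.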
