calculating a distribution of separation of two means

• To: mathgroup at smc.vnet.net
• Subject: [mg29356] calculating a distribution of separation of two means
• From: "Philip M. Howe" <pmhowe at lanl.gov>
• Date: Thu, 14 Jun 2001 02:27:31 -0400 (EDT)
• Sender: owner-wri-mathgroup at wolfram.com

```Dear Colleagues,

I have a small problem I have been working on, and would like your
insights and suggestions:   I'd like to calculate two distributions,
and then calculate a measure of how the difference in the means vary
as a function of the nature of the two distributions.

This loads some packages I use.

Needs["Statistics`ContinuousDistributions`"];
Needs["Statistics`NormalDistribution`"];
<<NumericalMath`ListIntegrate`;
<< Graphics`Graphics`
<<Statistics`DescriptiveStatistics`;

Here are two normal distributions, with means 20 and 15, with some
arbitrarily chosen standard deviations.

\[Mu]1 = 20;
\[Sigma]1 = .1;
\[Mu]2 = 15;
\[Sigma]2 = 1;

d1 = NormalDistribution[\[Mu]1, \[Sigma]1];
d2 = NormalDistribution[\[Mu]2, \[Sigma]2];

Here is a distribution of the differences.  A histogram can be made
of the distribution of the differences. The histogram would have to
be normalized in order to use it as a probability density function.

Clear[h1,d3,div];
n=1000;
d3=(RandomArray[d1,n]-RandomArray[d2,n]);

ListPlot[d3];
h1=Histogram[d3];

Below, I calculate the distribution of the means in the following
way:  I sample the two distributions n times, and calculate n
differences (d3). Naturally, I want to use values of n  in the
multiple thousands, but I used a small n and left off the semicolons
for you to see what the output looks like.

Clear[h1,j,d3,t1,t2,t3];
n=10;
d3=(RandomArray[d1,n]-RandomArray[d2,n])

I then place these differences into n equal sized bins (t1). This is
a very slow step when n is large. The inefficiency is at least
partially due to the generation of lots of empty lists.  I don't know
how to get around it. Suggestions are welcome.

t1=Table[{Select[d3,(n-1<#<n)&],n},{n,1,Length[d3]}]

I now add up all the values within a bin, and calculate the mean
value.  I do that for each bin (t2). There is a nice side effect, in
that  this eliminates all those empty lists.  They are still present
with zeros in them, so I haven't gained much except, perhaps,
esthetics.

t2=Table[{t1[[j]][[2]],Plus@@t1[[j]][[1]]},{j,1,Length[t1]}];
div=ListIntegrate[t2,1]

I now normalize the table, to be able to treat it as a probability
density function (t3).

t3 = t2 /. {x_, y_} -> {x, y/div};

pp4 = ListPlot[t3, PlotRange -> {{0, 10}, {0, .40}},
PlotStyle -> PointSize[0.02]];
h3 = Histogram[d3];

margin = \[Mu]1/\[Mu]2 - 1
safetyFactor = \[Mu]1/\[Mu]2
StandardDeviation[d3]
Mean[d3]

The output looks reasonable to me, and I think it is correct
(although, as is probably obvious to you, I am no statistician). My
coding makes the calculation quite slow.  I'd appreciate suggestions
for speed-up.  Also, this calculation must be fairly common, and I
bet there are much better ways of approaching it.  Your input will
be appreciated.  (I'd also appreciate a better way to send questions
like this to you.  For example, I placed (* and *) around all my
comments, thinking that you could just copy and paste everything into
a fresh notebook and run it.  When I did this (I'm running v4.1 on a
MacIntosh G4),  Mathematica picked up some of the (* and *) and
ignored the others.)

Thanks,

Phil
--
Philip M. Howe
Program Manager, Stockpile Surety
Los Alamos National Laboratory

(505) 665-5332
(505) 667-9498
Fax: 505-665-5249
email pmhowe at lanl.gov
Mail Stop F606

```

• Prev by Date: Re: display factored form
• Next by Date: AMD vs. Intel Floating Point
• Previous by thread: Re: pattern matching quirks
• Next by thread: Re: calculating a distribution of separation of two means