Re: Statistical Analysis & Pattern Matching
*To*: mathgroup at smc.vnet.net
*Subject*: [mg64541] Re: Statistical Analysis & Pattern Matching
*From*: Paul Abbott <paul at physics.uwa.edu.au>
*Date*: Mon, 20 Feb 2006 22:31:26 -0500 (EST)
*Organization*: The University of Western Australia
*References*: <dtc9rl$a5m$1@smc.vnet.net>
*Sender*: owner-wri-mathgroup at wolfram.com
In article <dtc9rl$a5m$1 at smc.vnet.net>, virtualadepts at gmail.com wrote:
> If I have a set of random data, which could be the result of rolling a
> 6-sided die 1,000 times, and the die is favored to roll 2 numbers more
> often than the others, how do I analyze the data to determine which
> numbers it favors without knowing in advance that it favors any of the
> numbers?
>
> Considering I am looking at random data it is impossible to say if the
> die favors any number for sure, but I can assume that it favors a
> number and check to see which numbers it would favor if it did.
>
> That is an example of the type of problem I want to solve but I can
> think of others. How about an algorithm that generates random numbers
> between 1 and 1,000,000. Let's say I have a database of 10 million
> numbers it has generated, and want to determine what numbers it favors.
>
> This is not the same question as asking if it is random data, because
> for our purposes it is random. This is just asking if it is more
> likely to produce certain numbers.
> Let's say for this example that the machine is programmed to never
> produce the same number twice, until it has randomly generated every
> other possible number. Is there a way to predict this is happening by
> looking at the data?
Essentially, you should read up on Maximum Entropy. As an example
application, Michael Kelly <http://www.stuart.iit.edu/faculty/kelly>, a
keen Mathematica user, uses Maximum Entropy and Linear Inversion to
evaluate asset distributions.
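As a rough illustration of the Maximum Entropy idea for a die (a sketch only, not the analysis used by Kelly or Jaynes): suppose the only thing known about the die is its long-run mean. The least-biased distribution consistent with that mean has p_i proportional to Exp[-lambda i], with lambda chosen to reproduce the mean. The value 4.5 below is an assumed figure for illustration and does not come from this thread.

```mathematica
(* Assumption for illustration: the observed mean is 4.5 (a fair die gives 3.5) *)
mean = 4.5;
faces = Range[6];

(* Maximum-entropy form under a mean constraint: p_i proportional to Exp[-lambda i] *)
p[lambda_] := Exp[-lambda faces]/Total[Exp[-lambda faces]]

(* Solve for the Lagrange multiplier that reproduces the assumed mean *)
sol = FindRoot[p[lambda] . faces == mean, {lambda, 0}];

(* Resulting face probabilities; higher faces come out more likely because mean > 3.5 *)
probs = p[lambda /. sol]
```

With real data one would impose constraints taken from the observed frequencies rather than a single assumed mean.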
I see that you posted this message to several other newsgroups. Martin
Brown's posting on sci.math.num-analysis was particularly relevant:
| Look for Wolf's dice data and Ed Jaynes's Bayesian analysis of the
| biases in them. Interesting stuff considering the dice Wolf used were
| the best quality manufacture of their day and he made ISTR 100,000
| throws.
A better link to Rau's paper is http://arxiv.org/pdf/physics/9805024.
To generate numbers with prescribed relative frequencies, use the
cumulative frequencies,

  cumfreq[x_List] := FoldList[Plus, 0, x]/Tr[x]

to produce an inverse cumulative distribution interpolating function:

  icdf[x_List] := icdf[x] = Interpolation[Transpose[
    {cumfreq[x], Range[0, Length[x]]}], InterpolationOrder -> 0]
For example,

  Plot[icdf[{1, 1, 3, 2, 1, 1}][x], {x, 0, 1}]
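To see what the plotted step function is built from, it may help to evaluate the cumulative frequencies for the same weights by hand. The sketch below repeats the definition of cumfreq so that it stands alone; the name weights is introduced here only for illustration, and Differences assumes Mathematica 7 or later.

```mathematica
cumfreq[x_List] := FoldList[Plus, 0, x]/Tr[x]

weights = {1, 1, 3, 2, 1, 1};

(* Breakpoints of the inverse CDF on the interval [0, 1] *)
cumfreq[weights]
(* {0, 1/9, 2/9, 5/9, 7/9, 8/9, 1} *)

(* Width of each step = probability of the corresponding face *)
Differences[cumfreq[weights]]
(* {1/9, 1/9, 1/3, 2/9, 1/9, 1/9} *)
```

A uniform random number falling in the interval between 2/9 and 5/9 is mapped to face 3, which is why that face turns up about one third of the time.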
After loading the statistics stub

  << Statistics`

here are the sample frequencies from 1000 rolls of a loaded die with
3 and 4 favoured (relative weights 3 and 2, against 1 for each other
face):

  Table[icdf[{1, 1, 3, 2, 1, 1}][Random[]], {1000}];
  Frequencies[%]
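For readers on later versions, the same experiment can be sketched without the Statistics package. This assumes Mathematica 6 or later, where RandomChoice accepts a weights -> values rule; it is a swapped-in technique and not the icdf construction above. The names weights, n, rolls, counts, observed and expected are illustrative.

```mathematica
(* Assumes Mathematica 6 or later; uses built-in weighted sampling instead of icdf *)
weights = {1, 1, 3, 2, 1, 1};
n = 1000;

(* Draw n rolls with faces 1..6 chosen in proportion to weights *)
rolls = RandomChoice[weights -> Range[6], n];

(* Count each face explicitly so that a face with zero hits is still listed *)
counts = Table[Count[rolls, k], {k, 6}];

(* Compare observed relative frequencies with the prescribed ones *)
observed = N[counts/n];
expected = N[weights/Tr[weights]];
TableForm[Transpose[{Range[6], counts, observed, expected}],
  TableHeadings -> {None, {"face", "count", "observed", "expected"}}]
```

Faces whose observed frequency sits well above 1/6 are the candidates for being favoured; with only 1000 rolls the estimates will fluctuate by a few percent from run to run.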
Cheers,
Paul
_______________________________________________________________________
Paul Abbott Phone: 61 8 6488 2734
School of Physics, M013 Fax: +61 8 6488 1014
The University of Western Australia (CRICOS Provider No 00126G)
AUSTRALIA http://physics.uwa.edu.au/~paul