Histograms do not concur; why?
- To: mathgroup at smc.vnet.net
- Subject: [mg50535] Histograms do not concur; why?
- From: gilmar.rodriguez at nwfwmd.state.fl.us (Gilmar Rodríguez Pierluissi)
- Date: Thu, 9 Sep 2004 05:18:29 -0400 (EDT)
- Sender: owner-wri-mathgroup at wolfram.com
Dear Mathematica User Group:

I'm attempting to visualize the statistical distribution of a particular data set, but I'm getting views that do not agree with each other. The data set in question is generated as follows:

In[1]: << Graphics`Graphics`;
In[2]: << DiscreteMath`Combinatorica`;
In[3]: Off[General::spell1]

The following program gives the Minimal Goldbach Prime Partition Point (p,q) corresponding to an even value n. The program rotates the point (p,q) clockwise by an angle of Pi/4 radians about the origin, and returns the value "rotated q". ROTMGPPP is an abbreviation for "Rotated Minimal Goldbach Prime Partition Point".

In[4]: ROTMGPPP = Compile[{{n, _Integer}},
         Block[{rotq},
           {Do[If[PrimeQ[n - (p = Prime[i])], Return[p]],
               {i, PrimePi[n/2], PrimePi[Ceiling[Sqrt[n]]], -1}],
            {rotq = 0.707107 (n - 2 p)}};
           Return[rotq]],
         {{i, _Integer}, {Prime[_], _Integer}, {PrimePi[_], _Integer},
          {PrimeQ[_], True | False}}];

Next, we produce the rotated q's for every even n from 4 to 10^6 (499,999 values). (Mathematica v 5.0 takes slightly less than an hour to calculate these values on my PC. The computing time might be different on your computer.)

In[5]: A = Table[ROTMGPPP[n], {n, 4, 10^6, 2}];

Plot the set A:

In[6]: Plt1 = ListPlot[A, PlotStyle -> Hue[0.4], PlotRange -> All, ImageSize -> 500]

Many of the rotated q's are zero in value, so we proceed to isolate the non-zero rotated q's as follows:

In[7]: B = Select[A, # != 0 &];

We suspect that the non-zero rotated q's might have a Log-Normal Distribution. This means that if we take the (natural) logarithm of the non-zero rotated q's and look at their statistical distribution, this distribution might be Normal; i.e., Gaussian, or "bell shaped":

In[8]: data = N[Log[B]];

We plot this data set first (before producing the corresponding histogram):

In[9]: Plt2 = ListPlot[data, PlotStyle -> Hue[0.6], PlotRange -> {{0, 500000}, {0, 8}}, ImageSize -> 500]

Here is the Bin and Frequency table corresponding to our data set.
(Please inspect the table values to get a feel for what the histogram might look like.)

In[10]: MapIndexed[{Sequence @@ #2, Length[#1]} &, Split[Sort[data]]] // TableForm

Next, we build our first histogram of the data set:

In[11]: Plt3 = Show[Histogram[data, HistogramCategories -> Range[8],
          Ticks -> {Transpose[{Range[7] + .5, Range[7]}], Automatic},
          DisplayFunction -> Identity],
          PlotRange -> {{1, 8}, All}, AxesOrigin -> {1, 0}, Frame -> True,
          DisplayFunction -> $DisplayFunction]

Indeed, the histogram suggests that our data have a log-normal distribution. Looking cautiously for a second opinion, though, we do the following:

In[12]: << Statistics`NormalDistribution`;
In[13]: Plt4 = Histogram[data]

Compare Plt3 and Plt4. They are very different (to say the least). (Number theorists in the house are welcome to explain the nature of the distribution depicted in Plt4.)

The following plots are also fascinating. It seems that our data belong to two different populations (or partitions), but what are they?

In[14]: freq = MapIndexed[{Sequence @@ #2, Length[#1]} &, Split[Sort[data]]];
In[15]: << Graphics`Graphics`
In[16]: Plt5 = LogLinearListPlot[freq, PlotRange -> All, ImageSize -> 500]
In[17]: Plt6 = LogLogListPlot[freq, PlotRange -> All, ImageSize -> 500]

Comments about the (technical) differences between Plt3 and Plt4, as well as the number-theoretic or statistical nature of our data set, are welcome!

You can also download a notebook containing the above input lines from the following link:

http://gilmarlily.netfirms.com/download/histogram.nb

I also built an Excel spreadsheet plot using:

In[18]: SetDirectory["C:\\Temporary"]
In[19]: Export["C:\\Temporary\\frequency.txt", freq, "Table"]

Open the file frequency.txt in Excel, and build a vertical bar chart of column B. Or, to download the spreadsheet, use the following link:

http://gilmarlily.netfirms.com/download/histogram.xls

Comments are also welcome long after this message is posted.

Thanks!
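
The steps above can be tried on a much smaller range, which may make it easier to see how the choice of bins changes the picture. What follows is only a minimal sketch, not the code used for the plots: the names minPartitionPrime, rotatedQ and binTable are hypothetical, the search walks down from PrimePi[n/2] to 1 instead of stopping at PrimePi[Ceiling[Sqrt[n]]], 1/Sqrt[2.] stands in for the constant 0.707107, and only kernel functions are used so that no package has to be loaded.

```mathematica
(* Hypothetical helper: largest prime p <= n/2 such that n - p is also prime. *)
(* Assumes n is an even integer >= 4; returns $Failed if no such p exists,   *)
(* which is not expected to happen in the range used here.                   *)
minPartitionPrime[n_Integer] :=
  Module[{i = PrimePi[n/2]},
    While[i >= 1 && ! PrimeQ[n - Prime[i]], i--];
    If[i >= 1, Prime[i], $Failed]]

(* With q = n - p, rotating (p, q) clockwise by Pi/4 about the origin gives *)
(* the second coordinate (q - p)/Sqrt[2] = (n - 2 p)/Sqrt[2].                *)
rotatedQ[n_Integer] := (n - 2 minPartitionPrime[n])/Sqrt[2.]

(* A small sample: even n from 4 to 2000, zeros removed, then logged. *)
small = Table[rotatedQ[n], {n, 4, 2000, 2}];
logged = Log[Select[small, # != 0 &]];

(* Hypothetical helper: {left bin edge, count} for each non-empty bin *)
(* [min + k w, min + (k + 1) w), k = 0, 1, 2, ...                     *)
binTable[values_List, min_, w_] :=
  Map[{min + w First[#], Length[#]} &,
    Split[Sort[Floor[(values - min)/w]]]]

(* The same data tabulated with wide and with narrow bins. *)
TableForm[binTable[logged, 0, 1]]
TableForm[binTable[logged, 0, 0.1]]
```

Every non-zero rotated q is an even integer n - 2p multiplied by the same constant, so the logged values can only fall on a discrete set of levels, and those levels are widely spaced at the low end. Narrow bins can therefore show gaps and spikes that bins of width 1 average away. Whether this fully accounts for the difference between Plt3 and Plt4 is not established by this sketch.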