RE: Concentration Plot
- To: mathgroup at smc.vnet.net
- Subject: [mg42662] RE: [mg42628] Concentration Plot
- From: "Wolf, Hartmut" <Hartmut.Wolf at t-systems.com>
- Date: Sat, 19 Jul 2003 03:19:47 -0400 (EDT)
- Sender: owner-wri-mathgroup at wolfram.com
>-----Original Message-----
>From: Urijah Kaplan [mailto:uak at sas.upenn.edu]
>Sent: Friday, July 18, 2003 11:25 AM
>To: mathgroup at smc.vnet.net
>Subject: [mg42662] [mg42628] Concentration Plot
>
>
>Hello,
>
>I want to make a concentration plot, that is I have a large amount of x-y
>coordinates, and the vast majority congregate around three points on the
>plane. I would like to show that in a graph, either with a density plot that
>would show three dark spots corresponding to the concentration there, or a
>3d plot that would have z values corresponding to the number of x-y values
>in a small area. I suppose I could just test all the values to see which
>"grid" it falls into, and then color that grid the appropriate intensity or
>z value. Is this the best way? Is there a more elegant method?
>Thank you so much.
>
>
>    --Urijah Kaplan
>
>

Urijah,

leaving aside questions of performance and elegance, I cannot have an opinion on what is "best" (== most pleasing?) unless I know more about your task.
Here is a quick (and dirty) solution, acceptable for the 100,000 points of my test sample:

In[2]:= << Statistics`ContinuousDistributions`

In[44]:= clusters = {{{0., 0.}, {.1, .1}}, {{1., .75}, {.1, .15}},
             {{.7, .6}, {.25, .05}}};

In[45]:= g = Apply[NormalDistribution, Transpose /@ clusters, {2}];

In[46]:= pts = Table[Random /@ g[[Random[Integer, {1, 3}]]], {100000}];

This is the function that detects the "vicinity" of each point:

In[10]:= condense[pts_, {c1_, c2_}] :=
  Block[{u, v, q, mx = 0, s},
    s[p_] := 0;
    (* pass 1: count the points in each bin and track the largest count *)
    (q = N[Round[{c1, c2}*#]/{c1, c2}];
     s[q] = s[q] + 1;
     mx = Max[s[q], mx]) & /@ pts;
    (* pass 2: prepend the scaled bin count to each point *)
    (q = N[Round[{c1, c2}*#]/{c1, c2}];
     Prepend[#, (s[q] - 1)/mx]) & /@ pts // Sort
  ]

(The Sort brings the high-density points to the foreground, since Mathematica draws later graphics primitives on top of earlier ones.)

In[48]:= newpts = condense[pts, {100, 100}]; // Timing
Out[48]= {66.275 Second, Null}

(Experiment to find better values for {c1, c2}, the inverse bin widths in x and y.)

In[49]:= Show[Graphics[{Hue[Sin[Pi/2*First[#]]], PointSize[.001],
             Point[Rest[#]]} & /@ newpts], Frame -> True] // Timing
Out[49]= {21.912 Second, -Graphics-}

The function Sin[Pi/2 #]& makes smaller color steps for the higher-density bins. Here is another way to better visualize the core:

In[51]:= Show[Graphics[{Hue[Min[1.1*First[#], 1]], PointSize[.001],
             Point[Rest[#]]} & /@ newpts], Frame -> True] // Timing
Out[51]= {10.254 Second, -Graphics-}

So play around with that. Of course the specialists do all this better (and more elaborately). One thing to improve, for example, is the method for detecting the vicinity of a point. A better way might be to build a spatial 2D data structure (a quadtree perhaps, or something similar) and then color each point with a function of the number of neighbors (within a given distance, not within a raster, as I did here), or of the inverse of the distance to the nearest neighbor. And of course for even bigger samples, performance matters more.
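
The neighbor-count idea (coloring each point by the number of points within a given distance) could be sketched roughly as follows. This is only an illustration under stated assumptions: it relies on the built-in Nearest and RandomVariate functions of later Mathematica versions (neither is available in the version used above), neighborDensity is a hypothetical helper name, and the sample data and the radius 0.02 are arbitrary choices.

```mathematica
(* Sketch only: assumes a Mathematica version with Nearest and RandomVariate *)
SeedRandom[1];

(* small synthetic sample with a single cluster; stands in for the real pts *)
samplePts = Transpose[{
    RandomVariate[NormalDistribution[0., .1], 2000],
    RandomVariate[NormalDistribution[0., .1], 2000]}];

(* hypothetical helper: prepend to each point the number of other points
   within distance r, scaled to the range 0..1, then sort so that the
   densest points are drawn last *)
neighborDensity[pts_, r_] :=
  Module[{nf = Nearest[pts], counts},
    (* subtract 1 so that a point does not count itself *)
    counts = (Length[nf[#, {All, r}]] - 1) & /@ pts;
    (* Max[counts, 1] guards against division by zero *)
    Sort[MapThread[Prepend, {pts, N[counts/Max[counts, 1]]}]]
  ]

newpts2 = neighborDensity[samplePts, 0.02];

(* blue for sparse regions, red for dense ones *)
Graphics[{Hue[0.7 (1 - First[#])], PointSize[.005], Point[Rest[#]]} & /@
   newpts2, Frame -> True]
```

The result has the same {density, x, y} layout as the output of condense, so the Show[Graphics[...]] commands above should apply to it unchanged. The radius plays the role that 1/c1 and 1/c2 play in the raster version and needs the same kind of experimenting.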
Then there is the task of thinning out the points in the graphic (to improve graphics performance) wherever this is possible without sacrificing appearance.

Good luck!

-- Hartmut Wolf
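
One simple way to approach the thinning mentioned above is to keep at most a fixed number of points per raster cell. The following is a minimal sketch, not a tuned method: thinPoints and the cap k are hypothetical names introduced here, GatherBy is assumed to be available (it is a built-in only in later Mathematica versions), and the entries are assumed to have the {density, x, y} form returned by condense.

```mathematica
(* Sketch only: assumes a Mathematica version that provides GatherBy *)

(* hypothetical helper: group entries {density, x, y} by raster cell,
   keep at most k entries per cell, and re-sort by density *)
thinPoints[dpts_, {c1_, c2_}, k_] :=
  Sort[Join @@ (Take[#, Min[k, Length[#]]] & /@
      GatherBy[dpts, Round[{c1, c2}*Rest[#]] &])]

(* tiny hand-made example: three entries share one cell, one lies elsewhere *)
demo = {{0.5, 0.101, 0.101}, {0.5, 0.102, 0.099},
        {0.5, 0.099, 0.1}, {0., 0.9, 0.9}};

(* with k = 2, one of the three entries in the shared cell is dropped *)
thinned = thinPoints[demo, {100, 100}, 2]
```

Applied to the full sample, something like thinPoints[newpts, {100, 100}, 5] could be passed to the Graphics commands in place of newpts. Whether a given cap visibly changes the picture depends on the point size and the bin width, so it has to be checked by eye.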