MathGroup Archive 2003

RE: Concentration Plot

  • To: mathgroup at smc.vnet.net
  • Subject: [mg42662] RE: [mg42628] Concentration Plot
  • From: "Wolf, Hartmut" <Hartmut.Wolf at t-systems.com>
  • Date: Sat, 19 Jul 2003 03:19:47 -0400 (EDT)
  • Sender: owner-wri-mathgroup at wolfram.com

>-----Original Message-----
>From: Urijah Kaplan [mailto:uak at sas.upenn.edu]
>Sent: Friday, July 18, 2003 11:25 AM
>To: mathgroup at smc.vnet.net
>Subject: [mg42662] [mg42628] Concentration Plot
>
>
>Hello,
>
>I want to make a concentration plot, that is I have a large amount of x-y
>coordinates, and the vast majority congregate around three points on the
>plane. I would like to show that in a graph, either with a density plot
>that would show three dark spots corresponding to the concentration there, or a
>3d plot that would have z values corresponding to the number of x-y values
>in a small area. I suppose I could just test all the values to see which
>"grid" it falls into, and then color that grid the appropriate intensity or
>z value. Is this the best way? Is there a more elegant method?
>Thank you so much.
>
>
>        --Urijah Kaplan
>
>

Urijah,

leaving aside questions of performance and elegance, I cannot offer an opinion
on what is "best" (== most pleasing?) unless I know more about your task.

Here is a quick (and dirty) solution, acceptable for the 100,000 points
of my test sample:


In[2]:= << Statistics`ContinuousDistributions`
In[44]:=
(* each cluster: {{meanX, meanY}, {sdX, sdY}} *)
clusters = {{{0., 0.}, {.1, .1}}, {{1., .75}, {.1, .15}},
    {{.7, .6}, {.25, .05}}};

In[45]:=
g = Apply[NormalDistribution, Transpose /@ clusters, {2}];

In[46]:=
pts = Table[Random /@ g[[Random[Integer, {1, 3}]]], {100000}];





This function determines the "vicinity" of each point by counting how many
points share its bin:

In[10]:=
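condense[pts_, {c1_, c2_}] := Block[{q, mx = 0, s},
    s[p_] := 0;  (* every bin count starts at zero *)
    (* first pass: round each point to its bin, count it, track the maximum *)
    (q = N[Round[{c1, c2}*#]/{c1, c2}]; s[q] = s[q] + 1; 
          mx = Max[s[q], mx]) & /@ pts;
    (* second pass: prepend the normalized bin count to each point, then sort *)
    (q = N[Round[{c1, c2}*#]/{c1, c2}]; Prepend[#, (s[q] - 1)/mx]) & /@ pts //
       Sort
    ]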

(The Sort puts the high-density points last, so that Mathematica's graphics
draw them in the foreground.)
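
The counting pass on its own can be sketched more compactly. This is only an
illustration under assumptions, not the code timed below: binCounts is a
hypothetical helper name, and it returns integer bin indices
(Round[{c1, c2}*#]) instead of the rescaled bin positions used in condense.
It sorts the bin indices and collapses runs of equal ones with Split:

```mathematica
(* Sketch: binCounts is a hypothetical name, not part of any package. *)
(* Returns a list of {binIndex, count} pairs, one per occupied bin.   *)
binCounts[pts_, {c1_, c2_}] :=
  Map[{First[#], Length[#]} &,
    Split[Sort[Map[Round[{c1, c2}*#] &, pts]]]]

(* Four points, bins of width 1/10: three fall into bin {0, 0} *)
binCounts[{{0.01, 0.02}, {0.02, 0.01}, {0.52, 0.49}, {-0.01, 0.}}, {10, 10}]
(* {{{0, 0}, 3}, {{5, 5}, 1}} *)
```

Sorting once and splitting avoids one definition lookup per point, at the
price of losing the direct point-to-count association that condense keeps.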


In[48]:=
newpts = condense[pts, {100, 100}]; // Timing
Out[48]= {66.275 Second, Null}

(Experiment with {c1, c2}, the inverse bin widths in x and y, to find values
that make better "bins".)


In[49]:=
Show[Graphics[{Hue[Sin[Pi/2*First[#]]], PointSize[.001], Point[Rest[#]]} & /@
        newpts], Frame -> True] // Timing
Out[49]=
{21.912 Second, -Graphics-}


The function Sin[Pi/2 #]& makes smaller steps in color for the higher-density
bins. Here is another way to better visualize the core:


In[51]:=
Show[Graphics[{Hue[Min[1.1*First[#], 1]], PointSize[.001], Point[Rest[#]]} & /@
         newpts], Frame -> True] // Timing
Out[51]=
{10.254 Second, -Graphics-}
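
To compare the two color mappings used above, one can tabulate them over the
normalized density range from 0 to 1. The names sinMap and clipMap are
hypothetical, introduced only for this illustration:

```mathematica
(* sinMap and clipMap are hypothetical names for the two mappings above *)
sinMap = Sin[Pi/2*#] &;
clipMap = Min[1.1*#, 1] &;

(* rows of {normalized density, sinMap value, clipMap value} *)
TableForm[Table[{x, sinMap[x], clipMap[x]}, {x, 0., 1., 0.1}]]
```

Both mappings send 0 to 0 and 1 to 1. The slope of Sin[Pi/2 x] falls to zero
as x approaches 1, whereas Min[1.1 x, 1] rises linearly and is clipped at 1
for x above 1/1.1 (about 0.91).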

So play around with that.



Of course, specialists do all of this better (and more elaborately).
One thing to improve, for example, is the method of detecting the vicinity of
a point. A better way might be to build a spatial 2D data structure (a
quad-tree perhaps, or something similar) and then color each point by a
function of the number of neighbors (within a given distance, not within a
raster, as I did here), or of the inverse distance to the nearest neighbor.
For even bigger samples, performance matters more. Then there is the task of
thinning out the points in the graphics where this is possible without
sacrificing appearance (to improve graphics performance).
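
The neighbor-count idea can be sketched as follows. This is a minimal
illustration under assumptions, not a tuned implementation: a uniform grid
with cell size r stands in for the quad-tree, and neighborCounts, cell, key
and near are hypothetical names. Every point within distance r of a point p
lies in the 3 x 3 block of cells around p, so only those cells are searched:

```mathematica
(* Sketch only: neighborCounts, cell, key and near are hypothetical names. *)
(* A uniform grid with cell size r stands in for the quad-tree.            *)
neighborCounts[pts_, r_] := Module[{cell, key, near},
  cell[_] := {};                  (* buckets are empty by default *)
  key[p_] := Floor[p/r];          (* integer grid cell containing p *)
  (* put every point into the bucket of its cell *)
  Scan[(cell[key[#]] = Append[cell[key[#]], #]) &, pts];
  (* all points in the 3 x 3 block of cells around p *)
  near[p_] := Flatten[
    Table[cell[key[p] + {i, j}], {i, -1, 1}, {j, -1, 1}], 2];
  (* prepend the number of other points within distance r *)
  Map[Function[p,
      Prepend[p,
        Length[Select[near[p],
          Function[q, (p - q).(p - q) <= r^2]]] - 1]],
    pts]
  ]

neighborCounts[{{0., 0.}, {0.05, 0.}, {0.3, 0.3}, {0., 0.08}}, 0.1]
```

In this small example the three points near the origin each get a count of 2
and the isolated point gets 0. The result has the same {value, x, y} layout
as the output of condense, so after dividing the counts by their maximum and
sorting, it could be fed to the same Graphics expressions.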

Good luck!

--
Hartmut Wolf

