Re: testing if a point is inside a polygon

*To*: mathgroup at smc.vnet.net*Subject*: [mg96475] Re: testing if a point is inside a polygon*From*: Daniel Lichtblau <danl at wolfram.com>*Date*: Sat, 14 Feb 2009 03:10:43 -0500 (EST)*References*: <gmp0mt$btn$1@smc.vnet.net> <gn3bv9$q4n$1@smc.vnet.net>

On Feb 13, 2:45 am, Daniel Lichtblau <d... at wolfram.com> wrote: > On Feb 9, 4:31 am, Mitch Murphy <mi... at lemma.ca> wrote: > > > is there a way to test whether a point is inside a polygon? ie. > > > PointInsidePolygonQ[point_,polygon_] -> True or False > > > i'm trying to do something like ... > > > ListContourPlot[table,RegionFunction->CountryData["Cana= da= > ","Polygon"]] > > > to create what would be called a "clipping mask" in photoshop. > > > cheers, > > Mitch > > The application seems to involve many such tests rather than a one-off > sort of query, so it is probably acceptable to spend time in > preprocessing, if it allows us to avoid O(n) work per query (n being > the number of segments). I'll show an implementation that bins the > segments according to x coordinates. Specifically, a bin will contain > a segment if either starting or terminating x coordinate of the > segment corresponds to x values in the bin, or else the entire bin's > range is between the starting and terminating values. Note that a > given segment might be in multiple bins. So long as we bin "sensibly", > all bins should have but a smallish fraction of the total number of > segments. > > The preprocessing is a bit on the slow side because I do nothing > fancy. I guess one could use Sort and some smarts to make it faster > (which might be a good idea, if the plan is to do this for many > different objects, e.g. all countries). > > Anyway, we create the bins of segments and also keep track of min and > max x and y values (these we use as a cheap way of ruling out points > not in an obvious bounding rectangle). > [...] I missed a couple of useful optimizations. We can use Compile both in preprocessing and in the predicate query itself. polyToSegmentList[poly_, nbins_] := Module[ {xvals, yvals, minx, maxx, miny, maxy, segments, flatsegments, segmentbins, xrange, len, eps}, {xvals, yvals} = Transpose[Flatten[poly, 1]]; {minx, maxx} = {Min[xvals], Max[xvals]}; {miny, maxy} = {Min[yvals], Max[yvals]}; segments = Map[Partition[#, 2, 1, {1, 1}] &, poly]; flatsegments = Flatten[segments, 1]; xrange = maxx - minx; eps = 1/nbins*len; len = xrange/nbins; segmentbins = Table[ getSegsC[j, minx, len, eps, flatsegments] , {j, nbins}]; {{minx, maxx}, {miny, maxy}, segmentbins} ] getSegsC = Compile[ {{j, _Integer}, {minx, _Real}, {len, _Real}, {eps, _Real}, {segs, _Real, 3}}, Module[{lo, hi}, lo = minx + (j - 1)*len - eps; hi = minx + j*len + eps; Select[segs, Module[{xlo, xhi}, {xlo, xhi} = Sort[{#[[1, 1]], #[[2, 1]]}]; lo <= xlo <= hi || lo <= xhi <= hi || (xlo <= lo && xhi >= = hi) ] &]]]; With this we can preprocess the polygons for Canada, using 1000 bins, in around a minute on my machine. In[346]:= canpoly = First[CountryData["Canada", "Polygon"]]; In[347]:= nbins = 1000; Timing[{{xmin, xmax}, {ymin, ymax}, segmentbins} = polyToSegmentList[canpoly, nbins];] Out[337]= {55.3256, Null} To repeat from the last note, there are almost certainly smarter ways to do the preprocessing, so as to make it faster. For most applications I can think of that would be more trouble than it is worth, so it's not something I have attempted. For the predicate evaluation we can do: pointInPolygon[{x_, y_}, bins_, xmin_, xmax_, ymin_, ymax_] := Catch[Module[ {nbins = Length[bins], bin}, If[x < xmin || x > xmax || y < ymin || y > ymax, Throw[False]]; bin = Ceiling[nbins*(x - xmin)/(xmax - xmin)]; If[EvenQ[countIntersectionsC[bins[[bin]], x, y, ymin - 1.]], False, True] ]] countIntersectionsC = Compile[ {{segs, _Real, 3}, {x, _Real}, {yhi, _Real}, {ylo, _Real}}, Module[{tally = 0, yval, xlo, xhi, y1, y2}, Do[ {{xlo, y1}, {xhi, y2}} = segs[[j]]; If[(x < xlo && x < xhi) || (x > xlo && x > xhi), Continue[]]; yval = y1 + (x - xlo)/(xhi - xlo)*(y2 - y1); If[ylo < yval < yhi, tally++]; , {j, Length[segs]}]; tally ]]; With this we can now process around 10,000 points in a second. I selected the region so that most would be inside; this was so that we would not gain speed due to a high percentage failing the basic rectangle test. In[352]:= n = 10000; pts = Transpose[{RandomReal[{-115, -55}, {n}], RandomReal[{45, 75}, {n}]}]; In[354]:= Timing[ inout = Map[pointInPolygon[#, segmentbins, xmin, xmax, ymin, ymax] &, pts];] Out[354]= {1.04284, Null} In[355]:= Take[inout, 20] Out[355]= {True, True, True, True, False, True, True, True, True, \ True, False, True, True, False, False, False, False, True, True, True} I should make a few remarks about the speed/complexity. If we split the n polygon segments into m bins, for m somewhat smaller than n, then for "good" geographies we expect something like O(n/m). Figure most segments appear in no more than two bins, most appear in only one bin, and no bin has more than, say, three times as many segments as any other. To achieve better algorithmic complexity I believe one would need to use something more complicated, along the lines of a triangulation. With such a data structure, efficiently implemented, it might be only O (log(n)) to see whether a new point falls into a triangle, and, if so, whether that triangle is inside or outside the boundary. I doubt such a tactic would be applicable for a polygonal set of the size in question here. If the goal is, say, to color all pixels inside Canada differently than those outside (in effect, to query every pixel in the bounding rectangle), then it would be faster simply to find all boundary points for a given vertical. This transforms from a problem of query on points to one of splitting segments (a query on lines, so to speak). Much more efficient, I would guess. Daniel Lichtblau Wolfram Research