How to check if a point is in a polygon effectivel

2020-04-11 23:44发布

I am new to R and for my currently project, I have to draw a heat map related to a specific event. There are around 2 million observations of such event and in each observation there is a long and lat coordinate. Also, I have converted the map data to a data frame and the data frame contains 71 district, each district is defined with a set of coordinates. I need to decide which observation of the event belongs to which district. I am using the following code:

for (row in 1:nrow(data2015)){
  point.x=data2015[row,"Latitude"]
  point.y=data2015[row,"Longitude"]
  for (name in names(polygonOfdis)){
    if (point.in.polygon(point.x, point.y, polygonOfdis[[name]]$lat,   polygonOfdis[[name]]$long, mode.checked=FALSE)){
    count[[name]]<-count[[name]]+1
    break
}
}
}

data2015 is the data set for the event, polygonOfdis is the data set for each district.

For small data set, this algorithm works okay but for my data set, it will definitely run more than ten hours or even more (For a data set only 1/400 of current size, this algorithm runs for 1 to 2 minutes). I am wondering if there is any better way to find out which observation belongs to which district? My problem is that the point.in.polygon function takes too much time and I am wondering if there is any other function can do this?

PS: The current data is actual only 1/10 of the real data I have to process, so I really really need a faster way to do this.

5条回答
Melony?
2楼-- · 2020-04-12 00:24

This function from the SMDTools package worked well.

查看更多
我命由我不由天
3楼-- · 2020-04-12 00:24

Based on @conner-m suggestion:

library(tidyverse)
library(furrr)
library(SMDTools)

plan(multiprocess)
future_map2_dfr(
  polygonOfdis,
  names(polygonOfdis),
  ~tibble(
    district = .y,
    pip = 
      pnt.in.poly(
        data2015[, c('Latitude', 'Longitude')], 
        .x
      )$pip
  )
) %>% 
  group_by(district) %>% 
  summarise(count = sum(pip))
查看更多
该账号已被封号
4楼-- · 2020-04-12 00:34

Your code is pretty straight forward, your stumbling block is using loops instead of the R's vectorization power. This code should work, but without any data I can't verify it:

# create a column onto the dataframe to store the results 
data2015$poly<-"blank"
point.x=data2015$Latitude
point.y=data2015$Longitude
for (name in names(polygonOfdis)){
    #point.in.polygon returns a arrary of 0 to 3 for point location
    inpoly<-point.in.polygon(point.x, point.y, polygonOfdis[[name]]$lat,
                         polygonOfdis[[name]]$long, mode.checked=FALSE)
    #if the element in >0 in poly assign poly name to poly column 
    data2015$poly[inpoly>0]<-name
  }
  #additional processing (returns count per polygon)
  tapply(data2015$poly, INDEX = data2015$poly, FUN=length)

This code also assumes that each point is in one and only 1 polygon. The inner loop and the tapply could most likely improved by using the dplyr library. The other listed solution with the PIP Algorithm could provide a boost over the built-in method.

查看更多
Animai°情兽
5楼-- · 2020-04-12 00:38

So, awhile ago, I ported over a point in a polygon algorithm by W. Randolph Franklin that uses the notion of rays. I.e. If a point is in the polygon, it should pass through an odd number of times. Otherwise, when it has an even number, it should lie on the outside of the polygon.

The code is considerably fast because it is written using Rcpp. It is split into two parts: 1. The PIP Algorithm and 2. A wrapper function for classification.

PIP Algorithm

#include <RcppArmadillo.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppArmadillo)]]

//' @param points A \code{rowvec} with x,y  coordinate structure.
//' @param bp     A \code{matrix} containing the boundary points of the polygon. 
//' @return A \code{bool} indicating whether the point is in the polygon (TRUE) or not (FALSE)
// [[Rcpp::export]]
bool pnpoly(const arma::rowvec& point, const arma::mat& bp) {
    // Implementation of the ray-casting algorithm is based on
    // 
    unsigned int i, j;

    double x = point(0), y = point(1);

    bool inside = false;
    for (i = 0, j = bp.n_rows - 1; i < bp.n_rows; j = i++) {
      double xi = bp(i,0), yi = bp(i,1);
      double xj = bp(j,0), yj = bp(j,1);

      // See if point is inside polygon
      inside ^= (((yi >= y) != (yj >= y)) && (x <= (xj - xi) * (y - yi) / (yj - yi) + xi));
    }

    // Is the cat alive or dead?
    return inside;
}

Classification Algorithm

//' PIP Classifier
//' @param points A \code{matrix} with x,y  coordinate structure.
//' @param names  A \code{vector} of type \code{string} that contains the location name.
//' @param bps    A \code{field} of type {matrix} that contains the polygon coordinates to test against.
//' @return A \code{vector} of type \code{string} with location information.
// [[Rcpp::export]]
std::vector<std::string> classify_points(const arma::mat& points, 
                                         std::vector<std::string> names,
                                         const arma::field<arma::mat>& bps){
  unsigned int i, j;

  unsigned int num_points = points.n_rows;

  std::vector<std::string> classified(num_points);

  for(i = 0; i < num_points; i++){

    arma::rowvec active_row = points.row(i);

    // One of the coordinate lacks a value
    if( !arma::is_finite(active_row(0)) || !arma::is_finite(active_row(1)) ){
      classified[i] = "Missing";
      continue; // skip trying to find a location
    }

    // Try to classify coordinate based on supplied boundary points for area j
    for(j = 0; j < names.size(); j++){
      if( pnpoly(active_row, bps(j)) ){
        classified[i] = names[j];
        break; // Break loop
      }
    }

  }

  return classified;
}
查看更多
forever°为你锁心
6楼-- · 2020-04-12 00:43

There's a package for that, namely ptinpoly.

library(ptinpoly)
# define a square 
square <- rbind(
  c(0,0),
  c(0,1),
  c(1,0),
  c(1,1)
)

pinside <- rbind(c(0.5,0.5)) # point inside the square
poutside <- rbind(c(2,1)) # point outside the square

Note that you can test several points (see below), but if you test a single one you need a matrix, that's why I use rbind.

You get 0 if the point is inside the polygon, -1 otherwise:

> pip2d(square, pinside)
[1] 0
> pip2d(square, poutside)
[1] -1

As I said before you can simultaneously test multiple points:

> pip2d(square, rbind(pinside, poutside))
[1]  0 -1

The package also allows to test for point containment in a 3D polyhedron.

查看更多
登录 后发表回答