rgeos gIntersection in loop takes too long to clip

Posted 2019-06-05 01:50

I am using gIntersection to clip a nationwide path network by polygons one at a time from a SpatialPolygonsDataFrame. I am looping through each polygon, clipping the path network, calculating the length of the clipped paths, and saving this to a dataframe called path.lgth:

poly<-readShapePoly("C:\\temp\\polygons.shp")
paths<-readShapeLines("C:\\temp\\paths.shp")


#loop through all polygons clipping lines

path.lgth<-data.frame()

for (i in 1:length(poly)){
  clip<-gIntersection(paths,poly[i,])
  lgth<-gLength(clip)
  vid<-poly@data[i,3]
  data<-cbind(vid,lgth)
  path.lgth<-rbind(path.lgth,data)
  print(i)
}

The vid line just extracts the polygon ID to save in the dataframe with the path length.

My problem is that it takes way too long to do the first polygon (around 12 minutes!). Is there a way to speed this up? I'm not sure what gIntersection does mathematically (is it checking all paths to see if they overlay with the polygon?). I have simplified my paths so they are only one feature.

Thanks for your help.

3 Answers

兄弟一词,经得起流年.
#2 · 2019-06-05 02:28

If I understood you correctly, you have N polygons and M paths, and for each polygon you want the total length of the paths that fall inside it?

Solution 1

Then, first merge all the lines into one feature, and do the intersection in a single call with byid = TRUE. This way you get rid of the loop:

paths2 <- gLineMerge(paths)
path.crop <- gIntersection(poly, paths2, byid = TRUE)
path.lgth <- gLength(path.crop, byid = TRUE)

You should get the lengths labelled by the id of each polygon. I am not sure whether the polygon ids will come through correctly - check them in path.crop. If not, you need to set the id parameter of gIntersection to the ids of the polygons.
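A minimal sketch of that id check, assuming rgeos's default naming (each byid = TRUE intersection is named "&lt;polygon id&gt; &lt;line id&gt;"): the polygon id can be split back out of the names of the length vector.

```r
# Sketch, assuming rgeos's default "<polygon id> <line id>" naming of
# byid = TRUE results; split the names to recover each polygon id.
poly.ids <- sapply(strsplit(names(path.lgth), " "), `[`, 1)
path.lgth.df <- data.frame(vid = poly.ids, lgth = unname(path.lgth))
```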

Solution 2

I am not sure if sp::over can be used to make a clever query? This is worth some examining.
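An untested sketch of that idea (note that sp::over delegates line-polygon queries to rgeos, so the speed may end up similar): over() with returnList = TRUE reports, for each path, which polygon geometries it intersects, which could at least serve as a pre-filter before clipping.

```r
# Untested sketch: over() as a pre-filter. With rgeos loaded, over()
# can report which polygon(s) each path intersects; the actual clipping
# would still need gIntersection on the matching subsets.
library(sp)
library(rgeos)
hits <- over(paths, geometry(poly), returnList = TRUE)
```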

ゆ 、 Hurt°
#3 · 2019-06-05 02:49

Since I was facing the same kind of issue, here's my workaround to reduce processing time (it's a 4-year-old question, but I hope some people like me are still facing such problems).
I'd advise first selecting only the line features involved in each gIntersection step with the related function gIntersects, which returns a logical matrix and is much faster than gIntersection!

Your code would then look like this:

poly<-readShapePoly("C:\\temp\\polygons.shp")
paths<-readShapeLines("C:\\temp\\paths.shp")

#loop through all polygons clipping lines
path.lgth<-data.frame()

for (i in 1:length(poly)){
  # FIRST gIntersects to subset the features you need
  LogiSubPaths <- gIntersects(paths, poly[i, ], byid = TRUE)[1, ] # logical vector, one entry per path
  clip <- gIntersection(paths[LogiSubPaths, ], poly[i, ]) # clip only the paths that touch polygon i
  lgth <- gLength(clip)
  vid <- poly@data[i, 3]
  data <- cbind(vid, lgth)
  path.lgth <- rbind(path.lgth, data)
  print(i)
}

It's worth a try on a real dataset to confirm that the output matches your needs.
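A variant sketch (untested): gIntersects can also be called once for the whole polygon set, so the predicate itself is not repeated inside the loop; each pass then only indexes the resulting logical matrix.

```r
# Untested variant: one gIntersects call for all polygons. Assuming the
# rows of the matrix correspond to the polygons (transpose if not),
# each loop pass only subsets and clips.
hits <- gIntersects(paths, poly, byid = TRUE)  # logical matrix of touches
for (i in 1:length(poly)) {
  sub  <- paths[hits[i, ], ]                   # paths touching polygon i
  clip <- gIntersection(sub, poly[i, ])
  path.lgth <- rbind(path.lgth, cbind(poly@data[i, 3], gLength(clip)))
}
```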

啃猪蹄的小仙女
#4 · 2019-06-05 02:52

The first thing to do is to avoid reallocating memory on each pass through the loop.
Instead of path.lgth<-rbind(path.lgth,data), initialize the result before the loop:

path.lgth<-matrix(nrow=length(poly), ncol=2)

Then inside the loop, drop the cbind (gross overkill) and do

path.lgth[i,] <- c(vid,lgth)

Now as to overall execution time: you didn't say anything about what CPU (and RAM) you have available, but check out Rprof to get an idea of which steps are taking most of your time.
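A minimal Rprof sketch (with a dummy numeric workload standing in for the clipping loop, since the shapefiles aren't at hand):

```r
# Minimal Rprof usage: profile a block of work, then summarize which
# calls consume the most self time. Swap the dummy loop for the real
# clipping loop.
Rprof("clip-profile.out")
for (i in 1:20) {
  m <- matrix(rnorm(1e4), 100)  # stand-in workload
  invisible(svd(m))
}
Rprof(NULL)
head(summaryRprof("clip-profile.out")$by.self)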
