R: Calculating overlap polygon area

2019-08-12 15:55发布

问题:

I am having some difficulties in R trying to calculate the area of overlap between polygons. As an example, for Switzerland I have data on the administrative boundaries of the cantons and the ethnic groups in the country.

## Libraries
library(sp)
library(spdep)
library(maptools)
library(rgeos)

## Data
print(load(url("http://gadm.org/data/rda/CHE_adm1.RData")))
greg<-readShapeSpatial("raw_data/GREG.shp",
                     proj4string=CRS(" +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0")) 
switzerland<-greg[greg$COW==225,]

## Identical projections
proj<-CRS(" +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0")
switzerland<-spTransform(switzerland,proj)

Plotting the data and we get something that looks like this:

## Plot data
par(mar=c(1,1,1,1))
plot(gadm,col="grey80")
plot(switzerland,add=TRUE,lwd=2,border="red")

We can see that the boundaries for the ethnic groups don't follow the national borders entirely, but good enough. What I am trying to do is for each of the cantons calculate the number of ethnic groups, taking into account the area of the ethnic group within the canton. So for Graubünden in the East I want to know the area occupied by German Swiss, Italian Swiss, Rhaetoromaniens, etc..

Reading some of the similar questions here on stackoverflow I thought that gIntersection is the command to use, but this gives me the following error:

int<-gIntersection(gadm,switzerland,byid=TRUE) # Updated


Error in RGEOSBinTopoFunc(spgeom1, spgeom2, byid, id, drop_lower_td, "rgeos_intersection") : 
TopologyException: no outgoing dirEdge found at 7.3306802801147688 47.439399101791921
In addition: Warning message:
In RGEOSBinTopoFunc(spgeom1, spgeom2, byid, id, drop_lower_td, "rgeos_intersection") :
spgeom1 and spgeom2 have different proj4 strings

Not entirely sure what causes the second warning mission though since

identicalCRS(gadm,switzerland)
[1] TRUE

Does anyone have a suggestion on how I could calculate the overlap between the cantons and the ethnic groups?

Update: Possible solution

This might be a possible solution, although the warning on different proj4 strings persists. Also note that there is some measurement error (in Aargua for instance) due to the fact that the ethnic groups shape file does not follow the national borders correctly.

## Possible solution
int<-gIntersection(gadm,switzerland,byid=T) # Multiple polygons per province
n<-names(int)
n<-data.frame(t(data.frame(strsplit(n," ",fixed=TRUE)))) 

colnames(n)[1:2]<-c("id0","ethnic.group")
n$area<-sapply(int@polygons, function(x) x@area)
a<-data.frame(id0=row.names(gadm),total.area=gadm$Shape_Area)
df<-merge(n,a,all.x=TRUE)

df$share.area<-df$area/df$total.area*100

回答1:

Here is a method slightly different that yours (but only slightly).

Inspection of switzerland@data reveals that, while there are 11 FeatureIDs (representing ethnicitity's), there are only 4 unique named ethnicity's (German, Italian, and French Swiss, and Rhaetoromanians). So the result below is based on the names, not the IDs.

library(rgeos)    # for gIntersection(...), etc.
library(rgdal)    # for readOGR(...)

setwd("<directory to accept your files>")
CH.1903 <- "+proj=somerc +lat_0=46.95240555555556 +lon_0=7.439583333333333 +k_0=1 +x_0=600000 +y_0=200000 +ellps=bessel +towgs84=674.374,15.056,405.346,0,0,0,0 +units=m +no_defs"

print(load(url("http://gadm.org/data/rda/CHE_adm1.RData")))
gadm <- spTransform(gadm, CRS(CH.1903))
download.file("http://www.icr.ethz.ch/data/other/greg/GREG.zip","GREG.zip")
unzip("GREG.zip")
greg <- readOGR(dsn=".",layer="GREG")
greg <- spTransform(greg[greg$COW==225,],CRS(CH.1903))

gadm.ids <- gadm@data$ID_1               # Canton Nr.
greg.ids <- unique(greg@data$G1SHORTNAM) # ethnicity
get.area <- Vectorize(function(adm,reg){
  int <- gIntersection(gadm[gadm$ID_1==adm,],greg[greg$G1SHORTNAM==reg,],byid=TRUE)
  if (length(int)==0) return(0)
  gArea(int)
})
result <- outer(gadm.ids,greg.ids,FUN=get.area)
rownames(result) <- gadm.ids
colnames(result)  <- greg.ids
result <- as.data.frame(result)
totals <- rowSums(result)
result <- result/totals
result$totals <- totals/1e6
result$land.area <- sapply(rownames(result),function(p)gArea(gadm[gadm$ID_1==p,])/1e6)
result
#     German Swiss French Swiss Italian Swiss Rhaetoromanians     totals  land.area
# 531  1.000000000   0.00000000    0.00000000    0.000000e+00 1363.27027 1403.22192
# 532  1.000000000   0.00000000    0.00000000    0.000000e+00  244.32279  244.32279
# 533  1.000000000   0.00000000    0.00000000    0.000000e+00  172.40341  172.40341
# 534  1.000000000   0.00000000    0.00000000    0.000000e+00  522.12943  525.73449
# 535  1.000000000   0.00000000    0.00000000    0.000000e+00   70.03116   84.06481
# 536  0.902128658   0.09787134    0.00000000    0.000000e+00 5927.90740 5927.90847
# 537  0.188415744   0.81158426    0.00000000    0.000000e+00 1654.28729 1654.28729

Here we transform both shapefiles to CH.1903 which is a Mercator projection centered on Switzerland with units in meters. Then we identify the Canton Nrs. and the ethnicity's, and use outer(...) to cycle through both lists, calculating the area of intersection in sq.km (sq.m/1e6) using gArea(...) . The final result has one row per Canton, with the percentage of each ethnicity based on land area. $totals is the summation of the intersected areas for each Canton, and $land.area is the total geographic land area in the Canton. So this gives you an idea of the error due to incomplete overlaps between the ethnicity shapfile and the gadm shapefile.