How to attach a simple data.frame to a SpatialPoly

2019-01-16 10:00发布

问题:

I have (again) a problem with combining data frames in R. But this time, one is a SpatialPolygonDataFrame (SPDF) and the other one is usual data.frame (DF). The SPDF has around 1000 rows the DF only 400. Both have a common column, QDGC

Now, I tried

oo <- merge(SPDF,DF, by="QDGC", all=T)

but this only results in a normal data.frame, not a spatial polygon data frame any more. I read somewhere else, that this does not work, but I did not understand what to do in such a case (has to do something with the ID columns, merge uses)

oooh such a hard question, I quess...

Thanks! Jens

回答1:

Let df = data frame, sp = spatial polygon object and by = name or column number of common column. You can then merge the data frame into the sp object using the following line of code

sp@data = data.frame(sp@data, df[match(sp@data[,by], df[,by]),])

Here is how the code works. The match function inside aligns the columns so that order is preserved. So when we merge it with sp@data, order is correctly preserved. A quick check to see if the code has worked is to inspect the two columns corresponding to the common column and see if they are identical (the common columns get duplicated and it is easy to remove the copy, but i keep it as it is a good check)



回答2:

It is as easy as this:

require(sp) # the trick is that this package must be loaded!

oo <- merge(SPDF,DF, by="QDGC")

I've tested by myself. But it only works if you use merge from package sp. This is the default when sp package is loaded. merge function is then overloaded and sp::merge is used if the first argument is spatial structure.



回答3:

merge can produce a dataframe with more rows than the originals if there's not a simple 1-1 mapping of the two dataframes. In which case, it would have to copy all the geometry and create multiple polygons, which is probably not a good thing.

If you have a dataframe which is the same number of rows as a SpatialPointsDataFrame, then you can just directly replace the @data slot.

library(sp)
example(overlay) # to get the srdf object
srdf@data
spplot(srdf)
srdf@data=data.frame(x=runif(3),xx=rep(0,3))
spplot(srdf)

if you get the number of rows wrong:

srdf@data=data.frame(x=runif(2),xx=rep(0,2))
spplot(srdf)
Error in data.frame(..., check.names = FALSE) : 
  arguments imply differing number of rows: 3, 2


回答4:

Maybe the function joinCountryData2Map in the rworldmap package can give inspiration. (But I may be wrong, as I was last time.)



回答5:

One more solution is to use append_data function from the tmaptools package. It is called with these arguments:

append_data(shp, data, key.shp = NULL, key.data = NULL,
  ignore.duplicates = FALSE, ignore.na = FALSE,
  fixed.order = is.null(key.data) && is.null(key.shp))

It's a bit unfortunate that it's called append since I'd understand append more ina sense of rbind and we want to have something like join or merge here.

Ignoring that fact, function is really useful in making sure you got your joins correct and if some rows are present only on one side of join. From the docs:

Under coverage (shape items that do not correspond to data records), over coverage (data records that do not correspond to shape items respectively) as well as the existence of duplicated key values are automatically checked and reported via console messages. With under_coverage and over_coverage the under and over coverage key values from the last append_data call can be retrieved,