As a follow up on a question I posted yesterday, is there package/function within R that can give me the country (and continent) in which a point defined by longitude and latitude is located? Something along the lines of what is done here in MatLab.
Dataframe looks like this...
Point_Name Longitude Latitude
University of Arkansas 36.067832 -94.173655
Lehigh University 40.601458 -75.360063
Harvard University 42.379393 -71.115897
And I would like to output the above dataframe with country and continent columns added to it. As an added bonus, a column with US states for those in the US (and "other" for those outside the US)?
To get continents you can modify the last line of the coords2country
function using rworldmap
from this answer to create a coords2continent
function as shown below. Choose whether you want the 6 or 7 continent model. I'll think about putting this code into rworldmap
.
library(sp)
library(rworldmap)
# The single argument to this function, points, is a data.frame in which:
# - column 1 contains the longitude in degrees
# - column 2 contains the latitude in degrees
coords2continent = function(points)
{
countriesSP <- getMap(resolution='low')
#countriesSP <- getMap(resolution='high') #you could use high res map from rworldxtra if you were concerned about detail
# converting points to a SpatialPoints object
# setting CRS directly to that from rworldmap
pointsSP = SpatialPoints(points, proj4string=CRS(proj4string(countriesSP)))
# use 'over' to get indices of the Polygons object containing each point
indices = over(pointsSP, countriesSP)
#indices$continent # returns the continent (6 continent model)
indices$REGION # returns the continent (7 continent model)
#indices$ADMIN #returns country name
#indices$ISO3 # returns the ISO3 code
}
Here is a test.
points = data.frame(lon=c(0, 90, -45, -100, 130), lat=c(52, 40, -10, 45, -30 ))
coords2continent(points)
#[1] Europe Asia South America North America Australia
coords2country(points)
#[1] United Kingdom China Brazil United States of America Australia
So this is an alternative that uses the reverse geocoding API on Google. This code is based in part on this reference.
Calling your dataframe above df
,
reverseGeoCode <- function(latlng) {
require("XML")
require("httr")
latlng <- as.numeric(latlng)
latlngStr <- gsub(' ','%20', paste(round(latlng,2), collapse=","))
url <- "http://maps.google.com"
path <- "/maps/api/geocode/xml"
query <- list(sensor="false",latlng=latlngStr)
response <- GET(url, path=path, query=query)
if (response$status !=200) {
print(paste("HTTP Error:",response$status),quote=F)
return(c(NA,NA))
}
xml <- xmlInternalTreeParse(content(response,type="text"))
status <- xmlValue(getNodeSet(xml,"//status")[[1]])
if (status != "OK"){
print(paste("Query Failed:",status),quote=F)
return(c(NA,NA))
}
xPath <- '//result[1]/address_component[type="country"]/long_name[1]'
country <- xmlValue(getNodeSet(xml,xPath)[[1]])
xPath <- '//result[1]/address_component[type="administrative_area_level_1"]/long_name[1]'
state <- xmlValue(getNodeSet(xml,xPath)[[1]])
return(c(state=state,country=country))
}
st.cntry <- t(apply(df,1,function(x)reverseGeoCode(x[2:3])))
result <- cbind(df,st.cntry)
result
# Point_Name Longitude Latitude state country
# 1 University of Arkansas 36.06783 -94.17365 Arkansas United States
# 2 Lehigh University 40.60146 -75.36006 Pennsylvania United States
# 3 Harvard University 42.37939 -71.11590 Massachusetts United States
In the API definition, "administrative_area_level_1" is the highest administrative area below country. In the US these are states. In other countries the definition varies (might be provinces, for example).
Incidentally, I'm fairly certain you have latitude and longitude reversed.