I have a data frame consisting of multiple data points with specific geocoordinates (latitude and longitude). I'm looking to create a choropleth-style world map where geographical regions are shaded according to how many data points fall within the boundaries of the region.
Is there a simple way to accomplish what I'm trying to do in R, preferably using the "maps" package's world map and the "ggplot2" map-plotting functions?
Here is a minimal reproducible example of what I have:
library(ggplot2)
library(maps)
data <- data.frame(lat = 40.730610, lon = -73.935242)
ggplot() +
  geom_polygon(data = map_data("world"), aes(x = long, y = lat, group = group, fill = group)) +
  coord_fixed(1.3)
I've noticed that the fill parameter on plot item functions can be used to create a choropleth effect. Here, the fill parameter on the aes() function of the geom_polygon() function is used to create a choropleth where each group is color coded differently.
There are many ways to achieve this task. The general idea is to convert both the point data and the polygon data to spatial objects, and then count how many points fall within each polygon. We could do this with the sp package, which is widespread and well known in the R community, but I decided to use the sf package because sf is expected to become the next-generation standard for spatial objects in R (https://cran.r-project.org/web/packages/sf/index.html). Knowing the usage and functionality of sf will probably be beneficial.
First, the OP provided an example point, but I decided to add more points so that we can see how to count the points and aggregate the data. To do so, I used the ggmap package to geocode some cities that I selected as an example.
Next, I converted the point_data3 data frame to an sf object. I also got the polygon data of the world using the maps package and converted it to an sf object.
Now both point_sf and world_sf are sf objects. We can use the st_within function to examine which points are within which polygons.
The total count information is in the Count column of world_sf. We can get the world data frame as the OP did using the map_data function. We can then merge world_data and world_df.
Now we are ready to plot the data. The plotting code is the same as the OP's ggplot code except that the input data is now world_data2 and fill = Count.
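Putting the steps together, a minimal sketch of the workflow might look like the following. The object names (point_data3, point_sf, world_sf, world_df, world_data, world_data2) follow the text, but the specific calls are only one assumed way to do each step. The coordinates are hard-coded, illustrative values standing in for the ggmap geocoding step, which needs an API key. Two further assumptions: s2 is switched off because the maps polygons are not always valid on the sphere, and counts are summed per country so that they line up with the region column returned by map_data.

```r
library(ggplot2)
library(maps)
library(sf)

# Assumption: use planar geometry operations; the polygons from the maps
# package can fail spherical (s2) validity checks.
sf_use_s2(FALSE)

# Illustrative points (hypothetical values in place of geocoded cities)
point_data3 <- data.frame(
  city = c("New York", "Chicago", "Paris", "Tokyo"),
  lon  = c(-73.935242, -87.6298, 2.3522, 139.6917),
  lat  = c(40.730610, 41.8781, 48.8566, 35.6895)
)

# World polygons from the maps package, converted to an sf object
world_sf <- st_as_sf(map("world", plot = FALSE, fill = TRUE))

# Points as an sf object, using the same CRS as the polygons
point_sf <- st_as_sf(point_data3, coords = c("lon", "lat"),
                     crs = st_crs(world_sf))

# For each point, find the polygon(s) it falls within, then count per polygon
hits <- as.integer(unlist(st_within(point_sf, world_sf)))
world_sf$Count <- tabulate(hits, nbins = nrow(world_sf))

# Drop the geometry and sum the counts per country; polygon IDs look like
# "region:subregion", and map_data() keeps the first part as `region`
world_df <- st_drop_geometry(world_sf)
world_df$region <- sub("[:,].*$", "", world_df$ID)
world_df <- aggregate(Count ~ region, data = world_df, FUN = sum)

# Merge the counts into the plotting data frame and restore the drawing order
world_data <- map_data("world")
world_data2 <- merge(world_data, world_df, by = "region", all.x = TRUE)
world_data2 <- world_data2[order(world_data2$group, world_data2$order), ]

# Same plot as in the question, now filled by Count
p <- ggplot() +
  geom_polygon(data = world_data2,
               aes(x = long, y = lat, group = group, fill = Count)) +
  coord_fixed(1.3)
print(p)
```

Restoring the row order after merge() matters because geom_polygon() draws vertices in the order they appear in the data. Regions with no match in world_df would get NA for Count and be drawn in the default NA fill color.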