My goal is generating a Los Angles zipcode map filled by another variable, say number of crime events using ggplot. As the filling variable that is originally not stored in the sp object, the method suggested in here may not be used directly.
First, I have a huge dataset called ztca which is a SpatialPolygonDataFrame and contains all polygons (one of the layers called ZCTA5CE10, contains the zip-codes that I work on) of USA from the US Census Bureau, the preparation of primary sp object is mainly based on this link.
library(sp)
library(maptools)
library(zipcode)
data(zipcode) # load the zipcode as a data frame
library(ggmap)
# grab the zip code boundaries based on 2017 US Census Bureau
url <- "http://www2.census.gov/geo/tiger/GENZ2017/shp/cb_2017_us_zcta510_500k.zip"
fil <- "ztca.zip"
# If we have the file in the working directory, don't need to download again
if (!file.exists(fil)) { download.file(url, fil) }
unzip(fil, exdir="ztca")
# read them in (this takes a bit)
ztca <- readShapePoly("ztca/cb_2017_us_zcta510_500k.shp", verbose=TRUE)
class(ztca)
Then, I select some zipcodes under interest by
zip.interest <- c(91306, 90008, 90014, 90094, 91326, 90212, 90001, 90006, 90301, 91401, 90095, 90021, 90089, 90057, 90029, 90502, 90036, 90745, 91330, 91411, 91601, 91607, 90302, 90073, 91340, 91602, 90071, 90039, 90011, 90248, 91303, 90020, 90732, 90405, 91342, 90077, 90061, 90015, 90002, 90041, 90211, 90059, 90066, 90210, 91343, 90062, 91606, 91403, 90063, 90012, 91307, 90293, 91402, 91364, 90007, 91331, 90005, 91040, 90031, 91324, 90043, 91316, 91352, 90069, 91325, 91344, 90019, 90056, 90032, 90291, 90027, 90247, 90232, 90013, 90042, 90035, 90024, 90026, 90034, 90016, 90018, 90004, 90067, 90010, 90025, 90047, 90037, 91605, 90049, 90744, 90028, 90272, 90275, 90731, 90710, 91042, 91345, 90003, 91335, 91304, 91311, 91214, 91604, 91608, 91405, 91367, 91406, 91436, 91504, 91423, 91505, 91356, 90292, 90402, 90065, 90501, 90230, 90048, 90046, 90038, 90044, 90810, 90023, 90068, 90045, 90717, 90064, 90058, 90033, 90017, 90090)
LA <- ztca[as.character(ztca$ZCTA5CE10) %in% as.character(zip.interest),]
class(LA)
# My data (want to be used as filling variable)
data <- matrix(c(90001, 90002, 90003, 90004, 90005, 90006, 90007, 90008, 90010, 90011, 90012, 90013, 90014, 90015, 90016, 90017, 90018, 90019, 90020, 90021, 90023, 90024, 90025, 90026, 90027, 90028, 90029, 90031, 90032, 90033, 90034, 90035, 90036, 90037, 90038, 90039, 90041, 90042, 90043, 90044, 90045, 90046, 90047, 90048, 90049, 90056, 90057, 90058, 90059, 90061, 90062, 90063, 90064, 90065, 90066, 90067, 90068, 90069, 90071, 90073, 90077, 90089, 90090, 90094, 90095, 90210, 90211, 90212, 90230, 90232, 90247, 90248, 90272, 90275, 90291, 90292, 90293, 90301, 90302, 90402, 90405, 90501, 90502, 90710, 90717, 90731, 90732, 90744, 90745, 90810, 91040, 91042, 91214, 91303, 91304, 91306, 91307, 91311, 91316, 91324, 91325, 91326, 91330, 91331, 91335, 91340, 91342, 91343, 91344, 91345, 91352, 91356, 91364, 91367, 91401, 91402, 91403, 91405, 91406, 91411, 91423, 91436, 91504, 91505, 91601, 91602, 91604, 91605, 91606, 91607, 91608, 7086, 20731, 45814, 24883, 15295, 23099, 22163, 24461, 6351, 40047, 16028, 22822, 9500, 22859, 23201, 19959, 21000, 25806, 12917, 11767, 12791, 9221, 14967, 26850, 21314, 36956, 14556, 12233, 12986, 19864, 17239, 10843, 17399, 35476, 13786, 10083, 9214, 18613, 22267, 46012, 26956, 14627, 24949, 10958, 10238, 294, 21961, 3037, 15441, 12432, 20470, 3673, 12165, 14869, 14516, 2299, 10013, 2458, 1862, 473, 1755, 3772, 691, 1418, 200, 1689, 813, 181, 3553, 912, 3807, 2693, 5017, 558, 18841, 5872, 3890, 573, 322, 592, 331, 5795, 1119, 7323, 667, 29879, 5318, 22950, 110, 144, 5919, 7465, 135, 17871, 14904, 16814, 6811, 14521, 9806, 17144, 9857, 7175, 647, 32723, 25985, 3182, 29040, 23947, 18500, 7194, 19031, 12226, 11075, 16189, 17455, 26382, 9818, 22774, 22124, 12000, 14122, 4965, 566, 522, 20601, 7204, 14051, 24116, 18162, 11174, 478),ncol=2)
colnames(data) <- c("ZCTA5CE10", "cnt" )
data <- as.data.frame(data)
LA@data <- cbind(LA@data, cnt=data$cnt)
# Suppose plot by
ggplot(data = LA, aes(x = long, y = lat, fill = cnt, group = group)) +
geom_polygon(colour = "black") +
coord_equal() + coord_fixed(1.3)
But I got "Regions defined for each Polygons Error: Aesthetics must be either length 1 or the same as the data (6877): x, y, fill, group". I guess it might be something wrong when I select those polygons and/or adding cnt.
Thanks for your help and comment!!!
You may need to install
ggplot2
from GitHub for this. Probably not, but I'm always on the dev version of it and can't really downgrade for this example.Switch to
sf
:It should have read that in much faster.
Now we need your data. First the zips:
The
ZCTA5CE10
field is the zip (IIRC) so we filter those out:And, then we make your zip-fill data:
Now, it's just a matter of joining that data to the
sf
object and plotting it: