I am using R to filter Tropical cyclone tracks passing over a grid box. I have a csv file containing the tracks and convert them to a shapefile.
I wanted to filter only the points with the same identifier (the "SN" column in the sample data below) that passed over a specified grid box (5N to 25N and 115E to 135 E). Below is the code that I am using and the link to the data.
jtwc <- read.csv("1979-1993_TC.csv",header=T,sep=",")
latmin <-5.00
latmax <- 25.00
lonmin <- 115.00
lonmax <- 135.00
jtwc.unique <- unique(jtwc[jtwc$Lat >= latmin & jtwc$Lat <= latmax & jtwc$Lon >= lonmin & jtwc$Lon <= lonmax,c(1,2)])
jtwc.filter <- merge(jtwc,jtwc.unique,all.x = F,all.y = T, sort = F)
jtwc.filter$Lon <- ifelse(jtwc.filter$Lon < 0, jtwc.filter$Lon + 360, jtwc.filter$Lon)
jtwc.filter <- jtwc.filter[with(jtwc.filter,order(Year,Month,Day,Hour,CY)),]
write.table(jtwc.filter,file = "test2_jul_par_1979-1993.csv", sep = ",", row.names = F)
Problem:
This code does not work properly. When I ran the script, I still see tracks outside the box.
Can anyone suggest any way to improve this?
I'll appreciate any help.
Here's an approach that uses data.table
to do the data manipulation, then uses googleway
to plot the tracks on google maps
library(googleway)
library(data.table) ## because I like working with data.table to do data manipulation
jtwc <- read.csv("~/Downloads/1979-1993_TC.csv")
setDT(jtwc) ## set as data.table
latmin <-5.00
latmax <- 25.00
lonmin <- 115.00
lonmax <- 135.00
df_bounds <- data.frame(north = latmax, south = latmin, west = lonmin, east = lonmax)
## apply a logical column whether the point is in the box
jtwc[, inBounds := Lat >= latmin & Lat <= latmax & Lon >= lonmin & Lon <= lonmax]
## create a column that identifies if the SN at some point passes through the box
jtwc[SN %in% jtwc[inBounds == TRUE, unique(SN)], passesThroughBox := T ]
jtwc[is.na(passesThroughBox), passesThroughBox := F]
## adding a colour for plotting
jtwc[, colour := ifelse(passesThroughBox, "#4286F4", "#F44141") ]
## you need a google maps API key to plot on Google Maps
map_key <- 'your_api_key'
google_map(key = map_key) %>%
add_polylines(data = jtwc, lat = "Lat", lon = "Lon", id = "SN", stroke_colour = 'colour',
mouse_over_group = 'passesThroughBox') %>%
add_rectangles(data = df_bounds, north = 'north', south = 'south', west = 'west', east = 'east',
fill_opacity = 0.1)
Then when hovering over the lines, you can see the ones that pass through
To filter you can use
library(dplyr)
dat %>%
filter(Lat>=5 & Lat <=25 & Lon>=115 & Lon<135)
conversely if you want to maintain the original data frame you could use
dat %>%
mutate(boxed = ifelse(Lat>=5 & Lat <=25 & Lon>=115 & Lon<135, 1,0))
If you want to plot the tracks
library(ggplot2)
dat %>%
mutate(boxed = ifelse(Lat>=5 & Lat <=25 & Lon>=115 & Lon<135, 1,0)) %>%
ggplot(aes(Lon,Lat, color=factor(boxed)))+geom_point()
This code corrects the base issue, but I don't know if it fully solves your problem. Your original code seems to assume that the combination of CY and SN are unique in the dataset, which I believe is untrue. There must be combinations with different measurements for the same pair. This version saves the bounded
values and then merges against this bounded
table
library(assertthat)
jtwc <- read.csv("~/Downloads/1979-1993_TC.csv", header=T, sep=",")
latmin <-5.00
latmax <- 25.00
lonmin <- 115.00
lonmax <- 135.00
# adjust for negative lat
jtwc$Lon <- ifelse(jtwc$Lon < 0, jtwc$Lon + 360, jtwc$Lon)
# derive the bounded points
jtwc.bounded <- jtwc[jtwc$Lat >= latmin & jtwc$Lat <= latmax & jtwc$Lon >= lonmin & jtwc$Lon <= lonmax,]
# all these are TRUE
assert_that (all(jtwc.bounded$Lat >= latmin))
assert_that (all(jtwc.bounded$Lat <= latmax))
assert_that (all(jtwc.bounded$Lon >= lonmin))
assert_that (all(jtwc.bounded$Lon <= lonmax))
jtwc.unique <- unique(jtwc.bounded[,c(1,2)])
# merge with bounded (
jtwc.filter <- merge(jtwc.bounded, jtwc.unique, all.x = F, all.y = T, sort = F)
assert_that (all(jtwc.filter$Lat >= latmin))
assert_that (all(jtwc.filter$Lat <= latmax))
assert_that (all(jtwc.filter$Lon >= lonmin))
assert_that (all(jtwc.filter$Lon <= lonmax))
jtwc.filter <- jtwc.filter[with(jtwc.filter,order(Year,Month,Day,Hour,CY)),]
write.table(jtwc.filter,file = "test2_jul_par_1979-1993.csv", sep = ",", row.names = F)