How to plot large time series (thousands of admini

Posted 2019-07-13 15:26

Question:

I'm trying to plot how a single drug has been prescribed in the hospital. In this dummy database I have 1000 patient encounters after 2017/01/01.

The goal of plotting is to see the pattern of administration of this drug: is it given more frequently, or at higher doses, closer to admission, closer to discharge, or in the middle of the patient's stay?

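# Generate N random timestamps between st and et (reused several times below)
gen_random_dates <- function(N, st, et) {
  # as.Date() drops the time of day, so both bounds fall at the start of a day
  st <- as.POSIXct(as.Date(st))
  et <- as.POSIXct(as.Date(et))
  dt <- as.numeric(difftime(et, st, units = "secs"))  # window length in seconds
  ev <- runif(N, 0, dt)                               # random offsets into the window
  rt <- st + ev
  return(rt)
}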

#Generate admission and discharge dates
admission <- gen_random_dates(1000, "2017/01/01", "2017/01/10")
discharge <- gen_random_dates(1000, "2017/01/11", "2017/01/20")
patient <- sort(sample(1:1000, 1000))
patient_data <- data.frame(patient_ID = patient, admission_date = admission, discharge_date = discharge)

#Grow the database
patient_data <- patient_data[sort(sample(1000, 100000, replace=TRUE)), ] 

#Medication admin date and dose
patient_data$admin_date <- gen_random_dates(100000, patient_data$admission_date, patient_data$discharge_date)
patient_data$admin_dose <- abs(as.integer(rnorm(100000, 50, 100)))

I tried this ggplot function but it did not help me visualize the pattern.

library(ggplot2)

ggplot(patient_data, aes(x = admin_date, y = admin_dose)) +
  xlab("Use of Drug in Patient Encounters") + ylab("Dose (mg)") +
  geom_jitter()

Answer 1:

If a browser is an acceptable target, one option is ggplotly, which adds panning and zooming to a ggplot; that helps with a time series containing a lot of data. (Disclaimer: I'm a plotly.js maintainer.) Besides this, there is a regular R API to plotly.js. Plotly can visualize a large number of points or lines, not only because of zoom/pan but also because some plot types are backed by WebGL, which can be much faster.
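As a minimal sketch of the options mentioned above, the following wraps a ggplot in `ggplotly()`, switches it to WebGL rendering with `toWebGL()`, and builds a comparable chart directly with `plot_ly()`. The `demo` data frame is a small hypothetical stand-in for `patient_data` (same column names, made-up values), and the `ggplot2` and `plotly` packages are assumed to be installed.

```r
library(ggplot2)
library(plotly)

set.seed(1)  # reproducible dummy values

# Hypothetical stand-in for patient_data: 2000 administrations over 19 days
demo <- data.frame(
  admin_date = as.POSIXct("2017-01-01", tz = "UTC") + runif(2000, 0, 19 * 86400),
  admin_dose = abs(as.integer(rnorm(2000, 50, 100)))
)

p <- ggplot(demo, aes(x = admin_date, y = admin_dose)) +
  xlab("Use of Drug in Patient Encounters") + ylab("Dose (mg)") +
  geom_point(alpha = 0.3)

# Option 1: convert the ggplot into an interactive plotly widget (pan/zoom)
interactive_plot <- ggplotly(p)

# Option 2: same widget, but rendered with WebGL traces
webgl_plot <- toWebGL(interactive_plot)

# Option 3: use the plotly R API directly with a WebGL scatter trace
direct_plot <- plot_ly(demo, x = ~admin_date, y = ~admin_dose,
                       type = "scattergl", mode = "markers")
```

Printing any of the three objects in an interactive session (for example `webgl_plot`) opens the chart in the viewer or browser. Whether WebGL is noticeably faster depends on the number of points and the machine.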



Answer 2:

I would suggest using facets to view a handful of patients at once. This doesn't scale well for thousands of patients, but it could help you look at 10-20 at a time. ggplotly works pretty well with facets, too.

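library(ggplot2)

# Facet on the patient_ID column; keep only a handful of patients per plot
# (the cutoff of 12 is arbitrary)
ggplot(subset(patient_data, patient_ID <= 12), aes(x = admin_date, y = admin_dose)) +
  xlab("Use of Drug in Patient Encounters") + ylab("Dose (mg)") +
  geom_jitter() +
  facet_wrap(~patient_ID)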
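Neither answer addresses the question's stated goal directly, which is where in the stay the drug is given. One possible approach, offered only as a sketch, is to rescale each administration time to the fraction of the stay elapsed (0 = admission, 1 = discharge) and summarise by bin. The `admins` data frame, the `stay_fraction` and `stay_bin` columns, and the choice of ten bins are all assumptions made for illustration.

```r
library(ggplot2)

set.seed(42)  # reproducible dummy values

# Hypothetical stand-in for patient_data: 50 stays, 20 administrations each
n_stays <- 50
stays <- data.frame(
  patient_ID     = seq_len(n_stays),
  admission_date = as.POSIXct("2017-01-01", tz = "UTC") + runif(n_stays, 0, 9 * 86400),
  stay_secs      = runif(n_stays, 1, 10) * 86400
)
stays$discharge_date <- stays$admission_date + stays$stay_secs
admins <- stays[rep(seq_len(n_stays), each = 20), ]
admins$admin_date <- admins$admission_date + runif(nrow(admins)) * admins$stay_secs
admins$admin_dose <- abs(as.integer(rnorm(nrow(admins), 50, 100)))

# Fraction of the stay elapsed at each administration
elapsed <- as.numeric(difftime(admins$admin_date, admins$admission_date, units = "secs"))
total   <- as.numeric(difftime(admins$discharge_date, admins$admission_date, units = "secs"))
# Clamp to [0, 1] to guard against floating-point rounding at the edges
admins$stay_fraction <- pmin(pmax(elapsed / total, 0), 1)

# Split the stay into tenths, then count administrations and average dose per bin
admins$stay_bin <- cut(admins$stay_fraction, breaks = seq(0, 1, by = 0.1),
                       include.lowest = TRUE)
bin_summary <- data.frame(
  stay_bin  = factor(levels(admins$stay_bin), levels = levels(admins$stay_bin)),
  n_admin   = as.integer(table(admins$stay_bin)),
  mean_dose = as.numeric(tapply(admins$admin_dose, admins$stay_bin, mean))
)

# Bar height = number of administrations, fill = mean dose in that part of the stay
p <- ggplot(bin_summary, aes(x = stay_bin, y = n_admin, fill = mean_dose)) +
  geom_col() +
  xlab("Fraction of stay elapsed") + ylab("Number of administrations")
```

With the uniformly random dummy data the bars come out roughly flat; real prescribing data would show a skew toward admission or discharge if one exists.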