I am looking to calculate occupancy in an emergency department (ED) with tidyverse. Occupancy here means: admitted but did not leave the hospital within the same hour of admission. For example, if I arrived at the ED at 12:00:00 and did not leave within the hour I was admitted, then I am occupying the hospital. For this I need to create a new column, Occupancy. (For context: I want to plot occupancy by hour of the day. I know how to plot it, but not how to calculate occupancy, so there is no need to address the plotting.) What I need is to learn how to calculate occupancy from the table below. Please help.
I have ID, Adm (admission time) and Disc (discharge time).
ID = c(101, 102,103, 104, 105, 106, 107)
Adm = as.POSIXct(c("2012-01-12 00:52:00", "2012-01-12 00:55:00",
                   "2012-02-12 01:35:00", "2012-02-12 03:24:00",
                   "2012-02-12 04:24:00", "2012-02-12 05:24:00",
                   "2012-02-12 05:28:00"))
Disc = as.POSIXct(c("2012-01-12 02:00:00", "2012-01-12 02:59:00",
                    "2012-01-12 03:01:00", "2012-01-12 05:01:00",
                    "2012-01-12 06:01:00", "2012-01-12 08:01:00",
                    "2012-01-12 08:01:00"))
df = data.frame(ID, Adm, Disc)
I have extracted the hour from the admission time so that I can use the new column to calculate occupancy (admitted but not discharged within the hour of admission). Again, I want to do this with the tidyverse.
library(tidyverse)
df_hour <- df %>%
  mutate(Hour_Adm = lubridate::hour(Adm))  # Adm is already POSIXct, no conversion needed
Any help is very much appreciated. Thank you.
The logic is to add 1 hour (i.e. 60*60 seconds) to the Adm time (which is of POSIXct type) and compare it with the Disc time. `first` and `last` are added for cases where an ID has multiple rows; then only the earliest Adm and the latest Disc time are considered per ID (assuming the rows are in chronological order within each ID).
library(tidyverse)
df %>%
  group_by(ID) %>%
  mutate(occupancy = ifelse(last(Disc) > first(Adm) + 60*60, 1, 0))
which gives
ID Adm Disc occupancy
<dbl> <dttm> <dttm> <dbl>
1 101 2012-01-12 00:52:00 2012-01-12 02:00:00 1.00
2 102 2012-01-12 00:55:00 2012-01-12 02:59:00 1.00
3 103 2012-02-12 01:35:00 2012-01-12 03:01:00 0
4 104 2012-02-12 03:24:00 2012-01-12 05:01:00 0
5 105 2012-02-12 04:24:00 2012-01-12 06:01:00 0
6 106 2012-02-12 05:24:00 2012-01-12 08:01:00 0
7 107 2012-02-12 05:28:00 2012-01-12 08:01:00 0
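Note that `first()` and `last()` follow row order, so they only pick the earliest admission and latest discharge when the rows are sorted in time within each ID. Below is a minimal sketch of an order-independent variant, assuming `min()` and `max()` are acceptable substitutes. The data frame `visits`, its values, and the explicit `tz = "UTC"` are invented for illustration and are not part of the question's data.

```r
library(dplyr)

# Hypothetical data: ID 201 has two rows stored out of chronological order.
visits <- data.frame(
  ID   = c(201, 201, 202),
  Adm  = as.POSIXct(c("2012-01-12 03:00:00", "2012-01-12 00:30:00",
                      "2012-01-12 04:00:00"), tz = "UTC"),
  Disc = as.POSIXct(c("2012-01-12 03:20:00", "2012-01-12 00:50:00",
                      "2012-01-12 04:40:00"), tz = "UTC")
)

occ <- visits %>%
  group_by(ID) %>%
  # min()/max() use the earliest admission and latest discharge per ID,
  # regardless of the order in which the rows are stored.
  mutate(occupancy = as.integer(max(Disc) > min(Adm) + 60 * 60)) %>%
  ungroup()

print(occ)
```

With `first(Adm)` and `last(Disc)` on the same unsorted rows, ID 201 would compare 00:50 against 04:00 and be flagged as not occupying, so sorting with `arrange(ID, Adm)` beforehand is another option.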
Sample data:
df <- structure(list(ID = c(101, 102, 103, 104, 105, 106, 107), Adm = structure(c(1326309720,
1326309900, 1328990700, 1328997240, 1329000840, 1329004440, 1329004680
), class = c("POSIXct", "POSIXt"), tzone = ""), Disc = structure(c(1326313800,
1326317340, 1326317460, 1326324660, 1326328260, 1326335460, 1326335460
), class = c("POSIXct", "POSIXt"), tzone = "")), .Names = c("ID",
"Adm", "Disc"), row.names = c(NA, -7L), class = "data.frame")
We can try
library(dplyr)
library(lubridate)
df %>% group_by(ID) %>%
  mutate(`Stay In (Hours)` = hour(Disc) - hour(Adm),
         Occupancy = ifelse(hour(Disc) - hour(Adm) > 1, 1, 0)) %>%
  ungroup()
# But notice that `hour` considers only the hour part of the time, as shown below, which may give misleading results (the actual stay here is 50 minutes):
hour(as.POSIXct(c("2012-01-12 01:40:00"))) - hour(as.POSIXct(c("2012-01-12 00:50:00")))
[1] 1
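The same pitfall becomes more visible when a stay crosses midnight. Here is a small sketch contrasting the two calculations. The timestamps, the variable names `adm`, `disc`, `hour_gap` and `true_stay`, and the explicit `tz = "UTC"` are all made up for illustration.

```r
library(lubridate)

# Hypothetical stay that crosses midnight (values invented for illustration).
adm  <- as.POSIXct("2012-01-12 23:30:00", tz = "UTC")
disc <- as.POSIXct("2012-01-13 01:10:00", tz = "UTC")

# Difference of clock hours only: 1 - 23, which is negative.
hour_gap <- hour(disc) - hour(adm)

# Elapsed time between the full timestamps, in hours (100 minutes here).
true_stay <- as.numeric(difftime(disc, adm, units = "hours"))

print(hour_gap)
print(true_stay)
```

A rule based on `hour_gap > 1` would mark this patient as not occupying, even though the elapsed stay is longer than one hour.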
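A more reliable approach, I hope, is to use difftime on the full timestamps. (In the output below, IDs 103 to 107 show Adm on 2012-01-12 and Disc on 2012-02-12, the reverse of the dates in the question's code, hence stays of about 745 hours.)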
df %>% group_by(ID) %>%
mutate(`Stay In (Hours)` = round(difftime(Disc, Adm, units='hours'),2),
Occupancy = ifelse(difftime(Disc, Adm, units='hours') > 1, 1, 0)) %>%
ungroup()
# A tibble: 7 x 5
ID Adm Disc `Stay In (Hours)` Occupancy
<dbl> <dttm> <dttm> <time> <dbl>
1 101 2012-01-12 00:52:00 2012-01-12 02:00:00 1.13 1.00
2 102 2012-01-12 00:55:00 2012-01-12 02:59:00 2.07 1.00
3 103 2012-01-12 01:35:00 2012-02-12 03:01:00 745.43 1.00
4 104 2012-01-12 03:24:00 2012-02-12 05:01:00 745.62 1.00
5 105 2012-01-12 04:24:00 2012-02-12 06:01:00 745.62 1.00
6 106 2012-01-12 05:24:00 2012-02-12 08:01:00 746.62 1.00
7 107 2012-01-12 05:28:00 2012-02-12 08:01:00 746.55 1.00
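Since the stated goal is occupancy by hour of the day, the per-patient flag can then be summarised per admission hour. This is only a sketch under assumptions: it attributes each patient to the hour of admission, and the data frame `ed`, its values, the explicit `tz = "UTC"`, and the column names `Occupied` and `Admitted` are invented for illustration.

```r
library(dplyr)
library(lubridate)

# Hypothetical ED visits (not the question's data).
ed <- data.frame(
  ID   = c(1, 2, 3, 4),
  Adm  = as.POSIXct(c("2012-01-12 00:52:00", "2012-01-12 00:55:00",
                      "2012-01-12 01:35:00", "2012-01-12 01:40:00"), tz = "UTC"),
  Disc = as.POSIXct(c("2012-01-12 02:00:00", "2012-01-12 01:20:00",
                      "2012-01-12 03:01:00", "2012-01-12 02:10:00"), tz = "UTC")
)

by_hour <- ed %>%
  mutate(
    Hour_Adm  = hour(Adm),
    # 1 if the patient stayed longer than one hour after admission, else 0
    Occupancy = as.integer(difftime(Disc, Adm, units = "hours") > 1)
  ) %>%
  group_by(Hour_Adm) %>%
  # Patients still occupying, and all admissions, per admission hour
  summarise(Occupied = sum(Occupancy), Admitted = n())

print(by_hour)
```

The resulting `by_hour` table has one row per admission hour and could be passed to a plotting function as it is.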