How to create a new “time difference” column from

Posted 2020-03-27 04:53

I have been given a table of data to analyse. I have a column with start times and a column with finish times, given as 3- or 4-digit numbers; for example, 3:40pm is 1540 and 7:25am is 725.

How can I obtain a new column in my table with the difference of these times, given in minutes? There are over 2000 entries.

Thanks for any help.

structure(list(schedtime = c(1455L, 1640L, 1245L, 1715L, 1039L, 
840L), deptime = c(1455L, 1640L, 1245L, 1709L, 1035L, 839L), 
distance = c(184L, 213L, 229L, 229L, 229L, 228L), flightnumber = 
c(5935L, 
6155L, 7208L, 7215L, 7792L, 7800L), weather = c(0L, 0L, 0L, 
0L, 0L, 0L), dayweek = c(4L, 4L, 4L, 4L, 4L, 4L), daymonth = c(1L, 
1L, 1L, 1L, 1L, 1L)), row.names = c(NA, 6L), class = "data.frame")

This is an example of the dataset (I'm unsure as to why there are L's after each number; these are not shown in the table). I want deptime (finish) - schedtime (start), in minutes.

Given the new column, there are 2 values with a schedtime before midnight and a deptime after midnight, for example schedtime 2120 and deptime 0010. The answer to this is given as -1270, considering it to be an extremely early departure. How could I change this to be calculated as +170, a late departure?

Tags: r time
4 answers
一纸荒年 Trace。
#2 · 2020-03-27 05:11

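An efficient way for a larger dataset (minutes() is from lubridate; this subtracts the raw HHMM integers, so it is only correct when both times fall within the same hour):

data.table::setDT(dt)[, time_diff := lubridate::minutes(deptime - schedtime)]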

> dt
   schedtime deptime distance flightnumber weather dayweek daymonth time_diff
1:      1455    1455      184         5935       0       4        1        0S
2:      1640    1640      213         6155       0       4        1        0S
3:      1245    1245      229         7208       0       4        1        0S
4:      1715    1709      229         7215       0       4        1    -6M 0S
5:      1039    1035      229         7792       0       4        1    -4M 0S
6:       840     839      228         7800       0       4        1    -1M 0S

EDIT: to handle cases like 1730 - 1600, where plain subtraction gives 130 but the real difference is 90 minutes:

library(data.table)
library(stringr)
setDT(dt)
dt[,schedtime:=str_pad(schedtime, 4, pad = "0")]
dt[,deptime:=str_pad(deptime, 4, pad = "0")]

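# difftime(a, b) returns a - b, so this gives schedtime - deptime;
# swap the two arguments to get deptime - schedtime instead
dt[, time_diff := difftime(
  as.ITime(strptime(x = schedtime, format = "%H%M")),
  as.ITime(strptime(x = deptime, format = "%H%M")),
  units = "mins"
)]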


> dt
   schedtime deptime distance flightnumber weather dayweek daymonth time_diff
1:      1455    1455      184         5935       0       4        1    0 mins
2:      1640    1640      213         6155       0       4        1    0 mins
3:      1245    1245      229         7208       0       4        1    0 mins
4:      1715    1709      229         7215       0       4        1    6 mins
5:      1039    1035      229         7792       0       4        1    4 mins
6:      1730    1600      228         7800       0       4        1   90 mins
chillily
#3 · 2020-03-27 05:16
dat <- data.frame(c(1540,1820,1330,545,100),c(1850,2150,2325,1330,101))
60*(floor(dat[,2]/100) - floor(dat[,1]/100)) - dat[,1] %% 100 + dat[,2] %% 100

Taking the floor of each value divided by 100 gives the hours. Taking the difference of the hours and multiplying by 60 gives the minutes contributed by the hours. Then you subtract the start minutes and add the finish minutes to get the total minutes passed.
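
The same arithmetic can be applied to columns named as in the question, and extended to the after-midnight case the question asks about. This is only a minimal sketch: the data frame df below is a small illustrative sample (not the asker's full data), hhmm_to_min and delay_min are hypothetical names, and the rule "anything more than 12 hours early is really a departure after midnight" is an assumption that should be checked against the real data.

```r
# Hypothetical helper: convert HHMM integers (e.g. 1540, 725) to minutes since midnight
hhmm_to_min <- function(x) (x %/% 100) * 60 + x %% 100

# Small illustrative sample; the last row crosses midnight (2120 -> 0010)
df <- data.frame(
  schedtime = c(1455L, 1715L, 1600L, 2120L),
  deptime   = c(1455L, 1709L, 1730L, 10L)
)

# Departure minus schedule, in minutes
df$delay_min <- hhmm_to_min(df$deptime) - hhmm_to_min(df$schedtime)

# Assumption: a departure more than 12 hours "early" actually happened
# after midnight, so add one day (1440 minutes) to those rows
wrapped <- df$delay_min < -720
df$delay_min[wrapped] <- df$delay_min[wrapped] + 1440

df$delay_min
# 0, -6, 90, 170
```

The 720-minute cutoff is arbitrary; if genuinely early departures of that size are possible, a date column would be needed to tell the two cases apart.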

Ridiculous、
#4 · 2020-03-27 05:23

You can use the lubridate library to find the difference in minutes. lubridate provides very good functionality for time-related data. Hope this helps.

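library(lubridate)
# Convert HHMM integers to minutes since midnight before subtracting
to_min <- function(x) (x %/% 100) * 60 + x %% 100
df$deptime_new <- minutes(to_min(df$deptime) - to_min(df$schedtime))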

Data

df <- structure(list(
  schedtime = c(1455L, 1640L, 1245L, 1715L, 1039L, 840L),
  deptime = c(1455L, 1640L, 1245L, 1709L, 1035L, 839L),
  distance = c(184L, 213L, 229L, 229L, 229L, 228L),
  flightnumber = c(5935L, 6155L, 7208L, 7215L, 7792L, 7800L),
  weather = c(0L, 0L, 0L, 0L, 0L, 0L),
  dayweek = c(4L, 4L, 4L, 4L, 4L, 4L),
  daymonth = c(1L, 1L, 1L, 1L, 1L, 1L)
), row.names = c(NA, 6L), class = "data.frame")
Ridiculous、
#5 · 2020-03-27 05:27

I have the same question. Is there a way to calculate the difference between the times in two columns and display the results, in minutes, in a new column?
