How can I keep midnight (00:00h) using strptime()

Posted 2020-01-29 12:20

I have a data frame, df, which has a factor variable holding dates in the following format:

2015-12-15 10:00:00
2015-12-19 12:00:00
2015-12-20 20:00:00

It is hourly data. The problem arises at midnight, 00:00:00, because the hour doesn't appear. The value just says:

21/12/2015

As you can see, it gives only the day and lacks the hour. I use strptime to convert the column to a date-time format:

df$date <- strptime(df$date,"%d/%m/%Y %H:%M")

It all works fine for all the hours and days except for any day at midnight, 00:00:00, which returns:

NA

I'd really appreciate some help. I've been looking at previous posts on Stack Overflow and other forums, but I haven't managed to figure out a solution for this specific problem yet.

2 answers
ら.Afraid
#2 · 2020-01-29 12:43

From R's strptime documentation (emphasis added):

format

A character string. The default for the format methods is "%Y-%m-%d %H:%M:%S" if any element has a time component which is not midnight, and "%Y-%m-%d" otherwise. If options("digits.secs") is set, up to the specified number of digits will be printed for seconds.

So the information is still there, you just need to format it to print it out with the time components.

> midnight <- strptime("2015-12-19 00:00:00","%Y-%m-%d %H:%M")
> midnight
[1] "2015-12-19 EST"
> format(midnight,"%Y/%m/%d %H:%M")
[1] "2015/12/19 00:00"
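
To make the point above concrete, here is a small sketch comparing a vector in which every element is midnight with one that mixes midnight and other hours. The variable names and tz = "UTC" are assumptions chosen so the result does not depend on the local time zone.

```r
# tz = "UTC" is an assumption for reproducibility, not part of the question.
all_midnight <- strptime(c("2015-12-19 00:00:00", "2015-12-20 00:00:00"),
                         "%Y-%m-%d %H:%M:%S", tz = "UTC")
mixed <- strptime(c("2015-12-19 00:00:00", "2015-12-19 10:00:00"),
                  "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Default formatting drops the time only when every element is midnight
format(all_midnight)
format(mixed)

# An explicit format always shows the time component
shown <- format(all_midnight, "%Y-%m-%d %H:%M:%S")
shown

# The parsed hour is stored as 0 either way; only the display differs
all_midnight$hour
```

So a date that prints without a time is not necessarily missing one. An NA, on the other hand, means the input string did not match the format at all.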
劫难
#3 · 2020-01-29 12:57

If we have a vector like "v1" (defined under "data" below), strptime returns NA for the elements that don't match the format

strptime(v1,  "%d/%m/%Y %H:%M:%S", tz = "UTC")
#[1] "2015-12-19 12:00:00 UTC" NA  

One way to correct this is to paste the "00:00:00" string onto the elements that don't have a time component

v1[!grepl(":", v1)] <- paste(v1[!grepl(":", v1)], "00:00:00") 
strptime(v1,  "%d/%m/%Y %H:%M:%S", tz = "UTC")
#[1] "2015-12-19 12:00:00 UTC" "2015-12-19 00:00:00 UTC"
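
Applied to a data frame like the one in the question, the same padding idea might look like the sketch below. The data frame contents are made up for illustration, the "00:00" suffix is chosen to match the question's "%d/%m/%Y %H:%M" format, and tz = "UTC" is an assumption.

```r
# Hypothetical data mirroring the question: a factor column in which the
# midnight row carries only the date.
df <- data.frame(date = factor(c("20/12/2015 23:00",
                                 "21/12/2015",
                                 "21/12/2015 01:00")))

x <- as.character(df$date)        # work on the strings, not the factor
no_time <- !grepl(":", x)         # rows that lack a time component
x[no_time] <- paste(x[no_time], "00:00")

# as.POSIXct stores one number per row, which suits a data frame column
# better than the list-based POSIXlt that strptime returns.
df$date <- as.POSIXct(x, format = "%d/%m/%Y %H:%M", tz = "UTC")

format(df$date, "%Y-%m-%d %H:%M")
```

After this step no row is NA, and the midnight row keeps its 00:00 time even though the default print method may hide it when every value in a vector is midnight.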

Or, if we use lubridate, parse_date_time can take multiple formats

library(lubridate)
parse_date_time(v1, guess_formats(v1, c("%d/%m/%Y %H:%M:%S", "%d/%m/%Y")))
#[1] "2015-12-19 12:00:00 UTC" "2015-12-19 00:00:00 UTC"
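
If adding a package is not an option, the multiple-format idea can also be sketched in base R. The helper parse_multi below is hypothetical (it is not part of base R or lubridate): it tries each format in turn and keeps the first successful parse for each element. Formats should be listed from most to least specific, because strptime ignores trailing input and a date-only format would otherwise match strings that also carry a time.

```r
# Hypothetical helper: try several strptime formats, most specific first.
parse_multi <- function(x, formats, tz = "UTC") {
  x <- as.character(x)  # also handles factor input
  out <- as.POSIXct(strptime(x, formats[1], tz = tz))
  for (fmt in formats[-1]) {
    todo <- is.na(out)  # elements that no earlier format could parse
    out[todo] <- as.POSIXct(strptime(x[todo], fmt, tz = tz))
  }
  out
}

v1 <- c("19/12/2015 12:00:00", "19/12/2015")
res <- parse_multi(v1, c("%d/%m/%Y %H:%M:%S", "%d/%m/%Y"))
format(res, "%Y-%m-%d %H:%M:%S")
```

Elements that match none of the formats stay NA, so genuinely malformed input is still visible afterwards.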

data

v1 <- c("19/12/2015 12:00:00", "19/12/2015") 