I have a vector with dates in this format (example of the first 6 rows):
Dates<-c(
"Sun Oct 04 20:33:05 EEST 2015",
"Sun Oct 04 20:49:23 EEST 2015",
"Sun Oct 04 21:05:25 EEST 2015",
"Mon Sep 28 10:02:38 IDT 2015",
"Mon Sep 28 10:17:50 IDT 2015",
"Mon Sep 28 10:39:48 IDT 2015")
I tried to parse the variable Dates in R using the as.Date() function:
as.Date(Dates,format = "%a %b %d %H:%M:%S %Z %Y")
but this failed because the %Z conversion specification is supported for output only, not for input. The time zones differ throughout the vector. How can I parse these dates correctly while respecting the time zone?
This solution requires some simplifying assumptions. If your vector has many elements, the most practical approach is to use a database of time zone offsets to convert each time to a single reference zone, such as GMT. The time zone data I used is the timezone.csv file from https://timezonedb.com/download
Depending on how exact your results need to be, take care to adjust for daylight saving time where necessary. Also, I'm no expert on time zones, but I noticed that some abbreviations map to more than one offset. In the original database, CLT (Chile time) ranges from 3 to 5 hours from GMT, presumably because the zone's rules have changed over time.
For this exercise, my code simply takes the first offset listed for each time zone abbreviation and ignores daylight saving time. This may be sufficient if your work doesn't require high precision, but you should QA and validate the results either way.
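As a rough sketch of the "take the first offset per abbreviation" step: the data frame below is a hypothetical stand-in for the downloaded table. Its column names (abbreviation, offset_hours) and its values are made up for illustration and are not the real layout of timezone.csv.

```r
# Hypothetical stand-in for the downloaded table; column names and values
# are illustrative only, not the real file layout.
tz_table <- data.frame(
  abbreviation = c("CLT", "CLT", "CLT", "EEST", "IDT"),
  offset_hours = c(-5, -4, -3, 3, 3),
  stringsAsFactors = FALSE
)

# Keep only the first row seen for each abbreviation.
first_only <- tz_table[!duplicated(tz_table$abbreviation), ]

# Named vector for quick lookup by abbreviation.
tz_offset_hours <- setNames(first_only$offset_hours, first_only$abbreviation)
```

Which row comes "first" depends entirely on how the table is sorted, so it is worth checking that the chosen offset matches the period your data covers.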
Also, note that this solution should handle date changes correctly. For example, if a time is adjusted from 1am back to 11pm, the date should roll back one day.
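A minimal sketch of the conversion, under these assumptions: the offsets are hard-coded in a small hypothetical lookup vector (tz_offset_hours) rather than read from the database, daylight saving time is ignored beyond what the abbreviation itself implies, and the third input string is invented to show the date rolling back. The month name is mapped with month.abb so that parsing does not depend on the session locale.

```r
Dates <- c(
  "Sun Oct 04 20:33:05 EEST 2015",
  "Mon Sep 28 10:02:38 IDT 2015",
  "Mon Sep 28 01:00:00 IDT 2015"  # invented example: crosses midnight in UTC
)

# Hypothetical lookup: abbreviation -> offset from UTC in hours.
tz_offset_hours <- c(EEST = 3, IDT = 3)

# Split into fields: weekday, month, day, time, zone, year.
parts <- do.call(rbind, strsplit(Dates, " ", fixed = TRUE))

# Rebuild a locale-independent timestamp without the zone abbreviation.
month_num <- match(parts[, 2], month.abb)
stamp <- sprintf("%s-%02d-%s %s", parts[, 6], month_num, parts[, 3], parts[, 4])

# Parse the local wall-clock time as if it were UTC ...
local_as_utc <- as.POSIXct(stamp, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# ... then subtract the zone's offset (in seconds) to get the true UTC time.
# Abbreviations missing from the lookup yield NA.
offset_sec <- unname(tz_offset_hours[parts[, 5]]) * 3600
utc <- local_as_utc - offset_sec

format(utc, "%Y-%m-%d %H:%M:%S", tz = "UTC")
```

Because the arithmetic is done on POSIXct values, the third element lands on the previous calendar day without any special handling.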