Converting Factor to Date in R

2019-09-04 21:49发布

问题:

I have a dataset imported from a large group of .csv file. The date imports as a factor, but the data is in the following format

,    11,   4480, - 4570,NE,  12525,LB, ,    10,      ,  ,    ,    0, 7:26A,26OC11,        
 ,    11,   7090, - 7290,NE,   5250,LB, ,     9,      ,  ,    ,    0, 7:28A,26OC11,        
 ,    11,   5050, - 5065,NE,     50,LB, ,     7,      ,  ,    ,    0, 7:31A,26OC11,        
 ,    12,   5440, - 5530,NE,  13225,LB, ,     6,      ,  ,    ,    0, 8:10A,26OC11,        
 ,    12,   1020, - 1220,NE,  12020,LB, ,    14,      ,  ,    ,    0, 8:12A,26OC11,        
 ,    12,     50, -   25,NE,  12040,LB, ,    15,      ,  ,    ,    0, 8:13A,26OC11,      
4

For example would be 26 Oct 2011. How would I convert these factors to a date and the time to a time. I need to be able to use the time to generate a time interval between records.

回答1:

Are you sure there are only two letters for the month? That doesn't make any sense!, how do you tell between JUNE and JULY?. If you can get three letters you could do something simple like this.

as.Date(as.character(mydata$mydate), format = '%d%b%y')

You could also use levels()[] instead of as.character(), but this should be simpler for now

Now if you also want the time. You can put it all together with this command

as.POSIXct(strptime(paste(as.character(mydata$mydate), paste(as.character(mydata$mytime), "M", sep = "")), "%d%b%y %I:%M%p"))

You have to be specially careful with the format. You can see a list of what %I, %d and so, means... here http://stat.ethz.ch/R-manual/R-devel/library/base/html/strptime.html



回答2:

a <- c("26OC11", "01JA12")
month.abb.2 <- toupper(substr(month.abb, 0, 2))
for (i in seq_along(month.abb.2))
  a <- sub(month.abb.2[i], month.abb[i], a)
as.Date(a, format="%d%b%y")
# [1] "2011-10-26" "2012-01-01"

However it would be interesting to see how Jul & Jun differ when you got only 2 characters for the month name. Looks unusual.



回答3:

As mentioned , It is unsual to get 2 letters for a month, but you can add the missing letter using some regular expressions. Then you use dmy from lubridate to convert dates. Here I am using gsubfn.

library(lubridate)
library(gsubfn)
dmy(gsubfn("OC|JA",list(OC="OCT",JA="JAN"),   ## You can extend here for other months
   c("26OC11","26JA12")))

[1] "2011-10-26 UTC" "2012-01-26 UTC"


回答4:

This is how I ended up creating the date i needed

Day<-substring(Date,1,2)
  Month<-substring(Date,3,4)
  Year<-substring(Date,5,6)
  Month<-replace(Month,Month=="AU",8)
  Month<-replace(Month,Month=="JA",1)
  Month<-replace(Month,Month=="FE",2)
  Month<-replace(Month,Month=="MR",3)
  Month<-replace(Month,Month=="AP",4)
  Month<-replace(Month,Month=="MY",5)
  Month<-replace(Month,Month=="JN",6)
  Month<-replace(Month,Month=="JL",7)
  Month<-replace(Month,Month=="SE",9)
  Month<-replace(Month,Month=="OC",10)
  Month<-replace(Month,Month=="NO",11)
  Month<-replace(Month,Month=="DE",12)
  Date2 <- as.Date( paste( Month , Day , Year, sep = "." )  , format = "%m.%d.%y" )
  dataset$Day<-Day
  dataset$Month<-Month
  dataset$Year<-Year
  dataset$Date2<-Date2
  Weekday<-weekdays(Date2)
  dataset$Weekday<-as.factor(Weekday)

Thanks for all the help



标签: r date r-factor