I have a dataset imported from a large group of .csv file. The date imports as a factor, but the data is in the following format
, 11, 4480, - 4570,NE, 12525,LB, , 10, , , , 0, 7:26A,26OC11,
, 11, 7090, - 7290,NE, 5250,LB, , 9, , , , 0, 7:28A,26OC11,
, 11, 5050, - 5065,NE, 50,LB, , 7, , , , 0, 7:31A,26OC11,
, 12, 5440, - 5530,NE, 13225,LB, , 6, , , , 0, 8:10A,26OC11,
, 12, 1020, - 1220,NE, 12020,LB, , 14, , , , 0, 8:12A,26OC11,
, 12, 50, - 25,NE, 12040,LB, , 15, , , , 0, 8:13A,26OC11,
4
For example would be 26 Oct 2011. How would I convert these factors to a date and the time to a time. I need to be able to use the time to generate a time interval between records.
Are you sure there are only two letters for the month? That doesn't make any sense!, how do you tell between JUNE and JULY?. If you can get three letters you could do something simple like this.
You could also use levels()[] instead of as.character(), but this should be simpler for now
Now if you also want the time. You can put it all together with this command
You have to be specially careful with the format. You can see a list of what %I, %d and so, means... here http://stat.ethz.ch/R-manual/R-devel/library/base/html/strptime.html
This is how I ended up creating the date i needed
Thanks for all the help
As mentioned , It is unsual to get 2 letters for a month, but you can add the missing letter using some regular expressions. Then you use
dmy
fromlubridate
to convert dates. Here I am usinggsubfn
.However it would be interesting to see how Jul & Jun differ when you got only 2 characters for the month name. Looks unusual.