Converting year and month (“yyyy-mm” format) to a

2018-12-31 01:34发布

I have a dataset that looks like this:

Month    count
2009-01  12
2009-02  310
2009-03  2379
2009-04  234
2009-05  14
2009-08  1
2009-09  34
2009-10  2386

I want to plot the data (months as x values and counts as y values). Since there are gaps in the data, I want to convert the Information for the Month into a date. I tried:

as.Date("2009-03", "%Y-%m")

But it did not work. Whats wrong? It seems that as.Date() requires also a day and is not able to set a standard value for the day? Which function solves my problem?

7条回答
孤独寂梦人
2楼-- · 2018-12-31 02:25

Indeed, as has been mentioned above (and elsewhere on SO), in order to convert the string to a date, you need a specific date of the month. From the as.Date() manual page:

If the date string does not specify the date completely, the returned answer may be system-specific. The most common behaviour is to assume that a missing year, month or day is the current one. If it specifies a date incorrectly, reliable implementations will give an error and the date is reported as NA. Unfortunately some common implementations (such as glibc) are unreliable and guess at the intended meaning.

A simple solution would be to paste the date "01" to each date and use strptime() to indicate it as the first day of that month.


For those seeking a little more background on processing dates and times in R:

In R, times use POSIXct and POSIXlt classes and dates use the Date class.

Dates are stored as the number of days since January 1st, 1970 and times are stored as the number of seconds since January 1st, 1970.

So, for example:

d <- as.Date("1971-01-01")
unclass(d)  # one year after 1970-01-01
# [1] 365

pct <- Sys.time()  # in POSIXct
unclass(pct)  # number of seconds since 1970-01-01
# [1] 1450276559
plt <- as.POSIXlt(pct)
up <- unclass(plt)  # up is now a list containing the components of time
names(up)
# [1] "sec"    "min"    "hour"   "mday"   "mon"    "year"   "wday"   "yday"   "isdst"  "zone"  
# [11] "gmtoff"
up$hour
# [1] 9

To perform operations on dates and times:

plt - as.POSIXlt(d)
# Time difference of 16420.61 days

And to process dates, you can use strptime() (borrowing these examples from the manual page):

strptime("20/2/06 11:16:16.683", "%d/%m/%y %H:%M:%OS")
# [1] "2006-02-20 11:16:16 EST"

# And in vectorized form:
dates <- c("1jan1960", "2jan1960", "31mar1960", "30jul1960")
strptime(dates, "%d%b%Y")
# [1] "1960-01-01 EST" "1960-01-02 EST" "1960-03-31 EST" "1960-07-30 EDT"
查看更多
登录 后发表回答