I recently ran into a problem handling dates in R. The last day of 2015 (2015-12-31) falls on a Thursday, so the last week of the year contains only 5 days if I treat Sunday as the first day of the week. I would like 2016-01-01 and 2016-01-02, which fall on Friday and Saturday, to be assigned to week 53, and week 1 to start on 2016-01-03, which is a Sunday.
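library(lubridate)
# Ten consecutive days spanning the 2015/2016 year boundary
range <- seq(as.Date('2015-12-26'), by = 1, length.out = 10)
df <- data.frame(range)
# %U is the Sunday-based week of the year (00-53); add 1 to make it 1-based
df$WKN <- as.numeric(strftime(df$range, format = "%U")) + 1
df$weekday <- weekdays(df$range)
# wday() returns the day of the week as a number, with 1 = Sunday
df$weeknum <- wday(df$range)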
This would give me the following result:
df:
range WKN weekday weeknum
2015-12-26 52 Saturday 7
2015-12-27 53 Sunday 1
2015-12-28 53 Monday 2
2015-12-29 53 Tuesday 3
2015-12-30 53 Wednesday 4
2015-12-31 53 Thursday 5
2016-01-01 1 Friday 6
2016-01-02 1 Saturday 7
2016-01-03 2 Sunday 1
2016-01-04 2 Monday 2
Now I would like to have my dataframe as follows:
df:
range WKN weekday weeknum
2015-12-26 52 Saturday 7
2015-12-27 53 Sunday 1
2015-12-28 53 Monday 2
2015-12-29 53 Tuesday 3
2015-12-30 53 Wednesday 4
2015-12-31 53 Thursday 5
2016-01-01 53 Friday 6
2016-01-02 53 Saturday 7
2016-01-03 1 Sunday 1
2016-01-04 1 Monday 2
Could anyone point me in the right direction to automate this, so that I don't have to change the code every year?
If you check out ?strptime, there are a few different week number tokens available for use with format. Here %V (the ISO 8601 week number) almost works, except that it starts the week on Monday, so add one day to each date to adjust:
df$WKN <- as.integer(format(df$range + 1, '%V'))
df
## range WKN weekday weeknum
## 1 2015-12-26 52 Saturday 7
## 2 2015-12-27 53 Sunday 1
## 3 2015-12-28 53 Monday 2
## 4 2015-12-29 53 Tuesday 3
## 5 2015-12-30 53 Wednesday 4
## 6 2015-12-31 53 Thursday 5
## 7 2016-01-01 53 Friday 6
## 8 2016-01-02 53 Saturday 7
## 9 2016-01-03 1 Sunday 1
## 10 2016-01-04 1 Monday 2
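To reuse the adjustment for any year, the same expression can be wrapped in a small helper. This is only a minimal sketch: the name sunday_week is made up for illustration, and it assumes a platform where format() supports %V on output. Because it borrows the ISO 8601 rule, week 1 is not always the week of the first Sunday in January (2013-12-29, for example, is numbered week 1), so it is worth checking the year boundaries you care about.

```r
# Hypothetical helper (name not from the original answer): Sunday-based week
# number obtained by shifting each date one day and asking for the ISO week.
sunday_week <- function(d) {
  as.integer(format(as.Date(d) + 1, "%V"))
}

# A few dates around several year boundaries
boundary <- as.Date(c("2015-12-26", "2015-12-27", "2016-01-02", "2016-01-03",
                      "2016-12-31", "2017-01-01", "2013-12-29"))
data.frame(date = boundary, WKN = sunday_week(boundary))
```

The helper is vectorised, so it can be used directly as df$WKN <- sunday_week(df$range).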
Or if you're using dplyr like the tag suggests,
library(dplyr)
df %>% mutate(WKN = as.integer(format(range + 1, '%V')))
which returns the same thing. The isoweek function of lubridate is equivalent, so you could also do
library(lubridate)
df$WKN <- isoweek(df$range + 1)
or
df %>% mutate(WKN = isoweek(range + 1))
both of which return identical results to the as.integer(format(...)) versions.
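We can use cumsum on a logical vector that marks each Sunday, and use the result to index the distinct week numbers from the original WKN column: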
df$WKN <- unique(df$WKN)[cumsum(df$weeknum==1) +1]
df$WKN
#[1] 52 53 53 53 53 53 53 53 1 1
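To see why the indexing works, the intermediate vectors can be inspected one at a time. The sketch below rebuilds the question's columns in base R and stores the result in a new column. Two things here are assumptions made only for illustration: as.POSIXlt()$wday + 1 stands in for lubridate's wday() so the example needs no packages, and WKN2 is a made-up column name.

```r
range <- seq(as.Date("2015-12-26"), by = 1, length.out = 10)
df <- data.frame(range)
df$WKN <- as.numeric(format(df$range, "%U")) + 1
# 1 = Sunday ... 7 = Saturday, the same coding as the question's weeknum column
df$weeknum <- as.POSIXlt(df$range)$wday + 1

weeks <- unique(df$WKN)             # distinct week labels, in order of appearance
idx <- cumsum(df$weeknum == 1) + 1  # increases by one at every Sunday
df$WKN2 <- weeks[idx]               # shift each label to start on its Sunday
df$WKN2
```

The trick assumes each week label appears only once in the data, so it is probably best suited to a short window around a single year boundary rather than a range covering several years.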
Considering that you are using lubridate, I also wanted to give you a lubridate solution. You also asked for a solution that works with other years. Here goes:
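adjust_first_week <- function(year) {
  # January 1 of the requested year
  first <- floor_date(dmy(paste0("1-1-", year)), "year")
  # The 7 days before January 1, then January 1 through January 7
  two_weeks <- c(first - days(7:1), first + days(0:6))
  df <- data.frame(date = two_weeks,
                   day_of_week = weekdays(two_weeks),
                   day_of_year = yday(two_weeks),
                   week_of_year = week(two_weeks))
  # Row just before the second Sunday (weekdays() is locale-dependent;
  # English day names are assumed here)
  last_weekend <- which(df$day_of_week == "Sunday")[2] - 1
  df$adjust_week <- df$week_of_year
  # January 1 is itself a Sunday, so there is nothing to adjust
  if (last_weekend == 7) return(df)
  # Rows 8 to last_weekend are the January days before the first Sunday
  df$adjust_week[8:last_weekend] <- 53
  df
}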
- Takes a numeric year and finds the first day of that year.
- Creates a two-week period from the 7 days before 1/1/year and the 7 days starting on 1/1/year.
- Calculates the weekday, day of the year, and week of the year for each date, for your edification.
- Picks out the second Sunday. By design, 1/1/year is always the 8th entry.
- If 1/1/year is a Sunday, it doesn't change anything.
- Otherwise it overwrites the week of the year for the days before that Sunday with 53, so that the first week of the year starts on the second Sunday in the table.
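The check for where week 1 begins can also be done without weekday names, which avoids depending on the locale that weekdays() uses. The following is a minimal base R sketch and not part of the function above; days_before_first_sunday is a hypothetical helper name.

```r
# Hypothetical helper: how many days of January fall before the first Sunday
days_before_first_sunday <- function(year) {
  jan1 <- as.Date(sprintf("%d-01-01", as.integer(year)))
  # as.POSIXlt()$wday is 0 for Sunday, 1 for Monday, ..., 6 for Saturday
  wd <- as.POSIXlt(jan1)$wday
  (7 - wd) %% 7
}

days_before_first_sunday(2016)  # January 1, 2016 is a Friday
days_before_first_sunday(2017)  # January 1, 2017 is a Sunday
```

A result of 0 corresponds to the case where the function returns the table unchanged; otherwise that many rows, starting at row 8, are the ones that get overwritten.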
Here are the results for
adjust_first_week(2016)
date day_of_week day_of_year week_of_year adjust_week
1 2015-12-25 Friday 359 52 52
2 2015-12-26 Saturday 360 52 52
3 2015-12-27 Sunday 361 52 52
4 2015-12-28 Monday 362 52 52
5 2015-12-29 Tuesday 363 52 52
6 2015-12-30 Wednesday 364 52 52
7 2015-12-31 Thursday 365 53 53
8 2016-01-01 Friday 1 1 53
9 2016-01-02 Saturday 2 1 53
10 2016-01-03 Sunday 3 1 1
11 2016-01-04 Monday 4 1 1
12 2016-01-05 Tuesday 5 1 1
13 2016-01-06 Wednesday 6 1 1
14 2016-01-07 Thursday 7 1 1