How does the ts()
function use its frequency
parameter? What is the effect of assigning wrong values as frequency
?
I am trying to use 1.5 years of website usage data to build a time series model so that I can forecast the usage for coming periods. I am using data at daily level. What should be the frequency
here - 7 or 365 or 365.25?
The frequency
is "the" period at which seasonal cycles repeat. I use "the" in scare quotes since, of course, there are often multiple cycles in time series data. For instance, daily data often exhibit weekly patterns (a frequency of 7) and yearly patterns (a frequency of 365 or 365.25 - the difference often does not matter).
In your case, I would assume that weekly patterns dominate, so I would assign frequency=7
. If your data exhibits additional patterns, e.g., holiday effects, you can use specialized methods accounting for multiple seasonalities, or work with dummy coding and a regression-based framework.
Here, the frequency
parameter is not a frequency that you can observe in the data of your time series. Instead, you have to specify the frequency at which samples of the time series were taken. In your case, this is simply 1 day, or 1
.
The value you give here will influence the results you get later when running analysis operations (examples are average requests per time unit or fourier transformation to get the (real) frequencies in the data). E.g. if you wanted to get all your results in the unit of hours instead of in days, you would pass 24
instead of 1
as frequency
, because your data samples were taken in a frequency of 24 hours.