I think I posted similar question before. But this time I am struggling with data ID.
My data looks like
date Stock value standard_deviation
01/01/2015 VOD 18 ...
01/01/2015 VOD 15 ...
01/01/2015 VOD 5 ...
03/01/2015 VOD 66 ...
03/01/2015 VOD 7 ...
04/01/2015 VOD 19 ...
04/01/2015 VOD 7 ...
05/01/2015 VOD 3 ...
06/01/2015 VOD 7 ...
..... ... ... ...
01/01/2015 RBS 58 ...
01/01/2015 RBS 445 ...
01/01/2015 RBS 44 ...
03/01/2015 RBS 57 ...
I need to work out the moving average/std deviation for each stock based on (-3,+3) trading days.
Since those are trading days (not calendar days), and there are different number of trades in each day, I created a sub-query and applied the following code.
data want;
set input;
by date;
retain gdate;
if first.date then gdate+1;
run;
proc sort data=want; by stock gdate ; run;
proc sql;
create table want1 as
select
h.stock,
h.date,
h.value,
( select std(s.value) from want s
where h.gdate between s.gdate-2 and s.gdate+2) as std
from
want h
group by stock;
quit;
I tried group by stock
. However, the code ignored the stock group and only gave me the moving std of the whole period. I need the moving std for different stocks.
Anyone can give me some idea ? Thanks !
Let's get you familiarized with
PROC EXPAND
! It's going to be your new best friend in time series.PROC EXPAND
allows you to do basically all of the common transformations (and even ones you didn't know existed) on your data.First, to answer your question:
Step 1: Combine all the values into one trading day per stock
Step 2: Use PROC EXPAND to compute a +/- 3 day centered moving standard deviation
Step 3: Merge back to the original table
Explanation
We are exploiting the use of by-group processing and an existing procedure to do the bulk of the work for us. SAS likes doesn't normally like to look forward due to how the language was designed, and
PROC EXPAND
is one of the very few procedures that is able to look forward in data without a lot of extra work. Another bonus of this procedure is that it doesn't break if there are gaps in the time series, so you can perform operations on any kind of sequential data.One of the transformation operations,
cmovstd
, will apply a centered moving standard deviation on the data for us in order to achieve gathering future data for the moving standard deviation. Note that we chose a window of 7 to get a +/- 3 day centered moving standard deviation. That is because we need:Or, a total of 7 days in our window. If you wanted a +/- 2 day centered moving standard deviation, your window would be 5:
If you choose an even number, you will have 1 or more lagged days to make the window choice valid. For example, a window of 4 will yield:
PROC EXPAND
is like the Swiss Army knife for time series. It will interpolate, extrapolate, transform, and convert between time periods all in one step. You may find it most useful in the following situations:1. Applying a moving (average, std, etc.)
2. Filling in time gaps