Apply var to the last n numbers of a column in a t

2019-08-06 11:41发布

问题:

In KDB I have a table with 4 columns, of which one is a returns column. I would like to create a 5th column that calculates the variance (var) of the past x elements of a specified column. I have managed this with two columns creating the returns of a price column, but am stuck when I need to reference more than one previous element.

Example:

t:([] td:2001.01.01 2001.01.02 2001.01.03 2001.01.04 2001.01.05 2001.01.06; px:121 125 127 126 129 130)
t:update retLogPcnt:100*log px%prev px from t
/t:update retClassic:((px-prev px)%prev px) from t
t:update mvAvgRet:2 mavg retLogPcnt from t
t

Output of last line:

td         px  retLogPcnt mvAvgRet 
-----------------------------------
2001.01.01 121                     
2001.01.02 125 3.252319   3.252319 
2001.01.03 127 1.587335   2.419827 
2001.01.04 126 -0.790518  0.3984085
2001.01.05 129 2.35305    0.7812659
2001.01.06 130 0.7722046  1.562627 

Desired output if calculating, as an illustration only, the var of the three last retLogPcnt numbers:

td         px  retLogPcnt mvAvgRet  varRetns
---------------------------------------------
2001.01.01 121                     
2001.01.02 125 3.252319   3.252319  
2001.01.03 127 1.587335   2.419827  
2001.01.04 126 -0.790518  0.3984085 2.752321
2001.01.05 129 2.35305    0.7812659 1.791392
2001.01.06 130 0.7722046  1.562627  1.647022

So the first var number has been calculated by:

q)var 3.252319 1.587335 -0.790518

How can I add this extra column? I made some attempts similar to my moving average example above (t:update mvAvgRet:2 mavg retLogPcnt from t) however I was unsuccessful.

Apologies if this has been answered previously; I did search the forums but could not find anything that helped me. This may well be due to getting to grips with kdb terminology.

回答1:

You may want to check out this page on the wiki that discusses sliding windows. Where it describes the sliding window function swin with this example:

q)swin:{[f;w;s] f each { 1_x,y }\[w#0;s]}
q)swin[avg; 3; til 10]
0 0.33333333 1 2 3 4 5 6 7 8

This can be modified for your purpose by using a list of floats to start with:

swin:{[f;w;s] f each { 1_x,y }\[w#0f;s]}

Such that:

q)update varRetns:swin[var;3;retLogPcnt] from t
td         px  retLogPcnt mvAvgRet varRetns
-------------------------------------------
2001.01.01 121                     0
2001.01.02 125 3.25232    3.25232  2.64439
2001.01.03 127 1.58733    2.41983  0.693043
2001.01.04 126 -0.790518  0.398408 2.75232
2001.01.05 129 2.35305    0.781266 1.79139
2001.01.06 130 0.772205   1.56263  1.64702

Which is close to your example output except for the initial 3 values not being null. If you wish to exclude these values a further change is required to the var function, which I have defined as newVar:

newVar:{$[any null x;0Nf;var x]}

q)update varRetns:swin[newVar;3;retLogPcnt] from t
td         px  retLogPcnt mvAvgRet varRetns
-------------------------------------------
2001.01.01 121
2001.01.02 125 3.25232    3.25232
2001.01.03 127 1.58733    2.41983
2001.01.04 126 -0.790518  0.398408 2.75232
2001.01.05 129 2.35305    0.781266 1.79139
2001.01.06 130 0.772205   1.56263  1.64702


标签: kdb q-lang