There was an elegant answer to a question almost like this provided by EdChum. The difference between that question and this is that now the capping needs to be applied to data that had had "GroupBy" performed.
Original Data:
Symbol DTE Spot Strike Vol
AAPL 30.00 100.00 80.00 14.58
AAPL 30.00 100.00 85.00 16.20
AAPL 30.00 100.00 90.00 18.00
AAPL 30.00 100.00 95.00 20.00
AAPL 30.00 100.00 100.00 22.00
AAPL 30.00 100.00 105.00 25.30
AAPL 30.00 100.00 110.00 29.10
AAPL 30.00 100.00 115.00 33.46
AAPL 30.00 100.00 120.00 38.48
AAPL 50.00 102.00 80.00 13.08
AAPL 50.00 102.00 85.00 14.70
AAPL 50.00 102.00 90.00 16.50
AAPL 50.00 102.00 95.00 18.50
AAPL 50.00 102.00 100.00 20.50
AAPL 50.00 102.00 105.00 23.80
AAPL 50.00 102.00 110.00 27.60
AAPL 50.00 102.00 115.00 31.96
AAPL 50.00 102.00 120.00 36.98
IBM 30.00 170.00 150.00 7.29
IBM 30.00 170.00 155.00 8.10
IBM 30.00 170.00 160.00 9.00
IBM 30.00 170.00 165.00 10.00
IBM 30.00 170.00 170.00 11.00
IBM 30.00 170.00 175.00 12.65
IBM 30.00 170.00 180.00 14.55
IBM 30.00 170.00 185.00 16.73
IBM 30.00 170.00 190.00 19.24
IBM 60.00 171.00 150.00 5.79
IBM 60.00 171.00 155.00 6.60
IBM 60.00 171.00 160.00 7.50
IBM 60.00 171.00 165.00 8.50
IBM 60.00 171.00 170.00 9.50
IBM 60.00 171.00 175.00 11.15
IBM 60.00 171.00 180.00 13.05
IBM 60.00 171.00 185.00 15.23
IBM 60.00 171.00 190.00 17.74
I then create a few new variables:
df['ATM_dist'] =abs(df['Spot']-df['Strike'])
imin = df.groupby(['DTE','Symbol'])['ATM_dist'].transform('idxmin')
df['NormStrike']=np.log(df['Strike']/df['Spot'])/(((df['DTE']/365)**.5)*df['ATMvol']/100)
df['ATMvol'] = df.loc[imin,'Vol'].values
The results are below:
Symbol DTE Spot Strike Vol ATM_dist ATMvol NormStrike
0 AAPL 30 100 80 14.58 20 22.0 -3.537916
1 AAPL 30 100 85 16.20 15 22.0 -2.576719
2 AAPL 30 100 90 18.00 10 22.0 -1.670479
3 AAPL 30 100 95 20.00 5 22.0 -0.813249
4 AAPL 30 100 100 22.00 0 22.0 0.000000
5 AAPL 30 100 105 25.30 5 22.0 0.773562
6 AAPL 30 100 110 29.10 10 22.0 1.511132
7 AAPL 30 100 115 33.46 15 22.0 2.215910
8 AAPL 30 100 120 38.48 20 22.0 2.890688
9 AAPL 50 102 80 13.08 22 20.5 -3.201973
10 AAPL 50 102 85 14.70 17 20.5 -2.402955
11 AAPL 50 102 90 16.50 12 20.5 -1.649620
12 AAPL 50 102 95 18.50 7 20.5 -0.937027
13 AAPL 50 102 100 20.50 2 20.5 -0.260994
14 AAPL 50 102 105 23.80 3 20.5 0.382049
15 AAPL 50 102 110 27.60 8 20.5 0.995172
16 AAPL 50 102 115 31.96 13 20.5 1.581035
17 AAPL 50 102 120 36.98 18 20.5 2.141961
18 IBM 30 170 150 7.29 20 11.0 -3.968895
19 IBM 30 170 155 8.10 15 11.0 -2.929137
20 IBM 30 170 160 9.00 10 11.0 -1.922393
21 IBM 30 170 165 10.00 5 11.0 -0.946631
22 IBM 30 170 170 11.00 0 11.0 0.000000
23 IBM 30 170 175 12.65 5 11.0 0.919188
24 IBM 30 170 180 14.55 10 11.0 1.812480
25 IBM 30 170 185 16.73 15 11.0 2.681295
26 IBM 30 170 190 19.24 20 11.0 3.526940
27 IBM 60 171 150 5.79 21 9.5 -3.401827
28 IBM 60 171 155 6.60 16 9.5 -2.550520
29 IBM 60 171 160 7.50 11 9.5 -1.726243
30 IBM 60 171 165 8.50 6 9.5 -0.927332
31 IBM 60 171 170 9.50 1 9.5 -0.152273
32 IBM 60 171 175 11.15 4 9.5 0.600317
33 IBM 60 171 180 13.05 9 9.5 1.331704
34 IBM 60 171 185 15.23 14 9.5 2.043051
35 IBM 60 171 190 17.74 19 9.5 2.735427
I wish to have the values of 'Vol' cap to the level where another column 'NormStrike' hits a trigger (in this case abs(NormStrike) >= 2 ). This new column, 'Desired_Level', created while leaving the 'Vol' column unchanged. The first cap should cause the Vol value at index location 0 to be 16.2 because the cap was triggered at index location 1 when NormStrike hit -2.576719.
Added clarification: I am looking for a generic solution, that works away from the lowest abs(NormStrike) level in both directions to hit both the -2 and the +2 trigger. If it is not hit (which it might not be) then desired level is just original_level
An additional note, it will always be true that the abs(NormStrike) continues to grow in size from the min(abs(NormStrike)) level as it is a function of abs(distance from spot to strike)
the code that EdChum provided (prior to me bringing GroupBy into the mix) is below:
clip = 4
lower = df.loc[df['NS'] <= -clip, 'Vol'].idxmax()
upper = df.loc[df['NS'] >= clip, 'Vol'].idxmin()
df['Original_level'] = df['Original_level'].clip(df.loc[lower,'Original_level'], df.loc[upper, 'Original_level'])
There are 2 issues, first, it did not work after groupby and second, if a particular group of data does not have a NS value that exceeds the "clip" value then it generates an error. The ideal outcome would be, in this case, nothing is done to the Vol level for the particular Symbol/DTE group in question.
Ed suggested implementing a reset_index() but I am not sure how to use that to solve the issue.
I hope this was not to convoluted of a question
thank you for any assistance
You can try this to see whether it works out. I assume if the clip has been triggered, then NaN will be put. You can replace it by your customized choice.