I want to compute the following formula:
calc = value_a(2D) - (value_b(0D) + value_b(1D))/10000
value_a(2D) corresponds to type **a**, year **2D** and value **1.1275**
value_b(0D) corresponds to type **b**, year **0D** and value **0**
value_b(1D) corresponds to type **b**, year **1D** and value **0.125**
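The result should be
1.1274875
But somehow I can't work out how to access the correct data inside my loop. I would like to keep the structure of my code.
The code looks like this: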
import pandas as pd
data = pd.read_csv('C:/Book1.csv').fillna('')
pd_date = pd.DatetimeIndex(data['date'].values)
data['date'] = pd_date
index_data = data.set_index('date')
for current_date in index_data.index.unique():
    for index, row in index_data.iterrows():
        if index == current_date:
            for index2, row2 in index_data.iterrows():
                if index2 == current_date:
                    if row['type'] in {'a', 'b'} and row2['type'] in {'a', 'b'}:
                        if row['year'] in {'0D','1D','2D'}:
                            print(row['value'])
The data looks like this:
date        type  year  value
2015-02-09  a     2D    1.1275
2015-02-09  b     10M   58.125
2015-02-09  b     11M   68.375
2015-02-09  b     1M    3.345
2015-02-09  b     1W    0.89
2015-02-09  b     1Y    79.375
2015-02-09  b     2M    7.535
2015-02-09  b     2W    1.8
2015-02-09  b     3M    11.61
2015-02-09  b     3W    2.48
2015-02-09  b     4M    16.2
2015-02-09  b     5M    21.65
2015-02-09  b     6M    27.1
2015-02-09  b     7M    33.625
2015-02-09  b     8M    41.375
2015-02-09  b     9M    49.5
2015-02-09  b     0D    0
2015-02-09  b     1D    0.125
It looks like you could really use a MultiIndex here:
In [4]: df.reset_index(inplace=True)
In [5]: df
Out[5]:
type year date value
0 a 2D 2015-02-09 1.1275
1 b 10M 2015-02-09 58.1250
2 b 11M 2015-02-09 68.3750
3 b 1M 2015-02-09 3.3450
4 b 1W 2015-02-09 0.8900
5 b 1Y 2015-02-09 79.3750
6 b 2M 2015-02-09 7.5350
7 b 2W 2015-02-09 1.8000
8 b 3M 2015-02-09 11.6100
9 b 3W 2015-02-09 2.4800
10 b 4M 2015-02-09 16.2000
11 b 5M 2015-02-09 21.6500
12 b 6M 2015-02-09 27.1000
13 b 7M 2015-02-09 33.6250
14 b 8M 2015-02-09 41.3750
15 b 9M 2015-02-09 49.5000
16 b 0D 2015-02-09 0.0000
17 b 1D 2015-02-09 0.1250
In [6]: df.set_index(['type','year'], inplace=True)
In [7]: df
Out[7]:
                 date    value
type year
a    2D    2015-02-09   1.1275
b    10M   2015-02-09  58.1250
     11M   2015-02-09  68.3750
     1M    2015-02-09   3.3450
     1W    2015-02-09   0.8900
     1Y    2015-02-09  79.3750
     2M    2015-02-09   7.5350
     2W    2015-02-09   1.8000
     3M    2015-02-09  11.6100
     3W    2015-02-09   2.4800
     4M    2015-02-09  16.2000
     5M    2015-02-09  21.6500
     6M    2015-02-09  27.1000
     7M    2015-02-09  33.6250
     8M    2015-02-09  41.3750
     9M    2015-02-09  49.5000
     0D    2015-02-09   0.0000
     1D    2015-02-09   0.1250
Then it's simple:
In [8]: df.loc['a','2D'].value - (df.loc['b', '0D'].value + df.loc['b','1D'].value)/10000
Out[8]: 1.1274875
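For anyone who wants to run the lookup above outside an interactive session, here is a minimal self-contained sketch. It assumes the data can be read as comma-separated text; the names `CSV` and `calc` are made up for this illustration, and only a few of the question's rows are included. It uses the explicit tuple form `frame.loc[(type, year), 'value']`, which avoids any ambiguity between row and column keys.

```python
import io
import pandas as pd

# A few rows copied from the question; the CSV layout itself is an assumption.
CSV = """date,type,year,value
2015-02-09,a,2D,1.1275
2015-02-09,b,1W,0.89
2015-02-09,b,0D,0
2015-02-09,b,1D,0.125
"""

df = pd.read_csv(io.StringIO(CSV), parse_dates=["date"])
# Index by (type, year) so each term of the formula is a direct lookup.
df = df.set_index(["type", "year"]).sort_index()


def calc(frame):
    """Return value_a(2D) - (value_b(0D) + value_b(1D)) / 10000."""
    a_2d = frame.loc[("a", "2D"), "value"]
    b_0d = frame.loc[("b", "0D"), "value"]
    b_1d = frame.loc[("b", "1D"), "value"]
    return a_2d - (b_0d + b_1d) / 10000


result = calc(df)
print(result)
```

This only works as written when each (type, year) pair appears once; with several dates in the same frame, the lookup returns several rows instead of a single number.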
Note: suppose I have several dates (I built this example simply by concatenating df with itself):
In [24]: df2
Out[24]:
type year date value
0 a 2D 2015-02-09 1.1275
1 b 10M 2015-02-09 58.1250
2 b 11M 2015-02-09 68.3750
3 b 1M 2015-02-09 3.3450
4 b 1W 2015-02-09 0.8900
5 b 1Y 2015-02-09 79.3750
6 b 2M 2015-02-09 7.5350
7 b 2W 2015-02-09 1.8000
8 b 3M 2015-02-09 11.6100
9 b 3W 2015-02-09 2.4800
10 b 4M 2015-02-09 16.2000
11 b 5M 2015-02-09 21.6500
12 b 6M 2015-02-09 27.1000
13 b 7M 2015-02-09 33.6250
14 b 8M 2015-02-09 41.3750
15 b 9M 2015-02-09 49.5000
16 b 0D 2015-02-09 0.0000
17 b 1D 2015-02-09 0.1250
18 a 2D 2015-02-10 1.1275
19 b 10M 2015-02-10 58.1250
20 b 11M 2015-02-10 68.3750
21 b 1M 2015-02-10 3.3450
22 b 1W 2015-02-10 0.8900
23 b 1Y 2015-02-10 79.3750
24 b 2M 2015-02-10 7.5350
25 b 2W 2015-02-10 1.8000
26 b 3M 2015-02-10 11.6100
27 b 3W 2015-02-10 2.4800
28 b 4M 2015-02-10 16.2000
29 b 5M 2015-02-10 21.6500
30 b 6M 2015-02-10 27.1000
31 b 7M 2015-02-10 33.6250
32 b 8M 2015-02-10 41.3750
33 b 9M 2015-02-10 49.5000
34 b 0D 2015-02-10 0.0000
35 b 1D 2015-02-10 0.1250
In [25]: df2.iloc[-2,-1] = 100000 # this corresponds to (b, 0D) on 2015-02-10 and used to be 0
As @cᴏʟᴅsᴘᴇᴇᴅ pointed out, you can group by the 'date' column:
In [26]: df2.groupby('date').apply(
    ...:     lambda df:
    ...:         df.loc['a','2D'].value
    ...:         - (df.loc['b', '0D'].value + df.loc['b','1D'].value)
    ...:         / 10000
    ...: )
Out[27]:
date
2015-02-09 1.127487
2015-02-10 -8.872513
dtype: float64
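The per-date calculation can also be written as an explicit loop over the groups, which stays closer to the loop structure in the question. This is only a sketch under assumptions: the names `CSV`, `calc_for_date` and `per_date` are hypothetical, the data is reduced to the rows the formula needs, and the (type, year) index is set inside each group so that it starts from the flat layout shown for df2.

```python
import io
import pandas as pd

# Two dates; (b, 0D) on 2015-02-10 is set to 100000 as in the example above.
CSV = """date,type,year,value
2015-02-09,a,2D,1.1275
2015-02-09,b,0D,0
2015-02-09,b,1D,0.125
2015-02-10,a,2D,1.1275
2015-02-10,b,0D,100000
2015-02-10,b,1D,0.125
"""

df2 = pd.read_csv(io.StringIO(CSV), parse_dates=["date"])


def calc_for_date(group):
    """Apply the formula to the rows belonging to a single date."""
    # Index this date's rows by (type, year) to look up each term directly.
    values = group.set_index(["type", "year"])["value"]
    return (
        values.loc[("a", "2D")]
        - (values.loc[("b", "0D")] + values.loc[("b", "1D")]) / 10000
    )


# One result per date, collected into a Series keyed by the date.
per_date = pd.Series(
    {date: calc_for_date(group) for date, group in df2.groupby("date")}
)
print(per_date)
```

Iterating over `df2.groupby("date")` visits each date's rows once, instead of scanning the whole frame inside nested `iterrows()` loops.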