-->

Column selection with iloc, with both individual i

2020-04-12 02:07发布

问题:

I wonder why this line returns "invalid syntax", and what's the correct syntax to use for selecting both isolated columns and ranges in one go:

X = f1.iloc[:, [2,5,[10:19]]].values

Btw the same happens with:

X = f1.iloc[:, [2,5,10:19]].values

Thanks.

回答1:

Second is correct syntax, only need numpy.r_ for concanecate indices:

np.random.seed(2019)

f1 = pd.DataFrame(np.random.randint(10, size=(5, 25))).add_prefix('a')
print(f1)
   a0  a1  a2  a3  a4  a5  ...  a19  a20  a21  a22  a23  a24
0   8   2   5   8   6   8  ...    0    1    6    0    2    6
1   6   3   1   3   5   0  ...    4    8    1    0    6    1
2   8   2   3   0   9   2  ...    7    1    0    7    4    4
3   7   0   8   9   0   7  ...    3    0    8    6    0    2
4   7   3   2   4   9   9  ...    0    8    8    1    4    9

X = f1.iloc[:, np.r_[2,5,10:19]].values
print(X)
[[5 8 5 3 0 2 5 7 8 5 4]
 [1 0 2 9 8 3 7 7 7 0 3]
 [3 2 6 2 1 1 1 1 8 6 2]
 [8 7 7 8 0 5 7 4 1 1 4]
 [2 9 7 2 9 3 8 5 2 5 5]]

Also is possible first convert values to numpy array, then iloc is not necessary:

X = f1.values[:, np.r_[2,5,10:19]]
print(X)
[[5 8 5 3 0 2 5 7 8 5 4]
 [1 0 2 9 8 3 7 7 7 0 3]
 [3 2 6 2 1 1 1 1 8 6 2]
 [8 7 7 8 0 5 7 4 1 1 4]
 [2 9 7 2 9 3 8 5 2 5 5]]