Use first row as column names? Pandas read_html

2019-04-23 22:49发布

I have this simple one line script:

from pandas import read_html

print read_html('http://money.cnn.com/data/hotstocks/', flavor = 'bs4')

Which works, fine, but the column names are missing, they are being identified as 1, 2, 3. Is there an easy way to tell pandas to use the first row as the column names? I know I could just store the names as a list and set them, and then skip the first row, but am wondering if there is an easier/better way.

Currently it prints:

                           0       1       2         3
0                    Company   Price  Change  % Change
1             AAPL Apple Inc  115.31   +6.17    +5.65%
2   BAC Bank of America Corp   15.20   -0.43    -2.75%
3            YHOO Yahoo! Inc   46.46   -1.53    -3.19%
4        MSFT Microsoft Corp   41.19   -1.47    -3.45%
5            FB Facebook Inc   76.24   +0.46    +0.61%
6     GE General Electric Co   23.84   -0.54    -2.21%
7                 T AT&T Inc   32.68   -0.13    -0.40%
8            F Ford Motor Co   14.46   -0.24    -1.63%
9            INTC Intel Corp   33.78   -0.41    -1.20%
10    CSCO Cisco Systems Inc   26.80   -0.09    -0.35%

标签： python parsing pandas

1条回答

Melony?

2楼-- · 2019-04-23 23:44

'read_html` takes a header parameter. You can pass a row index:

read_html('http://money.cnn.com/data/hotstocks/', header =0, flavor = 'bs4')

Worth noting this caveat in the docs:

For example, you might need to manually assign column names if the column names are converted to NaN when you pass the header=0 argument

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.html.read_html.html

0人赞添加讨论(0) 举报

Use first row as column names? Pandas read_html

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间