I have this simple two-line script:
from pandas import read_html
print(read_html('http://money.cnn.com/data/hotstocks/', flavor='bs4'))
This works fine, but the column names are missing; the columns are labeled 0, 1, 2, 3 instead. Is there an easy way to tell pandas to use the first row as the column names? I know I could store the names in a list, set them, and then skip the first row, but I am wondering if there is an easier or better way.
Currently it prints:
    0                         1       2       3
0   Company                   Price   Change  % Change
1   AAPL Apple Inc            115.31  +6.17   +5.65%
2   BAC Bank of America Corp  15.20   -0.43   -2.75%
3   YHOO Yahoo! Inc           46.46   -1.53   -3.19%
4   MSFT Microsoft Corp       41.19   -1.47   -3.45%
5   FB Facebook Inc           76.24   +0.46   +0.61%
6   GE General Electric Co    23.84   -0.54   -2.21%
7   T AT&T Inc                32.68   -0.13   -0.40%
8   F Ford Motor Co           14.46   -0.24   -1.63%
9   INTC Intel Corp           33.78   -0.41   -1.20%
10  CSCO Cisco Systems Inc    26.80   -0.09   -0.35%
`read_html` takes a `header` parameter. You can pass a row index, for example `header=0` to use the first row as the column names.
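As a rough sketch of what that looks like, the snippet below parses a small inline table instead of the live page, so it runs without network access. The `HTML` string is a made-up stand-in for the CNN page, and `flavor` is left at its default so pandas uses whichever parser is installed; with the real URL you would keep `flavor='bs4'` as in the question.

```python
from io import StringIO

from pandas import read_html

# Hypothetical stand-in for the page in the question (not the real markup).
HTML = """
<table>
  <tr><td>Company</td><td>Price</td><td>Change</td><td>% Change</td></tr>
  <tr><td>AAPL Apple Inc</td><td>115.31</td><td>+6.17</td><td>+5.65%</td></tr>
  <tr><td>BAC Bank of America Corp</td><td>15.20</td><td>-0.43</td><td>-2.75%</td></tr>
</table>
"""

# read_html returns a list of DataFrames, one per <table> it finds.
# header=0 tells it to use row 0 as the column names.
tables = read_html(StringIO(HTML), header=0)
df = tables[0]

print(df.columns.tolist())
print(df)
```

Since `read_html` returns a list, remember to index into it (`tables[0]` here) before working with the frame.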
Worth noting this caveat in the docs:
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.html.read_html.html