Python Pandas Error tokenizing data

Posted 2019-01-01 00:15

I'm trying to use pandas to manipulate a .csv file but I get this error:

pandas.parser.CParserError: Error tokenizing data. C error: Expected 2 fields in line 3, saw 12

I have tried to read the pandas docs, but found nothing.

My code is simple:

import pandas as pd

path = 'GOOG Key Ratios.csv'
# print(open(path).read())
data = pd.read_csv(path)

How can I resolve this? Should I use the csv module or another language?

The file is from Morningstar.

24 answers
还给你的自由
Reply #2 · 2019-01-01 01:06

I've had this problem a few times myself. Almost every time, the reason is that the file I was attempting to open was not a properly saved CSV to begin with. And by "properly", I mean each row had the same number of separators or columns.

Typically it happened because I had opened the CSV in Excel then improperly saved it. Even though the file extension was still .csv, the pure CSV format had been altered.

Any file saved with pandas to_csv will be properly formatted and shouldn't have that issue. But if you open it with another program, it may change the structure.
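One way to check whether every row really has the same number of fields is to count them with the standard csv module before handing the file to pandas. This is only a rough sketch: the helper name find_ragged_rows and the sample data are made up for illustration, and it assumes a comma delimiter.

```python
import csv
import io


def find_ragged_rows(lines, delimiter=","):
    """Return (row_number, field_count) for every row whose field count
    differs from that of the first row. Row numbers are 1-based."""
    reader = csv.reader(lines, delimiter=delimiter)
    expected = None
    ragged = []
    for row_number, row in enumerate(reader, start=1):
        if expected is None:
            expected = len(row)  # the first row sets the expected width
        elif len(row) != expected:
            ragged.append((row_number, len(row)))
    return ragged


# Hypothetical sample standing in for a damaged file
sample = io.StringIO("a,b\n1,2\n1,2,3,4\n")
ragged = find_ragged_rows(sample)
print(ragged)
```

For a real file you would pass an open file object instead of the StringIO, for example open(path, newline=''). Blank lines count as rows with zero fields, so they are reported too.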

Hope that helps.

梦醉为红颜
Reply #3 · 2019-01-01 01:07

You can try this to avoid the problem:

train = pd.read_csv('/home/Project/output.csv', header=None)

Just add header=None.
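For reference, header=None tells read_csv that the file has no header row, so the first line is kept as data and the columns are numbered from 0. A minimal sketch with made-up data (not the file from the question), which only shows what the option does and does not claim it fixes every tokenizing error:

```python
import io

import pandas as pd

# Hypothetical two-line file with no header row
raw = "1,2\n3,4\n"

# Default behaviour: the first line is consumed as column names
with_default = pd.read_csv(io.StringIO(raw))

# header=None: every line is data, columns are labelled 0, 1, ...
no_header = pd.read_csv(io.StringIO(raw), header=None)

print(with_default.shape, list(with_default.columns))
print(no_header.shape, list(no_header.columns))
```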

Hope this helps!!

步步皆殇っ
Reply #4 · 2019-01-01 01:08

This is most likely a delimiter issue. Many files with a .csv extension are actually tab-separated, i.e. they were written with sep='\t'. Try passing the tab character (\t) as the separator to read_csv:

data = pd.read_csv("File_path", sep='\t')
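If you are not sure which delimiter the file uses, the standard library's csv.Sniffer can make a guess from a sample of the text. This is a sketch with made-up sample data; the candidate delimiter list is an assumption, and the sniffer is a heuristic that can guess wrong on unusual files.

```python
import csv

# Hypothetical tab-separated sample; in practice read the first few KB of the file
sample = "a\tb\tc\n1\t2\t3\n4\t5\t6\n"

# Restrict the guess to a few common candidates (an assumption, adjust as needed)
dialect = csv.Sniffer().sniff(sample, delimiters=",\t;|")
delimiter = dialect.delimiter
print(repr(delimiter))
```

The detected value can then be passed on as pd.read_csv(path, sep=delimiter).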
永恒的永恒
Reply #5 · 2019-01-01 01:08

Try: pandas.read_csv(path, sep=',', header=None)

梦该遗忘
Reply #6 · 2019-01-01 01:10

Your CSV file might have a variable number of columns, and read_csv inferred the number of columns from the first few rows. There are two ways to solve it in this case:

1) Change the CSV file so that it has a dummy first line with the maximum number of columns (and specify header=[0]).

2) Or pass names=list(range(0, N)), where N is the maximum number of columns.
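Option 2 can be sketched roughly as follows. The ragged sample data is made up, and N is computed here with the csv module on the assumption that the file is comma-delimited and small enough to scan twice.

```python
import csv
import io

import pandas as pd

# Hypothetical ragged data: rows have 2, 4 and 2 fields
raw = "a,b\n1,2,3,4\n5,6\n"

# N = the maximum number of fields seen in any row
n_cols = max(len(row) for row in csv.reader(io.StringIO(raw)))

# Supplying N column names up front lets shorter rows be padded with NaN
df = pd.read_csv(io.StringIO(raw), header=None, names=list(range(n_cols)))
print(df)
```

Note that with this approach the original first line is read as an ordinary data row, so any real header has to be handled separately.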

几人难应
Reply #7 · 2019-01-01 01:11

Use pandas.read_csv('CSVFILENAME', header=None, sep=', ')

I ran into this when trying to read CSV data from this link:

http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data

I copied the data from the site into my CSV file. Each comma was followed by an extra space, so I used sep=', ' and it worked :)
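A possible alternative for data with a space after each comma is read_csv's skipinitialspace option, which keeps the single-character separator. Separators longer than one character are interpreted as regular expressions and force the slower Python parsing engine. The rows below are made up for illustration and are not taken from the linked file.

```python
import io

import pandas as pd

# Hypothetical rows with a space after every comma
raw = "1, alpha, 10\n2, beta, 20\n"

# Without the option, the leading space stays in the string values
plain = pd.read_csv(io.StringIO(raw), header=None)

# skipinitialspace=True drops whitespace that directly follows the delimiter
df = pd.read_csv(io.StringIO(raw), header=None, skipinitialspace=True)

print(repr(plain.iloc[0, 1]), repr(df.iloc[0, 1]))
```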
