I cleaned 400 Excel files, read them into Python using pandas, and appended all the raw data into one big DataFrame.
Then when I try to export it to a CSV:
df.to_csv("path",header=True,index=False)
I get this error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc7' in position 20: ordinal not in range(128)
Can someone suggest a way to fix this and explain what it means?
Thanks
You have unicode values in your DataFrame. Files store bytes, which means all unicode values have to be encoded into bytes before they can be stored in a file. The character u'\xc7' (Ç) in your error has no ASCII representation, which is why the ascii codec fails. You have to specify an encoding, such as utf-8. For example, pass encoding='utf-8' to df.to_csv. If you don't specify an encoding, then the encoding used by df.to_csv defaults to ascii in Python 2, or utf-8 in Python 3.
Adding an answer to help myself google it later:
One trick that helped me is to encode a problematic series first, then decode it back to utf-8, which would get the DataFrame to print correctly too.
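For anyone landing here later, this is a rough sketch of both suggestions in this thread, written for Python 3. The sample DataFrame, the "name" column, and the temporary output path are made up for illustration. The use of unicode_escape in the second part is only my guess at what "encode, then decode" could look like, since the answer above doesn't say which codec it used.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the combined DataFrame from the question.
df = pd.DataFrame({"name": ["Ça va", "plain"], "value": [1, 2]})

# Approach 1: name the encoding explicitly when exporting.
out_path = os.path.join(tempfile.mkdtemp(), "out.csv")  # hypothetical path
df.to_csv(out_path, header=True, index=False, encoding="utf-8")

# The file now holds UTF-8 bytes; read it back with the same encoding.
with open(out_path, "rb") as fh:
    raw = fh.read()
round_trip = pd.read_csv(out_path, encoding="utf-8")

# Approach 2 (an assumed reading of the "encode, then decode" trick):
# escape non-ASCII characters so the column holds only ASCII text.
escaped = df["name"].map(lambda s: s.encode("unicode_escape").decode("utf-8"))
```

Note that the second approach changes the data: Ç ends up as the literal text \xc7 in the column. If the goal is to keep the original characters in the CSV, passing an explicit encoding is probably the safer route.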