Pandas convert object column to str - column conta

Posted 2019-08-09 11:55

I have a pandas DataFrame in which a column's dtype shows as object, but when I try to convert it to string,

df['column'] = df['column'].astype('str')

a UnicodeEncodeError is thrown: *** UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-5: ordinal not in range(128)

My next approach was to handle the encoding part: df['column'] = filtered_df['column'].apply(lambda x: x.encode('utf-8').strip())

But that gives the following error: *** AttributeError: 'float' object has no attribute 'encode'

What's the best approach to convert this column to string?

Sample of the strings in the column:

Thank you :)
Thank You !!!
responsibilities/assigned job.

1 answer
太酷不给撩
#2 · 2019-08-09 12:10

I had the same problem in Python 2.7 when trying to run a script that was originally written for Python 3. In Python 2.7, str is a byte string, and converting a unicode value to str implicitly encodes it as ASCII, which fails for any non-ASCII character in your data. This can be replicated in a simple example:

import pandas as pd
df = pd.DataFrame({'column': ['asdf', u'uh ™ oh', 123]})
df['column'] = df['column'].astype('str')

Results in:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2122' in position 3: ordinal not in range(128)
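The same failure can be seen without pandas, which may help confirm that the problem is the implicit ASCII encode rather than the DataFrame itself. This is only a minimal sketch; the variable names (text, ascii_ok, utf8_bytes, round_trip) are mine, not from the question:

```python
# -*- coding: utf-8 -*-
# The same text as in the example above, with the trademark sign escaped.
text = u'uh \u2122 oh'

# Encoding to ASCII fails because U+2122 is outside the 0-127 range.
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# An encoding that covers the character, such as UTF-8, round-trips cleanly.
utf8_bytes = text.encode('utf-8')
round_trip = utf8_bytes.decode('utf-8')
```

Here ascii_ok ends up False, while the UTF-8 round trip returns the original text unchanged.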

Instead, you can specify unicode:

df['column'] = df['column'].astype('unicode')

Verify that the number has been converted to a string:

df['column'][2]

This outputs u'123', so it has been converted to a unicode string. The special character ™ has been properly preserved as well.
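As for the AttributeError in the question: a 'float' object in an object column is usually a missing value (NaN), so calling .encode on every element fails there. I have not seen your data, so this is an assumption. If you want control over how such values are converted, one option is a small helper applied element by element. The name to_text, the empty-string replacement for missing values, and the UTF-8 default are all my own choices for illustration:

```python
import math

# unicode on Python 2, str on Python 3
text_type = type(u"")


def to_text(value, encoding="utf-8"):
    """Return value as a text string (hypothetical helper, not a pandas API).

    Missing values (None or float NaN) become an empty string, byte strings
    are decoded with the given encoding, and everything else is converted
    with the text type. Surrounding whitespace is stripped.
    """
    if value is None:
        return u""
    if isinstance(value, float) and math.isnan(value):
        return u""
    if isinstance(value, bytes):
        return value.decode(encoding).strip()
    return text_type(value).strip()


# A mixed list like the values an object column might hold.
samples = [u"Thank you :) ", u"uh \u2122 oh", 123, float("nan"), None]
converted = [to_text(v) for v in samples]
```

With a DataFrame this would be used as df['column'] = df['column'].apply(to_text). Whether missing values should become empty strings or stay as NaN depends on what you do with the column afterwards.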
