Remove control-character whitespace from a DataFrame

Posted 2019-05-25 07:37

Question:

I have a DataFrame df, from which I build a list of lists like this:

data = [list(map(str,n.tolist())) for n in df.values]

After that, I strip a specific control character from data like this:

data = [ [e.replace(u'\xa0', u'') for e in tempval ] for tempval in data ]

This works fine, but I would like to do the replacement in the DataFrame itself. Any suggestions?

Answer 1:

You can use DataFrame.replace:

import pandas as pd

df = pd.DataFrame({'A':['\xa0','s','w'],
                   'B':['s','w','v'],
                   'C':['e','d','\xa0']})

print (df)
   A  B  C
0     s  e
1  s  w  d
2  w  v  

Apply the replacement first. Then, to create the list of lists, convert the DataFrame to a NumPy array with values and call tolist:

df.replace(u'\xa0',u'', regex=True, inplace=True)
# if you need to cast all values to str, add astype
print (df.astype(str).values.tolist())
[['', 's', 'e'], ['s', 'w', 'd'], ['w', 'v', '']]
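
The sample above only has cells that consist entirely of '\xa0'. As a minimal sketch of the same approach on messier data, the example below uses hypothetical column names (name, count) and values that are not from the question: the character is embedded inside a longer string, and one column is numeric. It also returns a new DataFrame instead of using inplace=True, which is only a stylistic choice here.

```python
import pandas as pd

# Hypothetical sample data (not from the question): '\xa0' embedded in a
# longer string, a cell that is only '\xa0', and a numeric column.
df = pd.DataFrame({
    'name': ['a\xa0b', 'plain', '\xa0'],
    'count': [1, 2, 3],
})

# regex=True makes replace match the pattern anywhere inside a string,
# not only cells that are exactly equal to '\xa0'.
cleaned = df.replace('\xa0', '', regex=True)

# Cast to str at the end if the list of lists must contain only strings.
rows = cleaned.astype(str).values.tolist()
print(rows)
# [['ab', '1'], ['plain', '2'], ['', '3']]
```

Without regex=True, replace would only change cells whose whole value equals '\xa0', so 'a\xa0b' would be left untouched.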