Pandas - AttributeError: 'DataFrame' object has no attribute 'map'

Posted 2020-06-29 04:06

I am trying to create a new column in a dataframe by building a dictionary based on an existing column and calling the 'map' function on that column. It worked for quite some time. However, the notebook has started throwing

AttributeError: 'DataFrame' object has no attribute 'map'

I haven't changed the kernel or the Python version. Here's the code I am using.

dict = {1: 'A',
        2: 'B',
        3: 'C',
        4: 'D',
        5: 'E'}

# Creating an interval-type 
data['new'] = data['old'].map(dict)

How can I fix this?

2 Answers
家丑人穷心不美
#2 · 2020-06-29 04:54

The main problem is that selecting the old column returns a DataFrame instead of a Series, so map, which is implemented only for Series, fails.

The likely cause is a duplicated column name old: selecting that name returns every column called old, as a DataFrame:

import pandas as pd

df = pd.DataFrame([[1,3,8],[4,5,3]], columns=['old','old','col'])
print (df)
   old  old  col
0    1    3    8
1    4    5    3

print(df['old'])
   old  old
0    1    3
1    4    5

# don't use dict as a variable name, because it shadows the Python built-in
d = {1: 'A', 2: 'B', 3: 'C', 4: 'D', 5: 'E'}
df['new'] = df['old'].map(d)
print (df)

AttributeError: 'DataFrame' object has no attribute 'map'
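
To confirm that duplicated column names are the cause before changing anything, a quick check along these lines may help. It is only a sketch that reuses the toy frame from above; dup_mask and dup_names are names introduced here for illustration.

```python
import pandas as pd

# Same toy frame as above: two columns share the name 'old'.
df = pd.DataFrame([[1, 3, 8], [4, 5, 3]], columns=['old', 'old', 'col'])

# Boolean mask that is True for every repeated label after its first occurrence.
dup_mask = df.columns.duplicated()
dup_names = df.columns[dup_mask].tolist()
print(dup_names)

# A duplicated label selects a DataFrame, a unique label selects a Series.
print(type(df['old']).__name__)
print(type(df['col']).__name__)
```

If dup_names is not empty, every label in it will return a DataFrame when selected on its own.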

A possible solution is to deduplicate the column names:

s = df.columns.to_series()
new = s.groupby(s).cumcount().astype(str).radd('_').replace('_0','')
df.columns += new
print (df)
   old  old_1  col
0    1      3    8
1    4      5    3
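
If renaming is not wanted, two other options might work, assuming the first old column is the one that should be mapped. This is a sketch rather than part of the fix above; the mapping d and the names first_old, mapped and deduped are made up for the example.

```python
import pandas as pd

df = pd.DataFrame([[1, 3, 8], [4, 5, 3]], columns=['old', 'old', 'col'])
d = {1: 'A', 2: 'B', 3: 'C', 4: 'D', 5: 'E'}  # hypothetical mapping

# Option 1: pick the first 'old' column by position, which yields a Series.
first_old = df['old'].iloc[:, 0]
mapped = first_old.map(d)

# Option 2: drop every repeated label, keeping only its first occurrence.
deduped = df.loc[:, ~df.columns.duplicated()].copy()
deduped['new'] = deduped['old'].map(d)
print(deduped)
```

Option 2 silently discards the second old column, so it only makes sense when that column really is redundant.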

Another possible cause is a MultiIndex in the columns. Test it with:

mux = pd.MultiIndex.from_arrays([['old','old','col'],['a','b','c']])
df = pd.DataFrame([[1,3,8],[4,5,3]], columns=mux)
print (df)
  old    col
    a  b   c
0   1  3   8
1   4  5   3

print (df.columns)
MultiIndex(levels=[['col', 'old'], ['a', 'b', 'c']],
           codes=[[1, 1, 0], [0, 1, 2]])
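
Rather than reading the printed repr, the same test can be done programmatically. This is a small sketch on the frame above; is_multi, n_levels and sub are illustrative names.

```python
import pandas as pd

mux = pd.MultiIndex.from_arrays([['old', 'old', 'col'], ['a', 'b', 'c']])
df = pd.DataFrame([[1, 3, 8], [4, 5, 3]], columns=mux)

# True when the columns have more than one level of labels.
is_multi = isinstance(df.columns, pd.MultiIndex)
n_levels = df.columns.nlevels
print(is_multi, n_levels)

# Selecting only the top-level label returns a DataFrame of its sub-columns.
sub = df['old']
print(type(sub).__name__, sub.columns.tolist())
```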

One solution is to flatten the MultiIndex:

# Python 3.6+
df.columns = [f'{a}_{b}' for a, b in df.columns]
# Python below 3.6
# df.columns = ['{}_{}'.format(a, b) for a, b in df.columns]
print (df)
   old_a  old_b  col_c
0      1      3      8
1      4      5      3

Another solution is to keep the MultiIndex, select the column by its full tuple key, map it, and assign the result to a new tuple key:

df[('new', 'd')] = df[('old', 'a')].map(d)
print (df)
  old    col new
    a  b   c   d
0   1  3   8   A
1   4  5   3   D

print (df.columns)
MultiIndex(levels=[['col', 'old', 'new'], ['a', 'b', 'c', 'd']],
           codes=[[1, 1, 0, 2], [0, 1, 2, 3]])
霸刀☆藐视天下
#3 · 2020-06-29 05:03

map is a method that you can call on a pandas.Series object. This method doesn't exist on pandas.DataFrame objects.

df['new'] = df['old'].map(d)

In the code above, df['old'] is returning a pandas.DataFrame object for some reason.

  • As @jezrael points out, this could be due to having more than one old column in the dataframe.
  • Or perhaps your code isn't quite the same as the example you have given.
  • Either way, the error occurs because you are calling map() on a pandas.DataFrame object.
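
One way to make this failure easier to spot is to wrap the call in a small guard. The helper below, map_column, is hypothetical (it is not part of pandas) and only sketches the idea: it raises a descriptive error when the selection turns out to be a DataFrame.

```python
import pandas as pd

def map_column(df, col, mapping):
    """Hypothetical helper: map one column, failing loudly if the selection is ambiguous."""
    selected = df[col]
    if isinstance(selected, pd.DataFrame):
        # Happens with duplicated column names or a partial MultiIndex key.
        raise ValueError(
            f"df[{col!r}] selected {selected.shape[1]} columns, expected exactly one"
        )
    return selected.map(mapping)

d = {1: 'A', 2: 'B', 3: 'C', 4: 'D', 5: 'E'}  # hypothetical mapping

ok = pd.DataFrame({'old': [1, 4], 'col': [8, 3]})
result = map_column(ok, 'old', d)
print(result.tolist())

bad = pd.DataFrame([[1, 3, 8], [4, 5, 3]], columns=['old', 'old', 'col'])
try:
    map_column(bad, 'old', d)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```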
