Applying select_dtypes for selected columns of a d

Posted 2019-07-15 03:33

I have a few columns that have both floats and strings. I want to be able to select these columns and apply different masks according to their data type.

I have found the select_dtypes() method, but it runs over the entire dataframe. What I need is to apply it to a single selected column. For example:

 df['A'].select_dtypes(exclude=[np.number]) 

Right now when I try to do this I get

AttributeError: 'Series' object has no attribute 'select_dtypes'

To give more details, let's say I have a dataframe like this:

import numpy as np
import pandas as pd

df = pd.DataFrame([
    [-1, 3, 0],
    [5, 2, 1],
    [-6, 3, 2],
    [7, '<blank>', 3],
    ['<blank>', 2, 4],
    ['<blank>', '<blank>', '<blank>']], columns='A B C'.split())

When I run

df.select_dtypes(exclude=[np.number]) 

It doesn't raise an error, but nothing happens either, presumably because select_dtypes() checks the dtype of each column as a whole rather than the type of each individual value.
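A quick check of the column dtypes may make this behaviour easier to see. The sketch below rebuilds the example dataframe and assumes nothing beyond standard pandas and NumPy calls; the variable name non_numeric is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([
    [-1, 3, 0],
    [5, 2, 1],
    [-6, 3, 2],
    [7, '<blank>', 3],
    ['<blank>', 2, 4],
    ['<blank>', '<blank>', '<blank>']], columns='A B C'.split())

# Each column mixes ints and strings, so pandas stores all of them as object.
print(df.dtypes)

# select_dtypes filters whole columns by dtype. object is not a numeric dtype,
# so excluding np.number keeps every column of this dataframe.
non_numeric = df.select_dtypes(exclude=[np.number])
print(list(non_numeric.columns))
```

Because the selection happens per column, it cannot separate the numeric values from the strings inside a single column.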

In the end I want to create a mask with dtype selection, such as:

mask = df['A'].select_dtypes(exclude=[np.number])

Note: I need these strings to stay unchanged, because in a later step I will render this dataframe to an HTML table, where the <blank> strings will give me white spaces.

1 Answer
一夜七次
Reply #2 · 2019-07-15 03:54

You can define a function to apply conversion to numeric, and then filter according to whether the conversion is successful:

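def filter_type(s, num=True):
    # Non-numeric values become NaN instead of raising an error
    s_new = pd.to_numeric(s, errors='coerce')
    if num:
        # Keep the values that converted successfully
        return s[s_new.notnull()]
    else:
        # Keep the values that could not be converted
        return s[s_new.isnull()]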

res = filter_type(df['A'], num=False)

print(res)

4    <blank>
5    <blank>
Name: A, dtype: object
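Since the question asks for a mask rather than the filtered values, the same idea can be sketched as a boolean mask. This is only one possible variant, not part of the answer above: the helper name non_numeric_mask and the variables mask_a, mask_all and numeric_sum_a are hypothetical, and it assumes that "non-numeric" means "cannot be converted by pd.to_numeric".

```python
import pandas as pd

df = pd.DataFrame([
    [-1, 3, 0],
    [5, 2, 1],
    [-6, 3, 2],
    [7, '<blank>', 3],
    ['<blank>', 2, 4],
    ['<blank>', '<blank>', '<blank>']], columns='A B C'.split())

def non_numeric_mask(s):
    """Return True where a value cannot be converted to a number."""
    return pd.to_numeric(s, errors='coerce').isna()

# Mask for a single column
mask_a = non_numeric_mask(df['A'])

# Same check applied column by column; the result has the shape of df
mask_all = df.apply(non_numeric_mask)

# The mask only selects rows; df itself, including '<blank>', is not modified
numeric_sum_a = pd.to_numeric(df.loc[~mask_a, 'A']).sum()

print(mask_a)
print(mask_all)
print(numeric_sum_a)
```

Two caveats with this approach: a genuine NaN in the column is also flagged as non-numeric, and a string such as '3' counts as numeric because it converts cleanly. If only real Python strings should be flagged, a check like s.map(lambda v: isinstance(v, str)) may be a better fit.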