I have a few columns that have both floats and strings. I want to be able to select these columns and apply different masks according to their data type.
I have found the select_dtypes() method, but it operates on the entire DataFrame, whereas I need to select by dtype within a single column. For example:
df['A'].select_dtypes(exclude=[np.number])
Right now when I try to do this I get
AttributeError: 'Series' object has no attribute 'select_dtypes'
To give more detail, let's say I have a dataframe like this:
import numpy as np
import pandas as pd

df = pd.DataFrame([
    [-1, 3, 0],
    [5, 2, 1],
    [-6, 3, 2],
    [7, '<blank>', 3],
    ['<blank>', 2, 4],
    ['<blank>', '<blank>', '<blank>']], columns='A B C'.split())
When I run
df.select_dtypes(exclude=[np.number])
It doesn't raise an error, but it also doesn't filter anything: every column mixes numbers and strings, so each one has object dtype and the whole DataFrame comes back unchanged.
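A quick way to confirm this is to inspect the column dtypes. The sketch below rebuilds the example dataframe and assumes nothing beyond standard pandas and NumPy:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([
    [-1, 3, 0],
    [5, 2, 1],
    [-6, 3, 2],
    [7, '<blank>', 3],
    ['<blank>', 2, 4],
    ['<blank>', '<blank>', '<blank>']], columns='A B C'.split())

# Each column holds both ints and strings, so pandas stores it as object.
print(df.dtypes)

# select_dtypes works on whole columns: no column is numeric, so nothing is excluded.
non_numeric = df.select_dtypes(exclude=[np.number])
print(list(non_numeric.columns))
```

Since select_dtypes only looks at the dtype of each column as a whole, it cannot distinguish individual values inside a mixed column.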
In the end I want to create a mask with dtype selection such as
mask= df['A'].select_dtypes(exclude=[np.number])
Note: I need these strings to stay unchanged, because in a later step I will render this dataframe to an HTML table, where the <blank> strings will give me white space.
You can define a function to apply conversion to numeric, and then filter according to whether the conversion is successful:
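A minimal sketch of this idea, assuming pd.to_numeric with errors='coerce' as the conversion step. The helper name is_numeric is illustrative and not taken from the original answer:

```python
import pandas as pd

df = pd.DataFrame([
    [-1, 3, 0],
    [5, 2, 1],
    [-6, 3, 2],
    [7, '<blank>', 3],
    ['<blank>', 2, 4],
    ['<blank>', '<blank>', '<blank>']], columns='A B C'.split())

def is_numeric(series):
    """Hypothetical helper: True where the value can be converted to a number."""
    # errors='coerce' turns anything unparseable into NaN instead of raising.
    return pd.to_numeric(series, errors='coerce').notna()

numeric_mask = is_numeric(df['A'])   # True for -1, 5, -6, 7
string_mask = ~numeric_mask          # True for the two '<blank>' entries

# The masks only select rows; the original values in df are left untouched.
numbers_in_a = df.loc[numeric_mask, 'A']
strings_in_a = df.loc[string_mask, 'A']
```

Two caveats with this sketch: a numeric-looking string such as '3' would also count as numeric, and a genuine NaN would count as non-numeric. If either matters, a per-value isinstance check may be a better fit.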