Why does Pandas tell me that I have objects, although every item in the selected column is a string, even after explicit conversion?
This is my DataFrame:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 56992 entries, 0 to 56991
Data columns (total 7 columns):
id 56992 non-null values
attr1 56992 non-null values
attr2 56992 non-null values
attr3 56992 non-null values
attr4 56992 non-null values
attr5 56992 non-null values
attr6 56992 non-null values
dtypes: int64(2), object(5)
Five of them are dtype object. I explicitly convert those objects to strings:
for c in df.columns:
    if df[c].dtype == object:
        print "convert ", df[c].name, " to string"
        df[c] = df[c].astype(str)
Then, df["attr2"] still has dtype object, although type(df["attr2"].ix[0]) reveals str, which is correct.
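A minimal sketch that should reproduce what I am seeing, assuming a small made-up frame in place of my real data (the values are hypothetical, and I use .iloc for positional access). It prints the dtype pandas reports for the converted column next to the Python type of its first element:

```python
import pandas as pd

# Hypothetical stand-in for my data: one integer column, one string column.
df = pd.DataFrame({"id": [1, 2, 3], "attr2": ["a", "bb", "ccc"]})

# Same conversion as in the loop above, applied to a single column.
converted = df["attr2"].astype(str)

# Positional access to the first element of the converted column.
first = converted.iloc[0]

print(converted.dtype)  # dtype pandas reports for the whole column
print(type(first))      # Python type of one element
```

The reported dtype may depend on the pandas version; the element type is what I would expect to be str either way.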
Pandas distinguishes between int64, float64, and object. What is the logic behind it when there is no dtype str? Why is a str covered by object?