Is there a simple way to change a column of yes/no to 1/0 in a pandas DataFrame?

Posted 2019-01-17 07:37

Question:

I read a CSV file into a pandas DataFrame and would like to convert the columns with binary answers from the strings yes/no to the integers 1/0. One such column is shown below ("sampleDF" is the pandas DataFrame).

In [13]: sampleDF.housing[0:10]
Out[13]:
0     no
1     no
2    yes
3     no
4     no
5     no
6     no
7     no
8    yes
9    yes
Name: housing, dtype: object

Help is much appreciated!

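Answer 1:

In the snippets below, sample is the DataFrame (sampleDF in the question), with numpy imported as np and pandas as pd.

method 1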

sample.housing.eq('yes').mul(1)

method 2

pd.Series(np.where(sample.housing.values == 'yes', 1, 0),
          sample.index)

method 3

sample.housing.map(dict(yes=1, no=0))

method 4

pd.Series(map(lambda x: dict(yes=1, no=0)[x],
              sample.housing.values.tolist()), sample.index)

method 5

pd.Series(np.searchsorted(['no', 'yes'], sample.housing.values), sample.index)

All five methods yield

0    0
1    0
2    1
3    0
4    0
5    0
6    0
7    0
8    1
9    1
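
As a quick sanity check, the five methods can be run side by side on the ten rows from the question. This is only a sketch: the `results` dictionary and its keys are names made up for illustration, `.to_numpy()` stands in for `.values`, and method 4 is written as a list comprehension instead of `map` with a lambda.

```python
import numpy as np
import pandas as pd

# Ten rows mirroring the sample shown in the question.
sample = pd.DataFrame(
    {"housing": ["no", "no", "yes", "no", "no", "no", "no", "no", "yes", "yes"]}
)
values = sample["housing"].to_numpy()
lookup = dict(yes=1, no=0)

# One entry per method; the key names are illustrative only.
results = {
    "eq_mul": sample["housing"].eq("yes").mul(1),
    "np_where": pd.Series(np.where(values == "yes", 1, 0), sample.index),
    "map_dict": sample["housing"].map(lookup),
    "python_lookup": pd.Series([lookup[x] for x in values.tolist()], sample.index),
    # searchsorted relies on ['no', 'yes'] being sorted: 'no' -> 0, 'yes' -> 1
    "searchsorted": pd.Series(np.searchsorted(["no", "yes"], values), sample.index),
}

expected = [0, 0, 1, 0, 0, 0, 0, 0, 1, 1]
all_match = all(r.tolist() == expected for r in results.values())
print(all_match)
```

Methods 4 and 5 assume the column contains only 'yes' and 'no'. Any other value raises a KeyError in method 4 and gets an arbitrary position in method 5.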

timing, given sample
[timing figure not preserved]

timing, long sample
sample = pd.DataFrame(dict(housing=np.random.choice(('yes', 'no'), size=100000)))
[timing figure not preserved]
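
To reproduce a comparison on the long sample, something along these lines could be used. It is a rough sketch with the standard-library `timeit` module, not the original benchmark: the `candidates` dictionary, the seed, and the repeat counts are arbitrary choices, and the numbers will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
sample = pd.DataFrame({"housing": rng.choice(["yes", "no"], size=100_000)})
mapping = {"yes": 1, "no": 0}

# Three of the methods above, wrapped as zero-argument callables.
candidates = {
    "eq().mul(1)": lambda: sample["housing"].eq("yes").mul(1),
    "np.where": lambda: pd.Series(
        np.where(sample["housing"].to_numpy() == "yes", 1, 0), sample.index
    ),
    "map(dict)": lambda: sample["housing"].map(mapping),
}

calls = 5
# Best of three rounds, reported as seconds per single call.
timings = {
    name: min(timeit.repeat(fn, number=calls, repeat=3)) / calls
    for name, fn in candidates.items()
}
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:<12} {seconds * 1e3:8.3f} ms per call")
```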



Answer 2:

Try this:

sampleDF['housing'] = sampleDF['housing'].map({'yes': 1, 'no': 0})
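
One thing to keep in mind with `map` and a dictionary: values missing from the dictionary become NaN, which also turns the column into floats. A small sketch of how that could be checked, assuming a hypothetical stray value 'unknown' in the data; the `mapped` and `unmapped` names are illustrative.

```python
import pandas as pd

# 'unknown' is a made-up stray value, used only to show the behaviour.
sampleDF = pd.DataFrame({"housing": ["yes", "no", "unknown", "yes"]})

mapped = sampleDF["housing"].map({"yes": 1, "no": 0})

# Values that were not in the dictionary come back as NaN.
unmapped = sampleDF.loc[mapped.isna(), "housing"].unique().tolist()
print(unmapped)

# The nullable Int64 dtype keeps 1/0 as integers while allowing missing values.
sampleDF["housing"] = mapped.astype("Int64")
```

If `unmapped` is empty, the column contained only 'yes' and 'no' and the plain `map` call above already yields integers.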


Answer 3:

# produces True/False
sampleDF['housing'] = sampleDF['housing'] == 'yes'

The comparison above yields True/False values, which behave like 1/0 in arithmetic, so sums, means, and similar aggregations work on them directly. If you really need the values to be 1/0 integers, you can use the following.

housing_map = {'yes': 1, 'no': 0}
sampleDF['housing'] = sampleDF['housing'].map(housing_map)
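
For illustration, here is how the boolean column behaves in aggregations, plus an alternative to the dictionary that casts the booleans with `astype(int)`. This is a sketch on made-up data; `is_yes`, `yes_count`, and `yes_share` are names chosen for the example.

```python
import pandas as pd

sampleDF = pd.DataFrame({"housing": ["no", "yes", "no", "yes", "yes"]})

is_yes = sampleDF["housing"] == "yes"   # boolean Series

yes_count = int(is_yes.sum())           # True counts as 1
yes_share = float(is_yes.mean())        # fraction of 'yes' rows

# Casting the booleans gives integer 1/0 without a lookup dictionary.
sampleDF["housing"] = is_yes.astype(int)
print(yes_count, yes_share, sampleDF["housing"].tolist())
```

Unlike the dictionary version, the cast maps every value other than 'yes' to 0, including missing ones.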


Answer 4:

%timeit sampleDF['housing'].apply(lambda x: 0 if x == 'no' else 1)
sampleDF['housing'] = sampleDF['housing'].apply(lambda x: 0 if x == 'no' else 1)

1.84 ms ± 56.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Replaces 'no' with 0 and every other value, including 'yes', with 1 in the specified df column.



Answer 5:

Generic way:

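import pandas as pd
# string_data is a Series of strings, e.g. sampleDF['housing']
string_data = string_data.astype('category')
# codes follow the sorted categories, so 'no' -> 0 and 'yes' -> 1
numbers_data = string_data.cat.codes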

reference: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.astype.html
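
A caveat with the category approach is that the codes depend on which values happen to be present. A hedged sketch of pinning the mapping with an explicit `pd.CategoricalDtype`; the `yes_no` dtype and the sample Series are invented for the example.

```python
import pandas as pd

# Fixed category order: 'no' -> 0, 'yes' -> 1, regardless of the data.
yes_no = pd.CategoricalDtype(categories=["no", "yes"])

only_yes = pd.Series(["yes", "yes"])
inferred_codes = only_yes.astype("category").cat.codes.tolist()   # 'yes' is the only category
pinned_codes = only_yes.astype(yes_no).cat.codes.tolist()

# Missing values receive the code -1.
with_missing = pd.Series(["no", "yes", None])
missing_codes = with_missing.astype(yes_no).cat.codes.tolist()

print(inferred_codes, pinned_codes, missing_codes)
```

With inferred categories a column holding only 'yes' is coded as 0, so pinning the categories is the safer choice when the codes must mean the same thing across columns or files.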



Answer 6:

Try the following:

sampleDF['housing'] = sampleDF['housing'].str.lower().replace({'yes': 1, 'no': 0})
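
If the raw data also has stray whitespace, the same normalization idea can be extended with `str.strip()`. The sketch below uses made-up messy values and swaps `replace` for `map`, so anything that is still not 'yes' or 'no' after cleaning shows up as NaN instead of passing through unchanged.

```python
import pandas as pd

# Made-up messy inputs: mixed case plus leading/trailing spaces.
sampleDF = pd.DataFrame({"housing": ["Yes", " no", "YES ", "No"]})

normalized = sampleDF["housing"].str.strip().str.lower()
sampleDF["housing"] = normalized.map({"yes": 1, "no": 0})
print(sampleDF["housing"].tolist())
```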