I have a pandas DataFrame, data, with the following transaction data:
         A        date
0  M000833  2016-08-01
1  M000833  2016-08-01
2  M000833  2016-08-02
3  M000833  2016-08-02
4  M000511  2016-08-05
I want a new column with the number of visits per consumer, where multiple visits on the same day count as one.
So I tried this:
import pandas as pd
data['noofvisits'] = data.groupby(['A'])['date'].nunique()
When I run the statement without assigning it to the DataFrame, I get a pandas Series with the desired output. However, the assignment above results in:
         A        date  noofvisits
0  M000833  2016-08-01         NaN
1  M000833  2016-08-01         NaN
2  M000833  2016-08-02         NaN
3  M000833  2016-08-02         NaN
4  M000511  2016-08-05         NaN
The expected output is:
         A        date  noofvisits
0  M000833  2016-08-01           2
1  M000833  2016-08-01           2
2  M000833  2016-08-02           2
3  M000833  2016-08-02           2
4  M000511  2016-08-05           1
What is wrong with this approach? Why does the noofvisits column contain NaN values rather than the counts?
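For reference, here is a minimal sketch that reproduces the behaviour and shows one likely explanation. It rebuilds the sample rows above and assumes the date column is parsed as datetime (an assumption; the original dtype is not shown). The name per_consumer is only illustrative. The aggregated Series is indexed by the values of A, while data is indexed 0 to 4, and column assignment aligns on index labels, so nothing matches. Using transform instead is one possible way to get a result aligned to the original rows:

```python
import pandas as pd

# Sample frame rebuilt from the rows shown in the question
# (assumption: date is stored as datetime)
data = pd.DataFrame({
    "A": ["M000833", "M000833", "M000833", "M000833", "M000511"],
    "date": pd.to_datetime([
        "2016-08-01", "2016-08-01", "2016-08-02", "2016-08-02", "2016-08-05",
    ]),
})

# The aggregation returns one row per consumer, indexed by the values of A
per_consumer = data.groupby("A")["date"].nunique()
print(per_consumer.index.tolist())  # ['M000511', 'M000833']

# Assignment aligns on index labels; reindexing to data's 0..4 index
# finds no matching labels, so every value is NaN
print(per_consumer.reindex(data.index).isna().all())  # True

# transform returns a Series with the same index as data,
# so each row receives its group's distinct-date count
data["noofvisits"] = data.groupby("A")["date"].transform("nunique")
print(data["noofvisits"].tolist())  # [2, 2, 2, 2, 1]
```

Mapping the aggregated Series back through the key column, as in data["A"].map(per_consumer), should give the same column if the per-consumer Series is needed elsewhere anyway.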