I have a large DataFrame that looks similar to this:
  ID_Code Status1 Status2
0       A    Done     Not
1       A    Done    Done
2       B     Not     Not
3       B     Not    Done
4       C     Not     Not
5       C     Not     Not
6       C    Done    Done
What I want to do is, for each set of duplicate ID codes, calculate the percentage of entries that are Not-Not (i.e. [# of Not-Not / # of total entries] * 100).
I'm struggling to do so using groupby and can't seem to get the right syntax to perform this.
Using sum and a boolean mask:
IIUC, using crosstab:
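A minimal sketch of both ideas, assuming the sample frame from the question is rebuilt as df. The names mask, labels, pct_sum and pct_crosstab are illustrative and not taken from the original answers:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID_Code": ["A", "A", "B", "B", "C", "C", "C"],
    "Status1": ["Done", "Done", "Not", "Not", "Not", "Not", "Done"],
    "Status2": ["Not", "Done", "Not", "Done", "Not", "Not", "Done"],
})

# Boolean mask: True where both status columns are "Not"
mask = df["Status1"].eq("Not") & df["Status2"].eq("Not")

# Approach 1: sum the mask per ID_Code and divide by the group size
pct_sum = mask.groupby(df["ID_Code"]).sum() / df.groupby("ID_Code").size() * 100

# Approach 2: crosstab with row-wise normalisation.
# The mask is mapped to string labels (an assumption made here for
# readability) so the resulting column can be selected by name.
labels = mask.map({True: "both_not", False: "other"})
pct_crosstab = pd.crosstab(df["ID_Code"], labels, normalize="index")["both_not"] * 100

print(pct_sum)
print(pct_crosstab)
```

Both should give one percentage per ID_Code. The crosstab version also keeps the share of the remaining rows in the "other" column if that is needed later.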
I may have misunderstood the question, but you appear to be referring to rows where the values of Status1 and Status2 are both Not, correct? If that's the case, you can do something like:
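The code from this answer is not preserved here, so the following is only one possible reading of it: flag the rows where both columns are Not, then take the per-group mean of that flag. The variable names both_not and result are hypothetical:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID_Code": ["A", "A", "B", "B", "C", "C", "C"],
    "Status1": ["Done", "Done", "Not", "Not", "Not", "Not", "Done"],
    "Status2": ["Not", "Done", "Not", "Done", "Not", "Not", "Done"],
})

# True for rows where every status column equals "Not"
both_not = (df[["Status1", "Status2"]] == "Not").all(axis=1)

# The mean of a boolean Series is the fraction of True values,
# so the group-wise mean times 100 is the percentage per ID_Code
result = both_not.groupby(df["ID_Code"]).mean().mul(100).round(2)

print(result)
```

Taking the mean avoids computing the group size separately, and listing the columns in one place makes it easy to extend to a Status3 column if one exists.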