I can't seem to figure out how to add a % of total column for each date_submitted group to the below pandas pivot table:
In [177]: pass_rate_pivot
date_submitted  audit_status
04-11-2014      audited         140
                is_adserver       7
                rejected         75
                unauditable     257
04-18-2014      audited         177
                is_adserver      10
                pending          44
                rejected         30
                unauditable     226
04-25-2014      audited          97
                is_adserver       5
                pending          33
                rejected          9
                unauditable     355
Name: site_domain, dtype: int64
In [177]: pass_rate_pivot.to_dict()
Out[177]:
{('04-11-2014', 'audited'): 140,
('04-11-2014', 'is_adserver'): 7,
('04-11-2014', 'rejected'): 75,
('04-11-2014', 'unauditable'): 257,
('04-18-2014', 'audited'): 177,
('04-18-2014', 'is_adserver'): 10,
('04-18-2014', 'pending'): 44,
('04-18-2014', 'rejected'): 30,
('04-18-2014', 'unauditable'): 226,
('04-25-2014', 'audited'): 97,
('04-25-2014', 'is_adserver'): 5,
('04-25-2014', 'pending'): 33,
('04-25-2014', 'rejected'): 9,
('04-25-2014', 'unauditable'): 355}
Is this what you want? (For each group, divide each element by the sum of all elements in that group):
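A minimal sketch of that per-group division, rebuilding the Series from the to_dict() output in the question. The names group_totals and pct are only illustrative, and groupby(...).transform('sum') is one way to do it, not necessarily the only one:

```python
import pandas as pd

# Rebuild the Series from the to_dict() output shown in the question.
counts = {
    ('04-11-2014', 'audited'): 140,
    ('04-11-2014', 'is_adserver'): 7,
    ('04-11-2014', 'rejected'): 75,
    ('04-11-2014', 'unauditable'): 257,
    ('04-18-2014', 'audited'): 177,
    ('04-18-2014', 'is_adserver'): 10,
    ('04-18-2014', 'pending'): 44,
    ('04-18-2014', 'rejected'): 30,
    ('04-18-2014', 'unauditable'): 226,
    ('04-25-2014', 'audited'): 97,
    ('04-25-2014', 'is_adserver'): 5,
    ('04-25-2014', 'pending'): 33,
    ('04-25-2014', 'rejected'): 9,
    ('04-25-2014', 'unauditable'): 355,
}
pass_rate_pivot = pd.Series(counts, name='site_domain')
pass_rate_pivot.index.names = ['date_submitted', 'audit_status']

# Total per date_submitted, repeated on every row of that date.
group_totals = pass_rate_pivot.groupby(level='date_submitted').transform('sum')

# Each count as a percentage of its date_submitted total.
pct = 100 * pass_rate_pivot / group_totals
print(pct.round(2))
```

Because transform returns a result aligned to the original index, the division lines up row by row and the percentages within each date add up to 100.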
If you want to add this as a column, you can indeed concat both Series into one DataFrame, as suggested by @exp1orer.
If pass_rate_pivot were already a DataFrame, you could just assign a new column, like pass_rate_pivot['pct'] = pass_rate_pivot['original column'].groupby(...
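Both options could look roughly like this. It is a sketch on a subset of the question's data; the column name 'pct' comes from the snippet above, while the transform('sum') call is my assumption about what the truncated groupby(... might have been:

```python
import pandas as pd

# Subset of the question's data (two dates only), for brevity.
counts = {
    ('04-11-2014', 'audited'): 140,
    ('04-11-2014', 'is_adserver'): 7,
    ('04-11-2014', 'rejected'): 75,
    ('04-11-2014', 'unauditable'): 257,
    ('04-18-2014', 'audited'): 177,
    ('04-18-2014', 'is_adserver'): 10,
    ('04-18-2014', 'pending'): 44,
    ('04-18-2014', 'rejected'): 30,
    ('04-18-2014', 'unauditable'): 226,
}
pass_rate_pivot = pd.Series(counts, name='site_domain')
pass_rate_pivot.index.names = ['date_submitted', 'audit_status']

# Percentage of each date_submitted total.
totals = pass_rate_pivot.groupby(level='date_submitted').transform('sum')
pct = (100 * pass_rate_pivot / totals).rename('pct')

# Option 1: concat the two Series side by side into one DataFrame.
combined = pd.concat([pass_rate_pivot, pct], axis=1)

# Option 2: start from a DataFrame and assign a new column.
frame = pass_rate_pivot.to_frame()
col_totals = frame['site_domain'].groupby(level='date_submitted').transform('sum')
frame['pct'] = 100 * frame['site_domain'] / col_totals

print(combined)
```

Both routes should give the same two-column result (site_domain and pct); which one reads better mostly depends on whether you already have a DataFrame.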
The most natural way is to do it as you create the pivot table. Here I assume that date_submitted is a column (not in the index), which you can get using reset_index. Also make sure that your values are in a column (here I call it 'value_col'). Then:
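One possible reading of this approach, as a sketch only: the long-format frame df, the choice of pd.pivot_table with aggfunc='sum', and the row-wise div are my assumptions, not necessarily what was originally meant. It uses a subset of the question's data:

```python
import pandas as pd

# Subset of the question's data (two dates only), for brevity.
counts = {
    ('04-11-2014', 'audited'): 140,
    ('04-11-2014', 'is_adserver'): 7,
    ('04-11-2014', 'rejected'): 75,
    ('04-11-2014', 'unauditable'): 257,
    ('04-18-2014', 'audited'): 177,
    ('04-18-2014', 'is_adserver'): 10,
    ('04-18-2014', 'pending'): 44,
    ('04-18-2014', 'rejected'): 30,
    ('04-18-2014', 'unauditable'): 226,
}
s = pd.Series(counts, name='value_col')
s.index.names = ['date_submitted', 'audit_status']

# reset_index turns the index levels into ordinary columns.
df = s.reset_index()

# One row per date, one column per status; missing combinations become 0.
table = pd.pivot_table(
    df,
    index='date_submitted',
    columns='audit_status',
    values='value_col',
    aggfunc='sum',
    fill_value=0,
)

# Divide every row by its own total to get percentages per date.
pct_table = table.div(table.sum(axis=1), axis=0) * 100
print(pct_table.round(2))
```

With the dates as rows and the statuses as columns, each row of pct_table sums to 100, and a status that never occurred on a date (here 'pending' on 04-11-2014) shows up as 0.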