I have a Pandas DataFrame with customer refund reasons. It contains these example data rows:
**case_type** **claim_type**
1 service service
2 service service
3 chargeback service
4 chargeback local_charges
5 service supplier_service
6 chargeback service
7 chargeback service
8 chargeback service
9 chargeback service
10 chargeback service
11 service service_not_used
12 service service_not_used
I would like to compare the customer's reason with some sort of labeled reason. This is no problem, but I would also like to see the total number of records in a specific group (customer reason).
case_claim_type = df[["case_type", "claim_type"]]
case_claim_type.groupby(by=["case_type", "claim_type"])["case_type"].count()
Which gives me this output, for example:
**case_type** **claim_type**
service service 2
supplier_service 1
service_not_used 2
chargeback service 6
local_charges 1
I would also like to have the sum of the output per case_type. Something like:
**case_type** **claim_type**
service service 2
supplier_service 1
service_not_used 2
total: 5
chargeback service 6
local_charges 1
total: 7
It doesn't necessarily have to be in this last output format; a column with the (aggregated) totals per case_type is also fine.
You can create a new DataFrame by aggregating with sum and adding a MultiIndex with MultiIndex.from_tuples. Then concat the two together and finish with sort_index.
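The original code for this approach is not shown here, so the following is only a minimal sketch of how the sum, MultiIndex.from_tuples, concat and sort_index steps could fit together. The sample frame is rebuilt from the question's rows; the names `counts`, `totals` and `result` and the label `"total"` are assumptions chosen for illustration.

```python
import pandas as pd

# Sample data rebuilt from the rows in the question.
df = pd.DataFrame({
    "case_type": ["service", "service", "chargeback", "chargeback", "service",
                  "chargeback", "chargeback", "chargeback", "chargeback",
                  "chargeback", "service", "service"],
    "claim_type": ["service", "service", "service", "local_charges",
                   "supplier_service", "service", "service", "service",
                   "service", "service", "service_not_used", "service_not_used"],
})

# Number of records per (case_type, claim_type) pair.
counts = df.groupby(["case_type", "claim_type"]).size()

# Sum of those counts per case_type (first index level).
totals = counts.groupby(level="case_type").sum()

# Give the totals a two-level index so they line up with `counts`.
# "total" is a hypothetical label, not something pandas provides.
totals.index = pd.MultiIndex.from_tuples(
    [(case, "total") for case in totals.index],
    names=["case_type", "claim_type"],
)

# Stack both pieces and sort so each total sits inside its case_type group.
result = pd.concat([counts, totals]).sort_index()
print(result)
```

Note that the total row ends up last within each group only because "total" sorts alphabetically after the claim types in this data; with other labels a different marker (or an explicit ordering) may be needed.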
Let's use pd.concat, sum with the level parameter, and assign.
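The code for this variant is also missing, so here is a hedged sketch of what it might look like. As far as I know, `sum(level=...)` was deprecated and then removed in pandas 2.0, so this sketch swaps in `groupby(level=...).sum()`, which gives the same per-level totals. The column name `"count"` and the label `"total"` are assumptions.

```python
import pandas as pd

# Sample data rebuilt from the rows in the question.
df = pd.DataFrame({
    "case_type": ["service", "service", "chargeback", "chargeback", "service",
                  "chargeback", "chargeback", "chargeback", "chargeback",
                  "chargeback", "service", "service"],
    "claim_type": ["service", "service", "service", "local_charges",
                   "supplier_service", "service", "service", "service",
                   "service", "service", "service_not_used", "service_not_used"],
})

# Counts per (case_type, claim_type) as a one-column DataFrame.
counts = df.groupby(["case_type", "claim_type"]).size().to_frame("count")

totals = (
    counts.groupby(level="case_type").sum()   # stands in for sum(level=0)
    .assign(claim_type="total")               # hypothetical label for the total row
    .set_index("claim_type", append=True)     # rebuild the two-level index
)

# Append the totals and sort so they fall inside their case_type group.
result = pd.concat([counts, totals]).sort_index()
print(result)
```

If a column of totals is preferred over extra rows, `counts.groupby(level="case_type")["count"].transform("sum")` should produce one total per row that can be assigned as a new column.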