I applied sum() on a groupby and I want to sort th

Posted 2020-04-10 02:06

Given the following DataFrame

user_ID  product_id  amount
   1       456          1
   1        87          1
   1       788          3
   1       456          5
   1        87          2
  ...      ...         ...

The first column is the customer's ID, the second is the ID of the product the customer bought, and 'amount' is the quantity of that product purchased on a given day (the date is also taken into account). A customer can buy as many products each day as they want. I want to calculate the total number of times each product was bought by each customer, so I applied a groupby:

df.groupby(['user_ID','product_id'], sort=True).sum()

Now I want to sort by the summed amount within each group. Any help?

2 answers
放荡不羁爱自由
Reply #2 · 2020-04-10 02:30

Suppose df is:

     user_ID  product_id  amount
0        1         456       1
1        1          87       1
2        1         788       3
3        1         456       5
4        1          87       2
5        2         456       1
6        2         788       3
7        2         456       5

Then you can use groupby and sum as before, and additionally sort the values by the two columns [user_ID, amount]. ascending=[True,False] sorts users in ascending order and, within each user, amounts in descending order:

new_df = df.groupby(['user_ID','product_id'], sort=True).sum().reset_index()
new_df = new_df.sort_values(by = ['user_ID', 'amount'], ascending=[True,False])
print(new_df)

Output:

     user_ID   product_id  amount
1        1         456       6
0        1          87       3
2        1         788       3
3        2         456       6
4        2         788       3
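
A possible variation on the same idea, offered as a sketch rather than the answerer's own method: passing as_index=False to groupby keeps user_ID and product_id as ordinary columns, so the separate reset_index() step is not needed. The variable name totals is made up for illustration, and the data is the same example frame as above.

```python
import pandas as pd

# Same example data as in the answer above
df = pd.DataFrame({
    "user_ID":    [1, 1, 1, 1, 1, 2, 2, 2],
    "product_id": [456, 87, 788, 456, 87, 456, 788, 456],
    "amount":     [1, 1, 3, 5, 2, 1, 3, 5],
})

# as_index=False keeps the grouping keys as regular columns,
# so the result can be sorted directly without reset_index()
totals = df.groupby(["user_ID", "product_id"], as_index=False)["amount"].sum()

# Users in ascending order; within each user, largest total first
totals = totals.sort_values(by=["user_ID", "amount"], ascending=[True, False])
print(totals)
```

The row labels in the printed result still come from the grouped frame; if that matters, they can be renumbered afterwards with reset_index(drop=True).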
家丑人穷心不美
Reply #3 · 2020-04-10 02:35

If you are working in R rather than pandas, you could also use aggregate():

# Make up some example data
df = data.frame(user_ID = as.factor(rep(1:5, each = 5)),
                product_id = as.factor(sample(1:10, size = 25, replace = TRUE)),
                amount = sample(1:5, size = 25, replace = TRUE))

# Use aggregate with function sum to calculate the amount of products bought by product and customer
aggregate(amount ~  product_id * user_ID , data = df, FUN = sum)

Output:

   product_id user_ID amount
1           2       1      3
2           4       1      2
3           6       1      1
4           9       1      5
5           1       2      5
6           3       2      9
7           8       2      1
8          10       2      5
9           2       3      5
10          3       3      5
11          4       3      5
12          5       3      3
13          8       3      5
14          3       4      3
15          4       4      9
16          5       4      2
17         10       4      1
18          2       5      1
19          4       5      4
20          5       5      2
21         10       5      2