Group and sum list of dictionaries by parameter

Posted 2020-07-29 06:10

Question:

I have a list of dictionaries describing my products (drinks, food, etc.); some products may appear several times. I need to group the products by product_id and sum product_cost and product_quantity within each group to get the totals per product.

I'm new to Python. I understand how to group a list of dictionaries, but I can't figure out how to sum particular values within each group.

"products_list": [
    {
        "product_cost": 25,
        "product_id": 1,
        "product_name": "Coca-cola",
        "product_quantity": 14,
    },
    {
        "product_cost": 176.74,
        "product_id": 2,
        "product_name": "Apples",
        "product_quantity": 800,

    },
    {
        "product_cost": 13,
        "product_id": 1,
        "product_name": "Coca-cola",
        "product_quantity": 7,
    }
]

I need to get something like this:

"products_list": [
    {
        "product_cost": 38,
        "product_id": 1,
        "product_name": "Coca-cola",
        "product_quantity": 21,
    },
    {
        "product_cost": 176.74,
        "product_id": 2,
        "product_name": "Apples",
        "product_quantity": 800,

    }
]

Answer 1:

You can start by sorting the list of dictionaries on product_name and then grouping the items on product_name. The sort is needed because itertools.groupby only groups consecutive items that share a key.

Then, for each group, calculate the total cost and total quantity, build one dictionary for the group and append it to a list, and finally put that list into the result dictionary.

from itertools import groupby

dct = {"products_list": [
    {
        "product_cost": 25,
        "product_id": 1,
        "product_name": "Coca-cola",
        "product_quantity": 14,
    },
    {
        "product_cost": 176.74,
        "product_id": 2,
        "product_name": "Apples",
        "product_quantity": 800,

    },
    {
        "product_cost": 13,
        "product_id": 1,
        "product_name": "Coca-cola",
        "product_quantity": 7,
    }
]}

result = {}
li = []

# Sort the product list on product_name
sorted_prod_list = sorted(dct['products_list'], key=lambda x: x['product_name'])

# Group on product_name
for name, group in groupby(sorted_prod_list, key=lambda x: x['product_name']):

    grp = list(group)

    # Compute total cost and qty, make the dictionary and add it to the list
    total_cost = sum(item['product_cost'] for item in grp)
    total_qty = sum(item['product_quantity'] for item in grp)
    product_name = grp[0]['product_name']
    product_id = grp[0]['product_id']

    li.append({'product_name': product_name, 'product_id': product_id,
               'product_cost': total_cost, 'product_quantity': total_qty})

#Make final dictionary
result['products_list'] = li

print(result)

The output will be

{
    'products_list': [{
            'product_name': 'Apples',
            'product_id': 2,
            'product_cost': 176.74,
            'product_quantity': 800
        },
        {
            'product_name': 'Coca-cola',
            'product_id': 1,
            'product_cost': 38,
            'product_quantity': 21
        }
    ]
}
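
The question asks for grouping by product_id, while the code above keys on product_name. Here is a minimal sketch of the same sort-then-groupby idea keyed on product_id instead, assuming each id always maps to a single name. The names products, by_id and merged are illustrative and not part of the original answer:

```python
from itertools import groupby
from operator import itemgetter

products = [
    {"product_cost": 25, "product_id": 1, "product_name": "Coca-cola", "product_quantity": 14},
    {"product_cost": 176.74, "product_id": 2, "product_name": "Apples", "product_quantity": 800},
    {"product_cost": 13, "product_id": 1, "product_name": "Coca-cola", "product_quantity": 7},
]

by_id = itemgetter("product_id")

merged = []
# groupby only merges adjacent items, so sort on the same key first
for product_id, group in groupby(sorted(products, key=by_id), key=by_id):
    items = list(group)
    merged.append({
        "product_cost": sum(item["product_cost"] for item in items),
        "product_id": product_id,
        # Assumption: every entry with this id carries the same name
        "product_name": items[0]["product_name"],
        "product_quantity": sum(item["product_quantity"] for item in items),
    })

result = {"products_list": merged}
print(result)
```

Because the list is sorted on product_id, the merged entries come out in id order rather than in the order they first appeared.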


Answer 2:

You can try with pandas:

import pandas as pd

d = {"products_list": [
    {
        "product_cost": 25,
        "product_id": 1,
        "product_name": "Coca-cola",
        "product_quantity": 14,
    },
    {
        "product_cost": 176.74,
        "product_id": 2,
        "product_name": "Apples",
        "product_quantity": 800,

    },
    {
        "product_cost": 13,
        "product_id": 1,
        "product_name": "Coca-cola",
        "product_quantity": 7,
    }
]}
df=pd.DataFrame(d["products_list"])

Pass the list to pandas and perform a groupby, then convert the result back to a list of dictionaries with the to_dict function. Group on both product_id and product_name; grouping on product_name alone would also sum the product_id column, giving Coca-cola an id of 2.

result={}
result["products_list"]=df.groupby(["product_id","product_name"],as_index=False).sum().to_dict(orient="records")

Result:

{'products_list': [{'product_id': 1,
   'product_name': 'Coca-cola',
   'product_cost': 38.0,
   'product_quantity': 21},
  {'product_id': 2,
   'product_name': 'Apples',
   'product_cost': 176.74,
   'product_quantity': 800}]}


Answer 3:

Personally, I would reorganize the data into another dictionary keyed by a unique identifier. If you still need the result as a list, build the dictionary first and then convert its dict.values() into a list. Below is a function that does that.

def get_totals(product_dict):
    totals = {}
    for product in product_dict["products_list"]:
        name = product["product_name"]
        if name not in totals:
            # Store a copy so the input dictionaries are not modified
            totals[name] = dict(product)
        else:
            totals[name]["product_cost"] += product["product_cost"]
            totals[name]["product_quantity"] += product["product_quantity"]

    return list(totals.values())

The output is:

[
 {
  'product_cost': 38,
  'product_id': 1,
  'product_name': 'Coca-cola', 
  'product_quantity': 21
 },
 {
  'product_cost': 176.74,
  'product_id': 2, 
  'product_name': 'Apples',
  'product_quantity': 800
 }
]

If you need the result to sit under the products_list key, just reassign the list to that key. Instead of returning list(totals.values()), do:

product_dict["products_list"] = list(totals.values())
return product_dict

The output is a dictionary like:

{
 "products_list": [
   {
    "product_cost": 38,
    "product_id": 1,
    "product_name": "Coca-cola",
    "product_quantity": 21,
   },
   {
    "product_cost": 176.74,
    "product_id": 2,
    "product_name": "Apples",
    "product_quantity": 800,

   }
 ]
}
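
Putting the pieces of this answer together, here is a minimal single-pass sketch keyed on product_id, which is the field the question asks to group by. It needs no sorting and keeps products in the order they first appear. The function name merge_products and the variable order are hypothetical, chosen only for illustration:

```python
def merge_products(product_dict):
    """Hypothetical helper: sum cost and quantity per product_id."""
    totals = {}
    for product in product_dict["products_list"]:
        key = product["product_id"]
        if key not in totals:
            # Copy the first occurrence so the caller's data stays untouched
            totals[key] = dict(product)
        else:
            totals[key]["product_cost"] += product["product_cost"]
            totals[key]["product_quantity"] += product["product_quantity"]
    # Wrap the merged entries under the same key as the input
    return {"products_list": list(totals.values())}


order = {"products_list": [
    {"product_cost": 25, "product_id": 1, "product_name": "Coca-cola", "product_quantity": 14},
    {"product_cost": 176.74, "product_id": 2, "product_name": "Apples", "product_quantity": 800},
    {"product_cost": 13, "product_id": 1, "product_name": "Coca-cola", "product_quantity": 7},
]}

merged = merge_products(order)
print(merged)
```

Returning a new dictionary rather than reassigning the key on the input is a design choice here: the original order stays available if it is needed later.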