Given a table of revenue values thus:
A key point to note (and the core of my question) is the the brand name will almost always, but not always, contain the corresponding product name. In the case of last Banana entry, it doesn't.
I will extract a dict
of Brand<->Revenue pairs, fist accounting for those brands that have multiple entries, and summing in those cases, using the approach described here. So:
revenuePerBrandDict = {}
brandRevenueTuples = []
i=0
for brand in ourTab.columns[1][1:-1]: # ignore first (zeroth) and last row
brandRevenueTuples.append((campaign.value, round(ourTab.columns[3][i].value,2)))
i+=1
for key, value in brandRevenueTuples:
revenuePerBrandDict[key] = revenuePerBrandDict.get(key, 0) + value
I will then cross-reference the keys and values in this dict to each dict in (dict of banana expenses, dict of kiwi expenses etc.), and subtract expenses from revenue, item per item. These dicts will be extracted from banana table, kiwi table etc. that look like this:
If the brand name always contained the product name in the revenue table, then in order to compile an appropriate collection of revenue values for comparison with the Banana expenses dict, for example, I would just extract all those brands whose name contained 'Banana', and for matching keys in the Banana expenses dict, perform the extraction on their values.
But it doesn't, so I need another way of knowing that in the Revenue dict, 'OtherBrand' is a Banana. (In the Banana dict, I already know it is a Banana, because it came from the Banana table). Instead of extracting a dict
of Brand<->Revenue pairs, I could extract a list or tuple of (tuples of (Product, Brand, Revenue)), and now we have the additional information provided by the Product column. But since a tuple doesn't have the concept of a key, how do I iterate across this new collection, extracting revenue per tuple in the desired way (i.e. with recognition that that OtherBrand is a Banana etc.)
You can use the fruits as keys and group the brands:
Which using your input outputs:
You can then subtract the expenses using the keys.
Using pandas life is even easier you can groupby and sum:
Output:
Or get the groups by fruit and brand:
It seems to me that you want to group your data from the first table by product type. I suggest a dictionary where the key is the product type and the value is a list of tuples
[(brand, revenue),(..., ...)]
.Then, for each product type in the dictionary, you can easily pull out the list of brands for that product and, if needed, make a new dictionary containing lists of 3-tuples
(brand, revenue, expenses)
.