Itertools Groupby looping over different columns

2019-08-20 07:34发布

问题:

'm trying to do a conditional sum-product in Python. The simplified idea is as follows:

A = [1 1 2 3 3 3]
B = [0.50 0.25 0.99 0.80 0.70 0.20]

I would like to have as output

Total1 = 0.50*1 + 0.25*1
Total2 = 0.99*2
Total3 = 0.80*3 + 0.70*3 + 0.20*3 

Thanks to the support by people over here, this part worked out!

Next function I like to add, is being able to calculate this for different columns 'B' (say B1, B2, B3, ...) (with different values). These are stored in Excel and I read them out to different lists with openpyxl (can probably be more efficient...) This means the values in B1/B2/... are corresponding with the respective values in A.

number = -1
j = 0
for col in ws.iter_cols():
    if col[3].value == "fast" :
        number = j
    j+=1

B1 = [row[number].value for row in ws.iter_rows(min_row=5, max_row=63332) ]
B1_float = [float(i) for i in B1]

Is there a way to perform this script to different combinations (A&B1 / A&B2 / A&B3 / ...) and store them in a matrix? (or excel file)

I hope it's clear what I mean, if not, let me know!

回答1:

It appears you are asking two questions:

  1. Calculate sums of products based on groups from a separate list
  2. Write these results to an Excel file

As these are quite different problems, I will address the first and refer to references on the second.

import operator as op
import itertools as it
import functools as ft


A = [1, 1, 2, 3, 3, 3]
B = [0.5, 0.25, 0.99, 0.8, 0.7, 0.2]

groups = [list(g) for k, g in it.groupby(zip(A, B), op.itemgetter(0))]
groups
# [[(1, 0.5), (1, 0.25)], [(2, 0.99)], [(3, 0.8), (3, 0.7), (3, 0.2)]]

Here we zip the columns of data and group them according to list A. Now we are able to apply the appropriate operators to each iterable in groups.

[sum(ft.reduce(op.mul, i) for i in sub) for sub in groups]
# [0.75, 1.98, 5.1]

As for writing data to Excel, there are several Python libraries available, each with documentation on how to do so.