Counting “unique pairs” of numbers into a python d

2019-08-23 06:53发布

问题:

EDIT: Edited typos; the key values of the dictionary should be dictionaries, not sets.

I will keep the typos here though, as the questions below address this question. My apologies for the confusion.

Here's the problem:

Let's say I have a list of integers whereby are never repeats:

list1 = [2, 3]   

In this case, there is a unique pair 2-3 and 3-2, so the dictionary should be:

{2:{3: 1}, 3:{2: 1}}

That is, there is 1 pair of 2-3 and 1 pair of 3-2.

For larger lists, the pairing is the same, e.g.

list2 = [2, 3, 4]

has the dicitonary

{2:{3: 1}, 3:{2: 1}, 3:{4: 1}, 4:{3: 1}, 2:{4: 1}, 4:{2: 1}}

(1) Once the size of the lists become far larger, how would one algorithmically find the "unique pairs" in this format using python data structures?

(2) I mentioned that the lists cannot have repeat integers, e.g.

[2, 2, 3]

is impossible, as there are two 2s.

However, one may have a list of lists:

list3 = [[2, 3], [2, 3, 4]] 

whereby the dictionary must be

{2:{3: 2}, 3:{2: 2}, 3:{4: 1}, 4:{3: 1}, 2:{4: 1}, 4:{2: 1}}

as there are two pairs of 2-3 and 3-2. How would one "update" the dictionary given multiple lists within a list?

This is an algorithmic problem, and I don't know of the most efficient solution. My idea would be to somehow cache values in a list and enumerate pairs...but that would be so slow. I'm guessing there's something useful from itertools.

回答1:

What you want is to count pairs that arise from combinations in your lists. You can find those with a Counter and combinations.

from itertools import combinations
from collections import Counter

list2 = [2, 3, 4]

count = Counter(combinations(list2, 2))

print(count)

Output

Counter({(2, 3): 1, (2, 4): 1, (3, 4): 1})

As for your list of list, we update the Counter with the result from each sublist.

from itertools import combinations
from collections import Counter

list3 = [[2, 3], [2, 3, 4]]

count = Counter()

for sublist in list3:
    count.update(Counter(combinations(sublist, 2)))

print(count)

Output

Counter({(2, 3): 2, (2, 4): 1, (3, 4): 1})


回答2:

My approach iterates over the input dict (linear complexity) and pairs each key with its first available integer (this complexity depends on the exact specs of your question - e.g., can each list contain unlimited sub-lists?), inserting these into an output dict (constant complexity).

import os 
import sys 


def update_results(result_map, tup):
    # Update dict inplace
    # Don't need to keep count here
    try:
        result_map[tup] += 1
    except KeyError:
        result_map[tup] = 1
    return


def algo(input):
    # Use dict to keep count of unique pairs while iterating
    # over each (key, v[i]) pair where v[i] is an integer in 
    # list input[key]
    result_map = dict()
    for key, val in input.items():
        key_pairs = list()
        if isinstance(val, list):
            for x in val:
                if isinstance(x, list):
                    for y in x:
                        update_results(result_map, (key, y))
                else:
                    update_results(result_map, (key, x))
        else:
            update_results(result_map, (key, val))
    return len(result_map.keys())


>>> input = { 1: [1, 2], 2: [1, 2, [2, 3]] }
>>> algo(input)
>>> 5

I'm pretty sure there's a more refined way to do this (again, would depend on the exact specs of your question), but this could get your started (no imports)