EDIT: Edited typos; the key values of the dictionary should be dictionaries, not sets.
I will keep the typos here though, as the questions below address this question. My apologies for the confusion.
Here's the problem:
Let's say I have a list of integers whereby are never repeats:
list1 = [2, 3]
In this case, there is a unique pair 2-3 and 3-2, so the dictionary should be:
{2:{3: 1}, 3:{2: 1}}
That is, there is 1 pair of 2-3 and 1 pair of 3-2.
For larger lists, the pairing is the same, e.g.
list2 = [2, 3, 4]
has the dicitonary
{2:{3: 1}, 3:{2: 1}, 3:{4: 1}, 4:{3: 1}, 2:{4: 1}, 4:{2: 1}}
(1) Once the size of the lists become far larger, how would one algorithmically find the "unique pairs" in this format using python data structures?
(2) I mentioned that the lists cannot have repeat integers, e.g.
[2, 2, 3]
is impossible, as there are two 2s.
However, one may have a list of lists:
list3 = [[2, 3], [2, 3, 4]]
whereby the dictionary must be
{2:{3: 2}, 3:{2: 2}, 3:{4: 1}, 4:{3: 1}, 2:{4: 1}, 4:{2: 1}}
as there are two pairs of 2-3 and 3-2. How would one "update" the dictionary given multiple lists within a list?
This is an algorithmic problem, and I don't know of the most efficient solution. My idea would be to somehow cache values in a list and enumerate pairs...but that would be so slow. I'm guessing there's something useful from itertools
.
What you want is to count pairs that arise from combinations in your lists. You can find those with a Counter
and combinations
.
from itertools import combinations
from collections import Counter
list2 = [2, 3, 4]
count = Counter(combinations(list2, 2))
print(count)
Output
Counter({(2, 3): 1, (2, 4): 1, (3, 4): 1})
As for your list of list, we update the Counter
with the result from each sublist.
from itertools import combinations
from collections import Counter
list3 = [[2, 3], [2, 3, 4]]
count = Counter()
for sublist in list3:
count.update(Counter(combinations(sublist, 2)))
print(count)
Output
Counter({(2, 3): 2, (2, 4): 1, (3, 4): 1})
My approach iterates over the input dict
(linear complexity) and pairs each key with its first available integer (this complexity depends on the exact specs of your question - e.g., can each list contain unlimited sub-lists?), inserting these into an output dict (constant complexity).
import os
import sys
def update_results(result_map, tup):
# Update dict inplace
# Don't need to keep count here
try:
result_map[tup] += 1
except KeyError:
result_map[tup] = 1
return
def algo(input):
# Use dict to keep count of unique pairs while iterating
# over each (key, v[i]) pair where v[i] is an integer in
# list input[key]
result_map = dict()
for key, val in input.items():
key_pairs = list()
if isinstance(val, list):
for x in val:
if isinstance(x, list):
for y in x:
update_results(result_map, (key, y))
else:
update_results(result_map, (key, x))
else:
update_results(result_map, (key, val))
return len(result_map.keys())
>>> input = { 1: [1, 2], 2: [1, 2, [2, 3]] }
>>> algo(input)
>>> 5
I'm pretty sure there's a more refined way to do this (again, would depend on the exact specs of your question), but this could get your started (no imports)