This is a little confusing, so I will try my best to explain my goal. In a nutshell, I'm trying to compare sublists within a list. Some of the sublists have the same starting element (sublist[0]), and I want to record the positions where a sublist differs from the other sublists that start with the same element.
data = [['o1415', '1', '0', '1'], ['o1415', '0', '0', '0'], ['o1414', '0', '0', '0'], ['o1414', '1', '0', '0'], ['o1414', '0', '0', '0'], ['o1408', '0', '0', '1'], ['o1406', '0', '0', '0']]
D_changes = {}
Each sublist has 4 elements: the first is a name, and the 2nd, 3rd, and 4th are digits.
I'm trying to generate a dictionary of the form {name: [the, differing, positions]}.
For example, data[0] and data[1] both have 'o1415' as their first element. Since they share the same first element, I want to compare the rest of the two lists with each other. data[0] differs from data[1] at data[0][1] and data[0][3], so I want to add 'o1415': ['first', 'third'] to the empty dictionary D_changes.
Another example is 'o1414', which appears in data[2], data[3], and data[4]. These lists differ only in the [1] position, so I'd like to add 'o1414': ['first'] to the dictionary above.
In the end I want to obtain a dictionary with this kind of content:
desired_changes = {'o1415':['first','third'],'o1414':['first'],'o1408':[],'o1406':[]}
I'll give you a direction more than a full answer.
First, load up a dict to group like items for further processing; I'll use a defaultdict:
from collections import defaultdict

d = defaultdict(list)
data = [['o1415', '1', '0', '1'], ['o1415', '0', '0', '0'], ['o1414', '0', '0', '0'], ['o1414', '1', '0', '0'], ['o1414', '0', '0', '0'], ['o1408', '0', '0', '1'], ['o1406', '0', '0', '0']]
for sub in data:
    # group the numeric values of each sublist under its name
    d[sub[0]].append([int(x) for x in sub[1:]])
Then, for a given key, simply look at the zip of its values, i.e. for 'o1414':
d['o1414']
Out[58]: [[0, 0, 0], [1, 0, 0], [0, 0, 0]]
list(zip(*d['o1414']))
Out[59]: [(0, 1, 0), (0, 0, 0), (0, 0, 0)]
Since every value is either 0 or 1, the values in a slot are all equal exactly when they are all 1 or all 0; otherwise at least one differs. So just do:
[any(x) and not all(x) for x in zip(*d['o1414'])]
Out[60]: [True, False, False]
I particularly like the aesthetics of that: any(x) and not all(x). Python can be beautiful sometimes.
Anyway, True means that you have a differing value in that slot. I'll leave it up to you to do that for all your keys and to get it into the format that you want.
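For anyone who wants the remaining steps spelled out, here is a minimal sketch of how the grouping and the any/all check might be combined for every key. The function name find_changes and the LABELS list are hypothetical names chosen for illustration (the labels are taken from the wording in the question), and the check assumes the values are only ever 0 or 1, as in the sample data.

```python
from collections import defaultdict

data = [['o1415', '1', '0', '1'], ['o1415', '0', '0', '0'],
        ['o1414', '0', '0', '0'], ['o1414', '1', '0', '0'],
        ['o1414', '0', '0', '0'], ['o1408', '0', '0', '1'],
        ['o1406', '0', '0', '0']]

# Hypothetical labels for the three value slots, following the question's wording.
LABELS = ['first', 'second', 'third']


def find_changes(rows):
    """Map each name to the labels of the slots whose values differ."""
    # Step 1: group the numeric values of each sublist under its name.
    groups = defaultdict(list)
    for name, *values in rows:
        groups[name].append([int(v) for v in values])

    # Step 2: zip(*value_rows) yields one tuple per slot; a slot differs
    # when it holds at least one 1 but not only 1s (assumes 0/1 values).
    changes = {}
    for name, value_rows in groups.items():
        changes[name] = [
            label
            for label, column in zip(LABELS, zip(*value_rows))
            if any(column) and not all(column)
        ]
    return changes


D_changes = find_changes(data)
print(D_changes)
```

A name that appears only once produces single-element columns, for which any(x) and not all(x) is always False, so such names end up with an empty list. If the values could be something other than 0 and 1, a check such as len(set(column)) > 1 would be a safer substitute.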
I figured it out. Not sure that deserved a -1 vote. This probably isn't the most efficient way, but it works:
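data = [['o1415', '1', '0', '1'], ['o1415', '0', '0', '0'], ['o1414', '0', '0', '0'], ['o1414', '1', '0', '0'], ['o1414', '0', '0', '0'], ['o1408', '0', '0', '1'], ['o1406', '0', '0', '0']]
labels = ['first', 'second', 'third']
D = {}
for name in data:
    # setdefault gives every name an entry, even when nothing differs
    changes = D.setdefault(name[0], [])
    for k in data:
        # only compare sublists that share the same name
        if name[0] != k[0]:
            continue
        # compare the three value slots of the two sublists
        for i, label in enumerate(labels, start=1):
            if name[i] != k[i] and label not in changes:
                changes.append(label)
    # keep the labels in slot order regardless of discovery order
    changes.sort(key=labels.index)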