I have the following Python 2.7 dictionary data structure (I do not control source data - comes from another system as is):
{112762853378: {'dst': ['10.121.4.136'], 'src': ['1.2.3.4'], 'alias': ['www.example.com'] }, 112762853385: {'dst': ['10.121.4.136'], 'src': ['1.2.3.4'], 'alias': ['www.example.com'] }, 112760496444: {'dst': ['10.121.4.136'], 'src': ['1.2.3.4'] }, 112760496502: {'dst': ['10.122.195.34'], 'src': ['4.3.2.1'] }, 112765083670: ... }
The dictionary keys will always be unique. Dst, src, and alias can be duplicates. All records will always have a dst and src but not every record will necessarily have an alias as seen in the third record.
In the sample data either of the first two records would be removed (doesn't matter to me which one). The third record would be considered unique since although dst and src are the same it is missing alias.
My goal is to remove all records where the dst, src, and alias have all been duplicated - regardless of the key.
How does this rookie accomplish this?
Also, my limited understanding of Python interprets the data structure as a dictionary with the values stored in dictionaries... a dict of dicts, is this correct?
One simple approach would be to create a reverse dictionary using the concatenation of the string data in each inner dictionary as a key. So say you have the above data in a dictionary,
d
:If you don't want a list of duplicates or anything like that, but just want to create a duplicate-less dict, you could just use a regular dictionary instead of a
defaultdict
and re-reverse it like so:You can use
to solve your problem.
result
The facts that this solution creates first a list input_raw.iteritems() (as in Andrew's Cox's answer) and requires a growing list seen are drawbacks.
But the first can't be avoided (using iteritems() doesn't work) and the second is less heavy than re-creating a list result.values() from growing list result for each turn of a loop.
Another reverse dict variation:
You could go though each of the items (the key value pair) in the dictionary and add them into a result dictionary if the value was not already in the result dictionary.