Converting dict values into a set while preserving

Posted 2019-05-15 14:11

Question:

I have a dict like this:

{100002: 'APPLE', 100004: 'BANANA', 100005: 'CARROT'}

I want my dict to keep ints for the keys (as it does now) but have sets for the values (rather than strings, as it does now). My goal is to read from a .csv file that has one column for the key (an int, the item id number) followed by columns for things like size, shape, and color. I want to add this information to my dict, but only for keys that are already in the dict.

My goal dict might look like this:

{100002: set(['APPLE','MEDIUM','ROUND','RED']), 100004: set(['BANANA','MEDIUM','LONG','YELLOW']), 100005: set(['CARROT','MEDIUM','LONG','ORANGE'])}

Starting with my dict of just key + string for item name, I tried code like this to read the extra information in from a .csv file:

infile = open('FileWithTheData.csv', 'r')
for line in infile.readlines():
    spl_line = line.split(',')
    if int(spl_line[0]) in MyDict.keys():
        MyDict[int(spl_line[0])].update(spl_line[1:])

Unfortunately this fails with AttributeError: 'str' object has no attribute 'update'. My attempts to change my dictionary's values into sets, so that I can then .update them, have yielded things like this:

{100002: set(['A','P','L','E']), 100004: set(['B','A','N']), 100005: set(['C','A','R','O','T'])}

I want to convert each value to a set so that the string that is currently the value becomes the first string in the set, rather than being broken up into a set of its letters.

I also tried making the values sets when I create the dict by zipping two lists together, but it didn't seem to make any difference. Something like MyDict = dict(zip(listofkeys, set(listofnames))) turns the whole listofnames list into a single set; it doesn't achieve my goal of making each value in MyDict a set with the corresponding string from listofnames as its first string.

How can I make the values in MyDict into a set so that I can add additional strings to that set without turning the string that is currently the value in the dict into a set of individual letters?

EDIT: I currently make MyDict by using one function to generate a list of item ids (which are the keys) and another function which looks up those item ids to generate a list of corresponding item names (using a two column .csv file as the data source) and then I zip them together.

ANSWER: Using the suggestions here I came up with this solution. I found that the part that has set()).update can easily be changed to list()).append to yield a list rather than a set (so that the order is preserved). I also found it easier to update my .csv data input files by adding the column containing names to FileWithTheData.csv, so that I didn't have to mess with making the dict, converting the values to sets, and then adding in more data. My code for this section now looks like this:

MyDict = {}
with open('FileWithTheData.csv', 'r') as infile:
    for line in infile:
        spl_line = line.strip().split(',')  # strip() drops the trailing newline
        item_id = int(spl_line[0])
        # itemidlist is the list I was formerly zipping together with a corresponding list of names to make my dict
        if item_id in itemidlist:
            # append() stores the rest of the row as one nested list; use extend() for a flat list
            MyDict.setdefault(item_id, list()).append(spl_line[1:])
print(MyDict)
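One detail of the setdefault(...).append(...) pattern above is easy to miss, so here is a small sketch of it. It is written for Python 3 and is not the asker's actual code: the rows list is hypothetical sample data standing in for the split lines of FileWithTheData.csv, and the names nested and flat exist only for this illustration.

```python
# Hypothetical rows, already split on commas, with the name column included.
rows = [
    ['100002', 'APPLE', 'MEDIUM', 'ROUND', 'RED'],
    ['100004', 'BANANA', 'MEDIUM', 'LONG', 'YELLOW'],
]
itemidlist = [100002, 100004]

nested = {}
flat = {}
for row in rows:
    item_id = int(row[0])
    if item_id in itemidlist:
        # append() stores the whole row slice as a single nested element.
        nested.setdefault(item_id, []).append(row[1:])
        # extend() adds each string individually, in file order.
        flat.setdefault(item_id, []).extend(row[1:])
```

With append, nested[100002] is a list containing one inner list; with extend, flat[100002] is a flat list of strings. Which shape is wanted depends on whether an item id can appear on more than one line of the file.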

Answer 1:

The error occurs because your MyDict variable originally maps each integer to a string. When you call update you are treating the value as a set, but it is still a string, and strings have no update method.

You can use a defaultdict for this:

from collections import defaultdict

combined_dict = defaultdict(set)

# first add all the values from MyDict
for key, value in MyDict.items():
    combined_dict[int(key)].add(value)

# then add the values from the file
with open('FileWithTheData.csv', 'r') as infile:
    for line in infile:
        spl_line = line.strip().split(',')  # strip() drops the trailing newline
        key = int(spl_line[0])
        if key in combined_dict:  # only add data for ids that came from MyDict
            combined_dict[key].update(spl_line[1:])
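For readers on Python 3, the same defaultdict idea can be sketched as a self-contained function. This is only an illustration under stated assumptions, not the answerer's code: the function name merge_rows, the sample rows, and the use of the csv module with an in-memory io.StringIO in place of FileWithTheData.csv are all invented so the example runs on its own.

```python
import csv
import io
from collections import defaultdict


def merge_rows(my_dict, csv_file):
    """Return {item_id: set of strings} for ids already in my_dict."""
    combined = defaultdict(set)

    # Seed each set with the existing name; add() keeps the string whole.
    for key, name in my_dict.items():
        combined[int(key)].add(name)

    # csv.reader splits on commas and drops the line terminator.
    for row in csv.reader(csv_file):
        if not row:
            continue  # skip blank lines
        item_id = int(row[0])
        if item_id in combined:  # ignore ids that are not already known
            combined[item_id].update(row[1:])

    return dict(combined)


# Hypothetical sample data standing in for FileWithTheData.csv.
sample = io.StringIO(
    "100002,MEDIUM,ROUND,RED\n"
    "100004,MEDIUM,LONG,YELLOW\n"
    "999999,SMALL,ROUND,GREEN\n"
)
my_dict = {100002: 'APPLE', 100004: 'BANANA', 100005: 'CARROT'}
result = merge_rows(my_dict, sample)
```

Testing membership with "in" on a defaultdict does not create the missing key, so the unknown id 999999 is skipped rather than added with an empty set.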


Answer 2:

Your issue is with how you are initializing MyDict. Try changing it to the following:

MyDict = dict(zip(listofkeys, [set([name]) for name in listofnames]))

Here is a quick example of the difference:

>>> listofkeys = [100002, 100004, 100005]
>>> listofnames = ['APPLE', 'BANANA', 'CARROT']
>>> dict(zip(listofkeys, set(listofnames)))
{100002: 'CARROT', 100004: 'APPLE', 100005: 'BANANA'}
>>> dict(zip(listofkeys, [set([name]) for name in listofnames]))
{100002: set(['APPLE']), 100004: set(['BANANA']), 100005: set(['CARROT'])}

set(listofnames) is just going to turn your list into a set, and the only effect that might have is to reorder the values as seen above. You actually want to take each string value in your list, and convert it to a one-element set, which is what the list comprehension does.
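The difference between a set of letters and a one-element set can be seen in a short sketch. It is written for Python 3 (where a set literal such as {'APPLE'} is also available), and the variable names letters, whole, also_whole, and my_dict are chosen only for this illustration.

```python
# set() iterates over its argument, and iterating a string yields characters.
letters = set('APPLE')        # a set of the individual letters
whole = set(['APPLE'])        # a one-element set holding the whole string
also_whole = {'APPLE'}        # set literal, equivalent to the line above

# Hypothetical starting dict, as in the question.
my_dict = {100002: 'APPLE', 100004: 'BANANA', 100005: 'CARROT'}

# Wrap each existing string in its own one-element set.
my_dict = {key: {name} for key, name in my_dict.items()}

# The values now support update() without being split into letters.
my_dict[100002].update(['MEDIUM', 'ROUND', 'RED'])
```

The dict comprehension is one way to convert a dict that has already been built with string values, as an alternative to changing how it is created.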

After you make this change, your current code should work fine, although you can do the membership check directly on the dictionary instead of explicitly checking the keys (key in MyDict gives the same result as key in MyDict.keys(), and in Python 2 it avoids building a list of the keys first).
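Putting that together, the question's loop might look like the following once the values are one-element sets. This is a Python 3 sketch under assumptions: the io.StringIO object and its two rows are hypothetical stand-ins for FileWithTheData.csv, and the strip() call is an addition that keeps the trailing newline out of the last field.

```python
import io

# Hypothetical stand-ins for the question's lists and CSV file.
listofkeys = [100002, 100004, 100005]
listofnames = ['APPLE', 'BANANA', 'CARROT']
infile = io.StringIO(
    "100002,MEDIUM,ROUND,RED\n"
    "100007,SMALL,ROUND,GREEN\n"
)

MyDict = dict(zip(listofkeys, [set([name]) for name in listofnames]))

for line in infile:
    spl_line = line.strip().split(',')   # strip() removes the trailing newline
    item_id = int(spl_line[0])
    if item_id in MyDict:                # same result as: in MyDict.keys()
        MyDict[item_id].update(spl_line[1:])
```

The row for 100007 is ignored because that id is not a key of MyDict, which matches the requirement that only existing keys receive new data.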