Use cases for the 'setdefault' dict method

2019-01-03 00:52发布

The addition of collections.defaultdict in Python 2.5 greatly reduced the need for dict's setdefault method. This question is for our collective education:

  1. What is setdefault still useful for, today in Python 2.6/2.7?
  2. What popular use cases of setdefault were superseded with collections.defaultdict?

16条回答
放荡不羁爱自由
2楼-- · 2019-01-03 01:27

As most answers state setdefault or defaultdict would let you set a default value when a key doesn't exist. However, I would like to point out a small caveat with regard to the use cases of setdefault. When the Python interpreter executes setdefaultit will always evaluate the second argument to the function even if the key exists in the dictionary. For example:

In: d = {1:5, 2:6}

In: d
Out: {1: 5, 2: 6}

In: d.setdefault(2, 0)
Out: 6

In: d.setdefault(2, print('test'))
test
Out: 6

As you can see, print was also executed even though 2 already existed in the dictionary. This becomes particularly important if you are planning to use setdefault for example for an optimization like memoization. If you add a recursive function call as the second argument to setdefault, you wouldn't get any performance out of it as Python would always be calling the function recursively.

查看更多
疯言疯语
3楼-- · 2019-01-03 01:32

One very important use-case I just stumbled across: dict.setdefault() is great for multi-threaded code when you only want a single canonical object (as opposed to multiple objects that happen to be equal).

For example, the (Int)Flag Enum in Python 3.6.0 has a bug: if multiple threads are competing for a composite (Int)Flag member, there may end up being more than one:

from enum import IntFlag, auto
import threading

class TestFlag(IntFlag):
    one = auto()
    two = auto()
    three = auto()
    four = auto()
    five = auto()
    six = auto()
    seven = auto()
    eight = auto()

    def __eq__(self, other):
        return self is other

    def __hash__(self):
        return hash(self.value)

seen = set()

class cycle_enum(threading.Thread):
    def run(self):
        for i in range(256):
            seen.add(TestFlag(i))

threads = []
for i in range(8):
    threads.append(cycle_enum())

for t in threads:
    t.start()

for t in threads:
    t.join()

len(seen)
# 272  (should be 256)

The solution is to use setdefault() as the last step of saving the computed composite member -- if another has already been saved then it is used instead of the new one, guaranteeing unique Enum members.

查看更多
戒情不戒烟
4楼-- · 2019-01-03 01:35

Another use case that I don't think was mentioned above. Sometimes you keep a cache dict of objects by their id where primary instance is in the cache and you want to set cache when missing.

return self.objects_by_id.setdefault(obj.id, obj)

That's useful when you always want to keep a single instance per distinct id no matter how you obtain an obj each time. For example when object attributes get updated in memory and saving to storage is deferred.

查看更多
可以哭但决不认输i
5楼-- · 2019-01-03 01:35

I rewrote the accepted answer and facile it for the newbies.

#break it down and understand it intuitively.
new = {}
for (key, value) in data:
    if key not in new:
        new[key] = [] # this is core of setdefault equals to new.setdefault(key, [])
        new[key].append(value)
    else:
        new[key].append(value)


# easy with setdefault
new = {}
for (key, value) in data:
    group = new.setdefault(key, []) # it is new[key] = []
    group.append(value)



# even simpler with defaultdict
new = defaultdict(list)
for (key, value) in data:
    new[key].append(value) # all keys have a default value of empty list []

Additionally,I categorized the methods as reference:

dict_methods_11 = {
            'views':['keys', 'values', 'items'],
            'add':['update','setdefault'],
            'remove':['pop', 'popitem','clear'],
            'retrieve':['get',],
            'copy':['copy','fromkeys'],}
查看更多
登录 后发表回答