Adding a single character to all keys in a Counter

Posted 2019-07-09 02:23

If the keys of a Counter object are of type str, I can append a character to every key like this:

>>> from collections import Counter
>>> vocab_counter = Counter("the lazy fox jumps over the brown dog".split())

>>> vocab_counter  = Counter({k+u"\uE000":v for k,v in vocab_counter.items()})
>>> vocab_counter
Counter({'brown\ue000': 1,
         'dog\ue000': 1,
         'fox\ue000': 1,
         'jumps\ue000': 1,
         'lazy\ue000': 1,
         'over\ue000': 1,
         'the\ue000': 2})

What would be a quick and/or pythonic way to add a character to all keys?

Is the above method the only way to achieve the final counter with the character appended to all keys? Are there other way(s) to achieve the same goal?

4 answers
祖国的老花朵
#2 · 2019-07-09 02:46

A better way is to add the character before creating the Counter object. You can do this with a generator expression inside Counter:

In [15]: vocab_counter = Counter(w + u"\uE000" for w in "the lazy fox jumps over the brown dog".split())

In [16]: vocab_counter
Out[16]: Counter({'the\ue000': 2, 'fox\ue000': 1, 'dog\ue000': 1, 'jumps\ue000': 1, 'lazy\ue000': 1, 'over\ue000': 1, 'brown\ue000': 1})
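As a quick sanity check, the sketch below compares suffixing the words before counting with the count-then-rebuild approach from the question. The helper name `add_suffix` and the constant `SUFFIX` are hypothetical, introduced only for illustration.

```python
from collections import Counter

SUFFIX = u"\uE000"  # private-use character taken from the question


def add_suffix(counter, suffix=SUFFIX):
    """Return a new Counter with `suffix` appended to every key (hypothetical helper)."""
    return Counter({key + suffix: count for key, count in counter.items()})


words = "the lazy fox jumps over the brown dog".split()

# Suffix the words before counting ...
before = Counter(word + SUFFIX for word in words)
# ... or count first and rebuild the keys afterwards.
after = add_suffix(Counter(words))

print(before == after)
```

Both routes should give the same counts; suffixing up front simply avoids building an intermediate Counter.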

If it is not possible to modify the words before creating the Counter, you can subclass Counter and override __setitem__ so that the special character is appended whenever a key is set.

爷、活的狠高调
#3 · 2019-07-09 03:00

The only other optimised way I can think of is to use a subclass of Counter that appends the character when the key is inserted:

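from collections import Counter


class CustomCounter(Counter):
    def __setitem__(self, key, value):
        # Append the marker unless the key is shorter than two characters
        # or already ends with it.
        if len(key) > 1 and not key.endswith(u"\uE000"):
            key += u"\uE000"
        # Add the new value to whatever is stored under the suffixed key.
        super(CustomCounter, self).__setitem__(key, self.get(key, 0) + value)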

Demo:

>>> CustomCounter("the lazy fox jumps over the brown dog".split())
CustomCounter({u'the\ue000': 2, u'fox\ue000': 1, u'brown\ue000': 1, u'jumps\ue000': 1, u'dog\ue000': 1, u'over\ue000': 1, u'lazy\ue000': 1})
# With both args and kwargs 
>>> CustomCounter("the lazy fox jumps over the brown dog".split(), **{'the': 1, 'fox': 3})
CustomCounter({u'fox\ue000': 4, u'the\ue000': 3, u'brown\ue000': 1, u'jumps\ue000': 1, u'dog\ue000': 1, u'over\ue000': 1, u'lazy\ue000': 1})
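One caveat is worth checking before relying on this subclass: because `__setitem__` adds to the stored count, a plain assignment accumulates instead of overwriting. The sketch below repeats the class so that it runs on its own (the constant `MARK` is a name introduced here). The mapping case at the end assumes that an empty Counter is filled through `dict.update`, which is my reading of CPython's implementation rather than documented behaviour.

```python
from collections import Counter

MARK = u"\uE000"


class CustomCounter(Counter):
    def __setitem__(self, key, value):
        if len(key) > 1 and not key.endswith(MARK):
            key += MARK
        super(CustomCounter, self).__setitem__(key, self.get(key, 0) + value)


counts = CustomCounter("the lazy the".split())
print(counts["the" + MARK])   # 2

# Assignment goes through __setitem__, so 5 is added to the existing 2.
counts["the"] = 5
print(counts["the" + MARK])   # 7

# Building from a mapping may bypass __setitem__ entirely (assumption:
# an empty Counter is filled via dict.update), leaving keys unsuffixed.
from_mapping = CustomCounter({"the": 1})
print(sorted(from_mapping))
```

If either behaviour matters for your use case, rebuilding the keys explicitly is probably the safer choice.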
(This account has been banned)
#4 · 2019-07-09 03:01

The shortest way I have used is:

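from collections import Counter
vocab_counter = Counter("the lazy fox jumps over the brown dog".split())
# Iterate over a snapshot of the keys, since the loop renames them in place.
for key in list(vocab_counter):
    vocab_counter[key + u"\uE000"] = vocab_counter.pop(key)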
闹够了就滚
#5 · 2019-07-09 03:12

You could do it with string manipulations:

from collections import Counter

text = 'the lazy fox jumps over the brown dog'
Counter((text + ' ').replace(' ', '_abc ').strip().split())
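The string-replacement trick assumes that words are separated by exactly one space. As a small sketch of what happens otherwise (the sample text and the names `naive` and `safer` are made up for illustration), compare it with splitting first:

```python
from collections import Counter

text = "the  lazy the"  # note the double space after the first word

# replace() suffixes every space, so the extra space yields a bare "_abc" token.
naive = Counter((text + " ").replace(" ", "_abc ").split())

# Splitting first and suffixing each word avoids the stray key.
safer = Counter(word + "_abc" for word in text.split())

print("_abc" in naive)   # True
print("_abc" in safer)   # False
```

For input that may contain repeated whitespace, suffixing after `split()` is likely the more robust option.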