可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I'm using mongodb and redis, redis is my cache.

I'm caching mongodb objects with redis-py:

obj in mongodb: {u'name': u'match', u'section_title': u'\u6d3b\u52a8', u'title': 
u'\u6bd4\u8d5b', u'section_id': 1, u'_id': ObjectId('4fb1ed859b10ed2041000001'), u'id': 1}

the obj fetched from redis with hgetall(key, obj) is:

{'name': 'match', 'title': '\xe6\xaf\x94\xe8\xb5\x9b', 'section_title': 
'\xe6\xb4\xbb\xe5\x8a\xa8', 'section_id': '1', '_id': '4fb1ed859b10ed2041000001', 'id': '1'}

As you can see, obj fetched from cache is str instead of unicode, so in my app, there is error s like :'ascii' codec can't decode byte 0xe6 in position 12: ordinal not in range(128)

Can anyone give some suggestions? thank u

回答1:

Update, for global setting, check jmoz's answer.

If you're using third-party lib such as django-redis, you may need to specify a customized ConnectionFactory:

class DecodeConnectionFactory(redis_cache.pool.ConnectionFactory):
    def get_connection(self, params):
        params['decode_responses'] = True
        return super(DecodeConnectionFactory, self).get_connection(self, params)

Assuming you're using redis-py, you'd better to pass str instead of unicode to Redis, or else Redis will encode it automatically for *set commands, normally in UTF-8. For the *get commands, Redis has no idea about the formal type of a value and has to just return the value in str directly.

Thus, As Denis said, the way that you storing the object to Redis is critical. You need to transform the value to str to make the Redis layer transparent for you.

Also, set the default encoding to UTF-8 instead of using ascii

回答2:

I think I've discovered the problem. After reading this, I had to explicitly decode from redis which is a pain, but works.

I stumbled across a blog post where the author's output was all unicode strings which was obv different to mine.

Looking into the StrictRedis.__init__ there is a parameter decode_responses which by default is False. https://github.com/andymccurdy/redis-py/blob/273a47e299a499ed0053b8b90966dc2124504983/redis/client.py#L446

Pass in decode_responses=True on construct and for me this FIXES THE OP'S ISSUE.

回答3:

for each string you can use the decode function to transform it in utf-8, e.g. for the value if the title field in your code:

In [7]: a='\xe6\xaf\x94\xe8\xb5\x9b'

In [8]: a.decode('utf8')
Out[8]: u'\u6bd4\u8d5b'

回答4:

I suggest you always encode to utf-8 before writing to MongoDB or Redis (or any external system). And that you decode('utf-8') when you fecth results, so that you always work with Unicode in Python.