Why is Django using ASCII instead of UTF-8?

Posted 2019-09-17 17:26

When adding data (with non-ASCII characters) as administrator to my sqlite3 database, I get the following error:

Exception Type: UnicodeEncodeError at /admin/Project/vin/add/
Exception Value: 'ascii' codec can't encode character u'\xe2' in position 2: ordinal not in range(128)

I can't figure out what's wrong, since UTF-8 is specified in all the different settings, and ASCII appears only in the error.

Here is the debug output I got:

Django Version: 1.9.4
Python Version: 2.7.10
Installed Applications:
['django.contrib.admin',
 'django.contrib.auth',
 'django.contrib.contenttypes',
 'django.contrib.sessions',
 'django.contrib.messages',
 'django.contrib.staticfiles',
 'Project']
Installed Middleware:
['django.middleware.security.SecurityMiddleware',
 'django.contrib.sessions.middleware.SessionMiddleware',
 'django.middleware.common.CommonMiddleware',
 'django.middleware.csrf.CsrfViewMiddleware',
 'django.contrib.auth.middleware.AuthenticationMiddleware',
 'django.contrib.auth.middleware.SessionAuthenticationMiddleware',
 'django.contrib.messages.middleware.MessageMiddleware',
 'django.middleware.clickjacking.XFrameOptionsMiddleware']



> Traceback:

> File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/django/core/handlers/base.py" in get_response
>   149.                     response = self.process_exception_by_middleware(e, request)
> 
> File
> "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/django/core/handlers/base.py"
> in get_response
>   147.                     response = wrapped_callback(request, *callback_args, **callback_kwargs)
> 
> File
> "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/django/contrib/admin/options.py"
> in wrapper
>   541.                 return self.admin_site.admin_view(view)(*args, **kwargs)
> 
> File
> "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/django/utils/decorators.py"
> in _wrapped_view
>   149.                     response = view_func(request, *args, **kwargs)
> 
> File
> "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/django/views/decorators/cache.py"
> in _wrapped_view_func
>   57.         response = view_func(request, *args, **kwargs)
> 
> File
> "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/django/contrib/admin/sites.py"
> in inner
>   244.             return view(request, *args, **kwargs)
> 
> File
> "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/django/contrib/admin/options.py"
> in add_view
>   1437.         return self.changeform_view(request, None, form_url, extra_context)
> 
> File
> "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/django/utils/decorators.py"
> in _wrapper
>   67.             return bound_func(*args, **kwargs)
> 
> File
> "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/django/utils/decorators.py"
> in _wrapped_view
>   149.                     response = view_func(request, *args, **kwargs)
> 
> File
> "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/django/utils/decorators.py"
> in bound_func
>   63.                 return func.__get__(self, type(self))(*args2, **kwargs2)
> 
> File
> "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/django/utils/decorators.py"
> in inner
>   184.                     return func(*args, **kwargs)
> 
> File
> "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/django/contrib/admin/options.py"
> in changeform_view
>   1382.                     self.log_addition(request, new_object, change_message)
> 
> File
> "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/django/contrib/admin/options.py"
> in log_addition
>   714.             object_repr=force_text(object),
> 
> File
> "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/django/utils/encoding.py"
> in force_text
>   80.                 s = six.text_type(bytes(s), encoding, errors)

The Vin model:

class Vin(models.Model):
    nom_vin = models.CharField(max_length=20)
    millesime = models.IntegerField()
    quantity = models.FloatField()
    appelation = models.ForeignKey(Appelation)

    def __str__(self):
        return self.nom_vin

2 Answers
叼着烟拽天下
#2 · 2019-09-17 17:42

Try:

class Vin(models.Model):
    nom_vin = models.CharField(max_length=20)
    millesime = models.IntegerField()
    quantity = models.FloatField()
    appelation = models.ForeignKey(Appelation)

    def __unicode__(self):  # you have __str__ here
        return self.nom_vin
Melony?
#3 · 2019-09-17 17:48

I'm nowhere near understanding the whole encoding rigmarole in Python, but trying to decode this error got me a little closer. Skip to the end if you just want the right solution (hint: python_2_unicode_compatible).

According to Porting to Python 3, in Python 2 you can have both __str__() (which must return a byte string, type 'str') and __unicode__() (which returns a 'unicode' one), but having only one of them should be enough:

The print statement and the str built-in call __str__() to determine the human-readable representation of an object. The unicode built-in calls __unicode__() if it exists, and otherwise falls back to __str__() and decodes the result with the system encoding. Conversely, the Model base class automatically derives __str__() from __unicode__() by encoding to UTF-8.

Python 3, on the other hand, makes Unicode the default for literals and needs only __str__(), which returns a Unicode string of type 'str' (yep, the same name as Python 2's byte strings). Weeeeell, there's also __bytes__(), but you'll probably never need it.
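As a rough illustration of the Python 3 side, here is a minimal sketch that runs without Django. The Wine class is a hypothetical stand-in for the Vin model, not anything from the question:

```python
class Wine:
    """Hypothetical plain-Python stand-in for the Vin model."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        # Python 3: __str__ returns text (type 'str', which is Unicode).
        return self.name

    def __bytes__(self):
        # Rarely needed: an explicit byte representation with a named codec.
        return self.name.encode("utf-8")


wine = Wine("Ch\u00e2teau")  # "\u00e2" is the character from the error
text = str(wine)             # calls __str__
raw = bytes(wine)            # calls __bytes__
print(type(text).__name__, type(raw).__name__)
```

On Python 3, str() never has to guess an encoding here, because the only place a codec is named is the explicit encode() call in __bytes__.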

That's all fine and dandy, until along comes encoding.py and its force_text function, which the admin interface uses:

def force_text(s, encoding='utf-8', strings_only=False, errors='strict'):
    """
    Returns a text object representing 's' -- unicode on Python 2 and str on
    Python 3. Treats bytestrings using the 'encoding' codec.
    """
    [...]
    try:
        if not issubclass(type(s), six.string_types):
            if six.PY3:
                if isinstance(s, bytes):
                    s = six.text_type(s, encoding, errors)
                else:
                    s = six.text_type(s)
            elif hasattr(s, '__unicode__'):
                s = six.text_type(s)
            else:
                s = six.text_type(bytes(s), encoding, errors)

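The error is in the last line of that excerpt. It can only be reached on Python 2 (Python 3 takes the if six.PY3 branch), and only when the object has no __unicode__ method. On Python 2, bytes() is just an alias for str(), so bytes(s) calls the model's __str__(); that returns a unicode object, which Python 2 then implicitly encodes with the ASCII codec rather than UTF-8. You can see the same thing by passing a unicode object (for instance a literal under unicode_literals, which encoding.py uses) straight to bytes(), which expects a byte string: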

$ python2
Python 2.7.11+ (default, Mar 30 2016, 21:00:42) 
# Python 2 uses byte strings (type "str") by default:
>>> bstr = 'â'
>>> type(bstr)
<type 'str'>
>>> bstr
'\xc3\xa2'
>>> bytes(bstr)
'\xc3\xa2'

# unicode_literals changes the default to unicode strings:
>>> from __future__ import unicode_literals
>>> ustr = 'â'
>>> type(ustr)
<type 'unicode'>
>>> ustr
u'\xe2'
# bytes() (str(), actually) expects a byte string, not unicode:
>>> bytes(ustr)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe2' in position 0: ordinal not in range(128)
# We can encode ustr to bytes like so:
>>> bytes(ustr.encode('utf-8'))
'\xc3\xa2'
# Or with the equivalent b prefix, for literals:
>>> bytes(b'â')
'\xc3\xa2'

# bstr has not changed:
>>> bytes(bstr)
'\xc3\xa2'
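The Python 2 failure above can be approximated on Python 3, as a rough sketch: Python 2's str() on a unicode object behaves much like an implicit encode with the ASCII codec, so calling encode("ascii") explicitly reproduces the same kind of error. The variable names are only illustrative.

```python
text = "\u00e2"  # 'â', the character reported in the error message

# Roughly what Python 2 does behind the scenes when str()/bytes() is given
# a unicode object: encode with ASCII, which cannot represent U+00E2.
try:
    text.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc

# Naming the codec explicitly succeeds and round-trips cleanly.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

print(ascii_error)
print(utf8_bytes)
```

The point of the sketch is that UTF-8 in the settings does not help if some code path never names a codec at all and falls back to ASCII.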

For completeness' sake, Python 3 makes Unicode the default literal type, but calls it 'str' (and byte strings are 'bytes'):

$ python3
Python 3.5.1+ (default, Jan 13 2016, 15:09:18) 
>>> ustr='á'
>>> ustr
'á'
>>> type(ustr)
<class 'str'>

>>> bstr='á'.encode('utf-8')
>>> bstr
b'\xc3\xa1'
>>> type(bstr)
<class 'bytes'>

# Note that you can't use `b` to encode a non-ASCII literal:
>>> bstr=b'á'
  File "<stdin>", line 1
SyntaxError: bytes can only contain ASCII literal characters.

Now, it seems you're running Python 2.7 but using the 3.5 libraries! I have no idea how you managed that, but by following Volodymyr's suggestion you've made your code compatible with Python 2 and broken it for Python 3, which needs __str__().

Defining both __str__ and __unicode__ to return self.name in whichever type it happens to be seems to work for both, but it's bound to break sometime, as happened to you, since people don't bother checking the string type (which is a bit puzzling to me). You could write a __str__ that checks the Python version and string type and passes through or encodes the value accordingly, but the Django guys already took care of that:

The Python 2 and 3 compatibility library that Django uses, Six (that's 2*3, get it?), provides python_2_unicode_compatible, a class decorator that

aliases the __str__ method to __unicode__ and creates a new __str__ method that returns the result of __unicode__() encoded with UTF-8.

This is used in Part 2 of the tutorial (docs.djangoproject.com/en/1.9/intro/tutorial02/#playing-with-the-api; sorry, can't make more links):

from django.db import models
from django.utils.encoding import python_2_unicode_compatible

@python_2_unicode_compatible  # only if you need to support Python 2
class Question(models.Model):
    # ...
    def __str__(self):
        return self.question_text
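To make the decorator's described behavior concrete, here is a simplified sketch of the idea, written from the description quoted above rather than copied from Django or Six. The names py2_unicode_compatible and Wine are hypothetical, and the real implementation may differ in detail:

```python
import sys

PY2 = sys.version_info[0] == 2


def py2_unicode_compatible(cls):
    """Simplified, hypothetical stand-in for python_2_unicode_compatible."""
    if PY2:
        if "__str__" not in cls.__dict__:
            raise ValueError("%s must define __str__()" % cls.__name__)
        # Python 2: the text-returning method becomes __unicode__ ...
        cls.__unicode__ = cls.__str__
        # ... and __str__ now returns UTF-8 encoded bytes.
        cls.__str__ = lambda self: self.__unicode__().encode("utf-8")
    # Python 3: nothing to do, __str__ already returns text.
    return cls


@py2_unicode_compatible
class Wine(object):
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name


label = str(Wine(u"Ch\u00e2teau"))
print(type(label).__name__)
```

Under this sketch you write a single __str__ that returns text, and the Python 2 case gets a __unicode__ method, so force_text takes the hasattr(s, '__unicode__') branch instead of the failing one.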

Phew. And with all that, and Ned Batchelder's Pragmatic Unicode (nedbatchelder.com/text/unipain.html), I think I'm starting to get the hang of it. Still, I might just stick to Python 3... or PHP... ^_^
