When to use utf8 as a header in py files

Posted 2019-03-15 02:51

Some source files in code I have downloaded start with the following header:

# -*- coding: utf-8 -*-

I have an idea of what UTF-8 encoding is, but why would it be needed as a header in a Python source file?

Tags: python utf-8
4 answers
霸刀☆藐视天下
#2 · 2019-03-15 03:08

Wherever your code needs to use characters that are not in ASCII, like:

ă

Without the header, the Python 2 interpreter will complain that it does not understand that character.

Usually this happens when you define constants.

Example: put this into x.py

print 'ă'

Then start a Python console:

>>> import x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "x.py", line 1
SyntaxError: Non-ASCII character '\xc4' in file x.py on line 1,
  but no encoding declared;
  see http://www.python.org/peps/pep-0263.html for details
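
To illustrate what "encoding declared" means in that message, here is a minimal sketch for Python 3, where the fallback is UTF-8 rather than ASCII, so the error above would not appear. It uses the standard library's tokenize.detect_encoding, which looks for the coding comment in the first two lines of a source file. The helper name declared_encoding and the sample sources are made up for this example.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding Python 3 would assume for these source bytes.

    Hypothetical helper for this example; it only wraps the standard
    tokenize.detect_encoding call.
    """
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# Three sample "files": with a utf-8 header, with no header, and with a
# windows-1252 header (saved in that encoding).
with_header = "# -*- coding: utf-8 -*-\nprint('ă')\n".encode("utf-8")
without_header = "print('ă')\n".encode("utf-8")
legacy_header = "# -*- coding: windows-1252 -*-\nprint('Ø')\n".encode("windows-1252")

print(declared_encoding(with_header))     # utf-8
print(declared_encoding(without_header))  # utf-8 (the Python 3 default)
print(declared_encoding(legacy_header))   # windows-1252
```

Under Python 2 the second case is the one that fails, because the default there is ASCII.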
Bombasti
#3 · 2019-03-15 03:15

Always use UTF-8, and make sure your editor saves files as UTF-8 too. If you use Python 2.7, start your script like this:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

This is a good blog post by Nick Johnson about Python and UTF-8:

http://blog.notdot.net/2010/07/Getting-unicode-right-in-Python

By the way, that post was written before he could use:

from __future__ import unicode_literals
爱情/是我丢掉的垃圾
#4 · 2019-03-15 03:15

When you use non-ASCII characters. For instance, when I comment my source in Norwegian and the characters ØÆÅ occur in the .py file, the interpreter will complain and not "compile" it.

兄弟一词,经得起流年.
#5 · 2019-03-15 03:17

Whenever text is read or written, encodings come into play. Always. The Python interpreter has to read your file as text in order to understand it. The only situation where you can get away without dealing with encodings is when you only use characters in the ASCII range. In that case the interpreter can assume virtually any encoding in the world and still get it right, because almost all encodings map these characters to the same bytes.
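
As a small sketch of that point in Python 3: ASCII-only source text produces identical bytes under several common encodings, while a single non-ASCII character does not. The sample strings and the particular list of encodings are chosen only for this illustration.

```python
# ASCII-only source: every encoding below produces exactly the same bytes.
ascii_source = "import x\nprint('hello')\n"
encodings = ["ascii", "utf-8", "latin-1", "windows-1252", "koi8-r"]

encoded = {name: ascii_source.encode(name) for name in encodings}
all_same = len(set(encoded.values())) == 1
print(all_same)  # True

# One non-ASCII character is enough for the bytes to depend on the encoding.
non_ascii_source = "print('ă')\n"
differs = non_ascii_source.encode("utf-8") != non_ascii_source.encode("iso-8859-2")
print(differs)  # True
```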

You should not add coding: utf-8 just because you have characters beyond ASCII in your file; doing so can even be harmful. The declaration is a hint that tells the Python interpreter what encoding your file is in. Unless you have configured your text editor to do so, it may well not save your files in UTF-8, and in that case the hint you gave the interpreter is wrong.

So you should use it when your file is actually encoded in UTF-8. If it is encoded in windows-1252, you should use coding: windows-1252, and so on.
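
A rough Python 3 sketch of why a wrong hint is harmful: bytes saved in one encoding and read back as another either fail to decode or come out garbled. The sample text reuses the Norwegian letters from the earlier answer, and the variable names are only illustrative.

```python
text = "ØÆÅ"

# Case 1: the editor saved windows-1252, but the header claims utf-8.
saved_as_cp1252 = text.encode("windows-1252")
try:
    saved_as_cp1252.decode("utf-8")
    wrong_hint_failed = False
except UnicodeDecodeError:
    wrong_hint_failed = True
print(wrong_hint_failed)  # True: these bytes are not valid UTF-8

# Case 2: the editor saved utf-8, but the header claims windows-1252.
saved_as_utf8 = text.encode("utf-8")
garbled = saved_as_utf8.decode("windows-1252")
print(garbled == text)  # False: decoding succeeds but yields mojibake

# Case 3: the header matches the real encoding, so the text round-trips.
print(saved_as_utf8.decode("utf-8") == text)  # True
```

The second case is the more dangerous one, since nothing raises an error and the wrong characters end up silently in your strings.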
