I'm trying to raise an exception in Python 2.7.x that includes a unicode string in its message, but I can't seem to make it work.
Is including unicode in an error message unsupported or not recommended? Or do I need to be looking at sys.stderr?
# -*- coding: utf-8 -*-
class MyException(Exception):
    def __init__(self, value):
        self.value = value

    def __str__(self):
        return self.value

    def __repr__(self):
        return self.value

    def __unicode__(self):
        return self.value

desc = u'something bad with field \u4443'

try:
    raise MyException(desc)
except MyException as e:
    print(u'Inside try block : ' + unicode(e))

# here is what i wish to make work
raise MyException(desc)
Running the script produces the output below. Inside my try/except I can print the string without a problem.
My problem is outside the try/except.
Inside try block : something bad with field 䑃
Traceback (most recent call last):
  File "C:\Python27\lib\bdb.py", line 387, in run
    exec cmd in globals, locals
  File "C:\Users\ghis3080\r.py", line 25, in <module>
    raise MyException(desc)
MyException: something bad with field \u4443
Thanks in advance.
This is how Python works. I believe what you are seeing comes from traceback._some_str() in the Python standard library. When a stack trace is printed, that function first tries to convert the message using str(); if that raises an exception, it converts the message using unicode() and then converts the result to ASCII using encode("ascii", "backslashreplace"). You are getting valid output and everything is working correctly. My guess is that Python is doing its best to down-convert the error message so that it displays without problems regardless of the platform executing it. The \u4443 you see is just the Unicode code point of your character. It doesn't happen in your try/except block because this conversion is specific to the mechanism that produces stack traces (for example, for uncaught exceptions).

The behaviour depends on the Python version and the environment. On Python 3 the character encoding error handler for sys.stderr is always 'backslashreplace':

python3:
python2:
That is, on my system the error message is swallowed on Python 2.
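The 'backslashreplace' handling mentioned above can be reproduced in isolation, without relying on any particular terminal. The following is a minimal sketch, not the code Python itself runs: the in-memory ASCII stream is only an assumed stand-in for a stream such as sys.stderr, and the variable names are illustrative.

```python
# -*- coding: utf-8 -*-
import io

desc = u'something bad with field \u4443'

# Escape every character that ASCII cannot represent.
escaped = desc.encode('ascii', 'backslashreplace')

# Assumed stand-in for a text stream like sys.stderr: ASCII-only,
# with 'backslashreplace' as its error handler.
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding='ascii',
                          errors='backslashreplace', newline='\n')
stream.write(desc + u'\n')
stream.flush()
written = raw.getvalue()

print(repr(escaped))
print(repr(written))
```

In both cases the character is turned into its escape sequence instead of raising UnicodeEncodeError, which matches the \u4443 seen in the traceback in the question.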
Note: on Windows you could try:
For comparison:
In my case your example worked as it should and printed the unicode correctly.
Sometimes, though, exception stack traces are printed without the unicode characters, or with them escaped as backslash sequences. It is possible to work around this and print the messages normally.
Example of the problem with output (Python 2.7, Linux):
It prints only a truncated or garbled message:
To actually see the unaltered unicode, you can encode it to raw bytes and feed those into the exception object:
This time you will see the full message:
You can do value.encode('utf-8', 'replace') in your constructor if you like, but with a system exception you will have to do it in the raise statement, as in the example.

The hint is taken from here: Overcoming frustration: Correctly using unicode in python2 (it is a big library with many helpers, and all of them can be stripped down to the example above).
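The constructor variant mentioned above might look roughly like this. It is a sketch under assumptions, not code from the answer: the class name EncodedMessageError is hypothetical, and the encoding is applied only on Python 2, where the traceback conversion is the thing being worked around. On Python 3 the message is left as text.

```python
# -*- coding: utf-8 -*-
import sys

PY2 = sys.version_info[0] == 2


class EncodedMessageError(Exception):
    """Hypothetical exception that stores its message as UTF-8 bytes on Python 2."""

    def __init__(self, value):
        # `unicode` is only looked up on Python 2, because `and` short-circuits.
        if PY2 and isinstance(value, unicode):  # noqa: F821
            value = value.encode('utf-8', 'replace')
        Exception.__init__(self, value)


err = EncodedMessageError(u'something bad with field \u4443')
message = str(err)  # a native str on both Python 2 and Python 3
```

Raising EncodedMessageError(desc) instead of MyException(desc) hands the traceback machinery a byte string that str() can return unchanged. Whether the character then displays correctly still depends on the terminal accepting UTF-8.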