I am using Python 2.7 to read data from a MySQL table. In MySQL the name looks like this:
Garasa, Ángel.
But when I print it in Python the output is:
Garasa, �ngel
The character set name in MySQL is utf8. This is my Python code:
# coding: utf-8
import MySQLdb

connection = MySQLdb.connect(host="localhost", user="root", passwd="root", db="jmdb")
cursor = connection.cursor()
cursor.execute("select * from actors where actorid=672462;")
data = cursor.fetchall()
for row in data:
    print "IMDB Name=", row[4]
    wiki = "".join(row[4])
    print wiki
I have tried decoding it, but I get errors such as:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xc1 in position 8: invalid start byte
I have read about decoding and UTF-8 but couldn't find a solution.
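
The error can be reproduced without a database, which helps narrow down the encoding. The following is only a minimal sketch: it assumes the driver handed back the name as the raw bytes shown (reconstructed from the traceback, where byte 0xc1 sits at position 8), and try_decodings is a hypothetical helper, not part of MySQLdb.

```python
# -*- coding: utf-8 -*-

# Assumption: these are the bytes the driver returned, inferred from the
# traceback (byte 0xc1 at position 8).
raw = b"Garasa, \xc1ngel"


def try_decodings(data, encodings=("utf-8", "cp1252", "latin-1")):
    """Return a dict mapping each encoding to the decoded text, or None on failure."""
    results = {}
    for name in encodings:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            # The bytes are not valid in this encoding.
            results[name] = None
    return results


results = try_decodings(raw)
for name in sorted(results):
    print("%s -> %r" % (name, results[name]))
```

Under that assumption the bytes are not valid UTF-8, while both single-byte encodings decode them to the expected name. That suggests the data arrived in a single-byte encoding even though the table itself is utf8.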

I think the right character mapping in your case is cp1252.

EDIT: It could also be latin-1. The cp1252 and latin-1 code pages are identical for all codes except the range 128 to 159, and the byte 0xc1 from your error message is Á in both, so either one decodes this name correctly.

Get the MySQL driver to return Unicode strings instead. This means that you don't have to deal with decoding in your code.
Simply set use_unicode=True in the connection parameters. If the table has been set with a specific encoding, then set the charset attribute accordingly.
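
A sketch of what that connection might look like, reusing the credentials from the question. The charset value "utf8" is an assumption based on the table's stated character set, and open_connection and fetch_actor_name are hypothetical helper names introduced only for illustration.

```python
# -*- coding: utf-8 -*-


def open_connection():
    """Connect so that text columns come back as unicode objects."""
    # Imported here so fetch_actor_name can be used without the driver installed.
    import MySQLdb
    return MySQLdb.connect(
        host="localhost", user="root", passwd="root", db="jmdb",
        use_unicode=True,  # return unicode objects instead of byte strings
        charset="utf8",    # assumption: matches the table's character set
    )


def fetch_actor_name(connection, actor_id):
    """Return column 4 of the matching actors row, or None if no row matches."""
    cursor = connection.cursor()
    # Pass the id as a parameter rather than formatting it into the SQL string.
    cursor.execute("select * from actors where actorid=%s", (actor_id,))
    row = cursor.fetchone()
    return row[4] if row is not None else None


# Usage, given a running MySQL server with the jmdb database:
# connection = open_connection()
# print(fetch_actor_name(connection, 672462))
```

With a connection opened this way, row[4] should already be a unicode object, so no manual decode call is needed before printing or joining it.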