I need to insert a series of names (like 'Alam\xc3\xa9') into a list, and than I have to save them into a SQLite database.
I know that I can render these names correctly by tiping:
print eval(repr(NAME)).decode("utf-8")
But I have to insert them into a list, so I can't use the print
Other way for doing this without the print?
Lots and lots of misconceptions here.
The string you quote is not Unicode. It is a byte string, encoded in UTF-8.
You can convert it to Unicode by decoding it:
unicode_name = name.decode('utf-8')
When you print the value of unicode_name
to the console, you will see one of two things:
>>> unicode_name
u'Alam\xe9'
>>> print unicode_name
Alamé
Here, you can see that just typing the name and pressing enter shows a representation of the Unicode code points. This is the same as typing print repr(unicode_name)
. However, doing print unicode_name
prints the actual characters - ie behind the scenes, it encodes it to the correct encoding for your terminal, and prints the result.
But this is all irrelevant, because Unicode strings can only be represented internally. As soon as you want to store it in a database, or a file, or anywhere, you need to encode it. And the most likely encoding to choose is UTF-8 - which is what it was in originally.
>>> name
'Alam\xc3\xa9'
>>> print name
Alamé
As you can see, using the original non-decoded version of the name, repr
and print
once again show the codes and the characters. So it's not that converting it to Unicode actually makes it any more "really" the correct character.
So, what to do if you want to store it in a database? Nothing. Nothing at all. Sqlite accepts UTF-8 input, and stores its data in UTF-8 format on the disk. So there is absolutely no conversion needed to store the original value of name
in the database.
Are you looking for something like this?
[n.decode("utf-8") for n in ['Alam\xc3\xa9', 'Alam\xc3\xa9', 'Alam\xc3\xa9']]