PyGame and Unicode - a neverending story

Posted 2020-04-20 04:05

Question:

What I do in my code:

First I load a UTF-8 text file (yes, double/triple/quadruple checked: it IS UTF-8) using codecs.open:

import codecs
import os

def load_verbslist(folder, name, utf_encoding):
    # Build the path to the word list inside the "daten" directory.
    fullname = os.path.join("daten", folder, name)
    if utf_encoding:
        # "utf-8-sig" decodes UTF-8 and drops a leading BOM if present.
        with codecs.open(fullname, "r", "utf-8-sig") as infile:
            lines = infile.readlines()
    else:
        with open(fullname, "r") as infile:
            lines = infile.readlines()
    # Strip line endings from every line.
    return [line.strip("\n").strip("\r") for line in lines]

From that file come my solution strings. I later split the lines and encode everything again to blit it to the screen like this:

class BlittedText():
    def __init__(self, number, colour):
        self.number = number
        self.colour = colour
        if self.number == 0:  # infinitive
            self.content = Solution.verb[0]
            self.content = self.content.encode("utf-8")
            self.text = Main.font1.render(str(self.content), 1, self.colour)
            self.pos = (45, 45)

Then I append several of those BlittedText() instances to a list named "Strings".

Then I blit it to the screen:

for element in Strings:
    # Each element is a BlittedText; blit its rendered surface at its position.
    screen.blit(element.text, element.pos)

The result can be seen in this picture: http://img341.imageshack.us/img341/6617/ee43.png

In the Python shell (left side) everything shows correctly, both my inputs (far left) and the strings from the solution TXT file (which definitely, surely, 100% is saved as UTF-8). On the screen, my input (black) blits correctly, while the solution strings (green and red) show weird characters instead of the Unicode characters. I thought I had encoded them properly, but obviously not :/

Can anyone spot my mistake? Where has my thinking gone wrong?

Thank you very much already!

Pat

Answer 1:

You shouldn't encode the string to UTF-8 before rendering. Encoding turns the Unicode string into a regular byte string, and pygame treats byte strings as Latin-1, so every multi-byte UTF-8 sequence is drawn as two or more strange characters.
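
The effect can be reproduced without pygame. The following is a minimal sketch, not the asker's code: the helper name as_latin1_misread and the sample word are made up for illustration. It encodes a string to UTF-8 and then reads those bytes back as Latin-1, which is the kind of misreading that yields the garbled output.

```python
def as_latin1_misread(text):
    """Return what `text` looks like when its UTF-8 bytes are read as Latin-1.

    Hypothetical helper, used only to illustrate the garbling.
    """
    return text.encode("utf-8").decode("latin-1")


word = u"f\u00fcr"  # the German word "fuer", written with u-umlaut
garbled = as_latin1_misread(word)

# The single u-umlaut became two characters: U+00C3 followed by U+00BC.
print(repr(word), len(word))
print(repr(garbled), len(garbled))
```

The three-character word comes back as four characters, displayed as "fÃ¼r". Plain ASCII text survives unchanged, which is why only the accented characters look wrong on screen.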

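If you do need to pass a byte string, encode it to Latin-1 instead. That gets rid of the strange characters, as long as every character in the text exists in Latin-1: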

self.content = self.content.encode("iso-8859-1")
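
Since Latin-1 covers only 256 code points, that workaround raises UnicodeEncodeError for text such as Polish or Cyrillic. A more general approach is probably to skip the encode step and hand the Unicode string to the font directly. The sketch below assumes that font is a pygame.font.Font (or anything with the same render method); the helpers to_text and render_unicode are hypothetical names, not part of pygame or of the asker's program.

```python
def to_text(value, encoding="utf-8"):
    """Return a Unicode string, decoding byte strings with `encoding`.

    Hypothetical helper: values that are already Unicode pass through.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def render_unicode(font, value, colour):
    """Render `value` with `font`, always passing Unicode text.

    `font` is assumed to behave like pygame.font.Font, i.e. to offer
    render(text, antialias, colour).
    """
    return font.render(to_text(value), True, colour)
```

In the question's class this would amount to something like self.text = render_unicode(Main.font1, Solution.verb[0], self.colour), with the encode("utf-8") and str(...) calls removed. The chosen font file still has to contain glyphs for the characters being drawn.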