Disclaimer: I've already done a lot of research to solve this on my own, but most of the questions I found here concern Python 2.7 or don't solve my problem.
Let's say I have the following (the example comes from the BeautifulSoup docs; I'm trying to solve a bigger issue):
>>> markup = "<h1>Sacr\xc3\xa9 bleu!</h1>"
>>> print(markup)
<h1>SacrÃ© bleu!</h1>
As I see it, markup should have been a bytes object, so that I could do:
>>> markup = b"<h1>Sacr\xc3\xa9 bleu!</h1>"
>>> print(str(markup, 'utf-8'))
<h1>Sacré bleu!</h1>
Great! But how do I get from the str "<h1>Sacr\xc3\xa9 bleu!</h1>", which is wrong, to the bytes b"<h1>Sacr\xc3\xa9 bleu!</h1>"?
Because if I do:
>>> markup = "<h1>Sacr\xc3\xa9 bleu!</h1>"
>>> bytes(markup, "utf-8")
b'<h1>Sacr\xc3\x83\xc2\xa9 bleu!</h1>'
You see? It inserted \x83\xc2 for free.
And without an encoding argument it fails:
>>> print(bytes(markup))
TypeError: string argument without an encoding
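To make the symptom concrete, here is a minimal sketch of what seems to be going on. It assumes the str really holds the two separate code points U+00C3 and U+00A9 rather than the single character é; the names as_utf8, raw and fixed are only for illustration. UTF-8 encodes each of those code points as two bytes, which would explain the extra \x83\xc2, whereas Latin-1 maps code points 0 to 255 onto the byte with the same value.

```python
# Assumption: the str contains the code points U+00C3 and U+00A9 (mojibake),
# not the single character U+00E9.
markup = "<h1>Sacr\xc3\xa9 bleu!</h1>"

# UTF-8 needs two bytes for every code point from U+0080 to U+07FF,
# so U+00C3 becomes c3 83 and U+00A9 becomes c2 a9.
as_utf8 = markup.encode("utf-8")

# Latin-1 maps each code point 0-255 to the byte of the same value,
# which gives back the bytes the escapes were presumably meant to denote.
raw = markup.encode("latin-1")

# Those bytes are valid UTF-8 and decode to the intended text.
fixed = raw.decode("utf-8")

print(as_utf8)
print(raw)
print(fixed)
```

This only works when every character in the str is below U+0100; anything above that would make the Latin-1 encode step raise UnicodeEncodeError.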