look at that:
import urllib
print urllib.urlencode(dict(bla='Ã'))
the output is
bla=%C3%83
what I want is simple, I want the output in ascii instead of utf-8, so I need the output:
bla=%C3
if I try:
urllib.urlencode(dict(bla='Ã'.decode('iso-8859-1')))
it doesn't work (all my Python files are UTF-8 encoded):
'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
In production, the input arrives as unicode.
If your input is actually UTF-8 and you want ISO-8859-1 as output (which is not ASCII), what you need is to decode the input from UTF-8 and encode the result as ISO-8859-1.
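A minimal sketch of that conversion, assuming the input arrives as UTF-8 bytes. The helper name to_latin1 is made up for illustration, and the import fallback is only there so the snippet runs on both Python 2 and Python 3:

```python
try:
    from urllib import urlencode        # Python 2
except ImportError:
    from urllib.parse import urlencode  # Python 3


def to_latin1(utf8_bytes):
    """Decode UTF-8 bytes to text, then re-encode the text as ISO-8859-1."""
    return utf8_bytes.decode('utf-8').encode('iso-8859-1')


# b'\xc3\x83' is the UTF-8 byte sequence for the character Ã (U+00C3).
query = urlencode({'bla': to_latin1(b'\xc3\x83')})
print(query)  # bla=%C3
```

Passing already-encoded bytes to urlencode means each byte is percent-escaped as is, so the single ISO-8859-1 byte 0xC3 becomes %C3.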
Have a look at unicode transliteration in python:
In your case:
This is a third party library, which can be easily installed via:
The package unihandecode is one option; used from Python on this input, it prints A.
A pretty well working asciification is this way:
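One common way to do this with only the standard library, which may differ from what this answer had in mind, is to decompose each character with unicodedata and drop whatever does not fit in ASCII. The function name asciify is an assumption for illustration:

```python
import unicodedata


def asciify(text):
    """Strip accents by NFKD-decomposing, then discarding non-ASCII code points."""
    decomposed = unicodedata.normalize('NFKD', text)
    return decomposed.encode('ascii', 'ignore').decode('ascii')


# U+00C3 (Ã) decomposes into 'A' plus a combining tilde; only 'A' survives.
print(asciify(u'\xc3'))  # A
```

Note that this is lossy: characters with no ASCII base letter simply disappear, so it gives A rather than the %C3 the question asks for.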
Thanks for all the solutions. All of you converge on the very same point. I had made a mess by changing code that was already right; after turning back to .encode('iso-8859-1'), it works.
That's not ASCII, which has no characters mapped at 0x80 or above. You're talking about ISO-8859-1, or possibly code page 1252 (the Windows encoding based on it).
Well, that depends on what encoding you've used to save the character Ã in the source, doesn't it? It sounds like your text editor has saved it as UTF-8. (That's a good thing, because locale-specific encodings like ISO-8859-1 need to go away ASAP.)
Tell Python that the source file you've saved is in UTF-8 as per PEP 263:
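A sketch of what that could look like, assuming the file really is saved as UTF-8. The declaration matters on Python 2; Python 3 already treats source files as UTF-8 by default. The import fallback is only there so the snippet runs on both:

```python
# -*- coding: utf-8 -*-
# The line above is the PEP 263 declaration; it must be line 1 or 2 of the file.
try:
    from urllib import urlencode        # Python 2
except ImportError:
    from urllib.parse import urlencode  # Python 3

# With the declaration in place, this unicode literal is one character, U+00C3.
text = u'Ã'
latin1_query = urlencode({'bla': text.encode('iso-8859-1')})
print(latin1_query)  # bla=%C3
```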
Or, if you don't want that hassle, use a backslash escape:
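For example, a sketch that keeps the source file pure ASCII by spelling the character as \xc3 inside a unicode literal (the variable names are illustrative):

```python
try:
    from urllib import urlencode        # Python 2
except ImportError:
    from urllib.parse import urlencode  # Python 3

# u'\xc3' is the code point U+00C3 (Ã), independent of the file's encoding.
escaped_text = u'\xc3'
escaped_query = urlencode({'bla': escaped_text.encode('iso-8859-1')})
print(escaped_query)  # bla=%C3
```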
Although, either way, a modern webapp should be using UTF-8 for its input rather than ISO-8859-1/cp1252.