I have to parse a German date in MONTH YEAR format, where MONTH is the full name of the month. I set the appropriate locale in Python and then try to parse the date with strptime. For example:
import locale
from datetime import datetime
locale.setlocale(locale.LC_ALL, "deu_deu")  # Locale name on Windows
datetime.strptime(dt, "%B %Y")  # dt holds the date string pulled from the XML
On encountering a month with a non-ASCII character in its name, I get a UnicodeEncodeError. The date is being pulled from an XML file delivered via a web service. Is there a way I can transform my date string so that it works with strptime?
EDIT: Encoding the string before parsing worked:
datetime.strptime(dt.encode("iso-8859-16"), "%B %Y")
Not an answer, just a test (on Unix, though):
The above works as expected. Now simulate Unicode input. März contains a LATIN SMALL LETTER A WITH DIAERESIS:
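To make the point about the non-ASCII character concrete, here is a small sketch. It is written for Python 3, whereas this thread dates from Python 2, and the variable names are illustrative only. It shows the name of the offending character and that a plain ASCII conversion of the month name fails, which is roughly what happens in Python 2 when a unicode object is passed where a byte string is expected.

```python
import unicodedata

month = "M\u00e4rz"  # "März" as a Unicode string, as a parser might deliver it

# Collect the Unicode names of all non-ASCII characters in the month name.
non_ascii = [ch for ch in month if ord(ch) > 127]
char_names = [unicodedata.name(ch) for ch in non_ascii]

# An ASCII conversion cannot represent the umlaut and raises UnicodeEncodeError.
try:
    month.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```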
The same can be achieved with the built-in unicode function:
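The `unicode` built-in exists only in Python 2. As a rough Python 3 counterpart (an assumption about what a modern reader would use, not part of the original test), `str(bytes, encoding)` and `bytes.decode` both turn an encoded byte string into text:

```python
# UTF-8 encoded bytes for "März", as they might arrive over the wire.
raw = b"M\xc3\xa4rz"

# Two equivalent ways to decode the bytes into a text string.
via_str = str(raw, "utf-8")
via_decode = raw.decode("utf-8")
```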
Now try with appropriate encoding:
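As a sketch of what "appropriate encoding" means here (Python 3 syntax, illustrative names), the same month name can be encoded explicitly with several codecs. Which one is appropriate depends on the encoding the active locale expects; on Python 2 the byte string passed to strptime presumably has to match it.

```python
month = "M\u00e4rz"  # "März" as a Unicode string

# Encode explicitly instead of relying on an implicit ASCII conversion.
as_utf8 = month.encode("utf-8")           # the umlaut takes two bytes
as_latin1 = month.encode("iso-8859-1")    # the umlaut takes one byte
as_latin10 = month.encode("iso-8859-16")  # the codec used in the question's EDIT

# Each byte string decodes back to the original text with the matching codec.
round_trips = [
    as_utf8.decode("utf-8"),
    as_latin1.decode("iso-8859-1"),
    as_latin10.decode("iso-8859-16"),
]
```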
Again, this is not on Windows, so it is not really an answer, but it may contain a hint.
Just to investigate a bit more, here is a scenario where one deals with an external data source (using JSON for this example, YMMV for XML):
I think a proper JSON parser will give you unicode, and RFC4627 seems to hint at that:
So to simulate that with Python (nobody would parse JSON that way, this is just a simulation):
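For comparison with the simulation, a real parser such as the standard-library `json` module does hand back Unicode text. The sketch below uses Python 3, and the payload and its `date` key are made up for illustration rather than taken from any actual web service.

```python
import json

# Hypothetical payload with the umlaut written as a JSON \u escape.
payload = '{"date": "M\\u00e4rz 2010"}'

data = json.loads(payload)
dt = data["date"]

# The parser returns a text (Unicode) string containing the real character.
is_text = isinstance(dt, str)
```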
The following code solves the problem.
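Separately, and not as the code referred to above, one way to sidestep the locale and encoding question entirely is to map the German month names yourself. This is a different technique from the locale-based strptime call in the question. The names `GERMAN_MONTHS` and `parse_german_month_year` are hypothetical, and the sketch assumes Python 3 and input of exactly the form "MONTH YEAR".

```python
from datetime import datetime

# Hypothetical lookup table: full German month names to month numbers.
GERMAN_MONTHS = {
    "Januar": 1, "Februar": 2, "M\u00e4rz": 3, "April": 4,
    "Mai": 5, "Juni": 6, "Juli": 7, "August": 8,
    "September": 9, "Oktober": 10, "November": 11, "Dezember": 12,
}

def parse_german_month_year(text):
    """Parse 'MONTH YEAR' (e.g. 'März 2010') into a datetime on the 1st of that month."""
    month_name, year = text.split()
    # An unknown month name raises KeyError; a non-numeric year raises ValueError.
    return datetime(int(year), GERMAN_MONTHS[month_name], 1)
```

Like strptime with "%B %Y", this returns the first day of the month, and it behaves the same on Windows and Unix because no locale is consulted.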