I'm trying to get a Python 3 program to do some manipulations with a text file filled with information. However, when trying to read the file I get the following error:
Traceback (most recent call last):
  File "SCRIPT LOCATION", line NUMBER, in <module>
    text = file.read()
  File "C:\Python31\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 2907500: character maps to <undefined>
If anyone could give me any help to try and get past this problem I would be most grateful.
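Just to add: in case
file = open(filename, encoding="utf8")
does not work, try
file = open(filename, errors='ignore')
and all is well. Keep in mind that errors='ignore' silently drops every byte that cannot be decoded, so some characters may be missing from the resulting text.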
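To make the effect of the errors argument concrete, here is a minimal sketch. The sample bytes and the temporary file are made up for illustration; they are not the asker's data. It compares errors="ignore" with errors="replace", which keeps a visible marker where a byte could not be decoded.

```python
import os
import tempfile

# Hypothetical sample data: plain text plus the byte 0x90,
# which Python's cp1252 codec cannot decode.
raw = b"caf\x90 menu"

# Write the bytes to a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    # errors="ignore" silently drops the undecodable byte.
    with open(path, encoding="cp1252", errors="ignore") as f:
        ignored = f.read()

    # errors="replace" substitutes U+FFFD, so the loss stays visible.
    with open(path, encoding="cp1252", errors="replace") as f:
        replaced = f.read()
finally:
    os.remove(path)

print(repr(ignored))
print(repr(replaced))
```

If the dropped bytes matter for your manipulations, errors="replace" at least shows where data was lost; finding the real encoding is still the cleaner fix.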
The file in question is not using the CP1252 encoding. It's using another encoding, and which one you have to figure out yourself. Common ones are Latin-1 and UTF-8. Since 0x90 doesn't actually mean anything in Latin-1, UTF-8 (where 0x90 is a continuation byte) is more likely.
You specify the encoding when you open the file:
file = open(filename, encoding="utf8")
As an extension to @LennartRegebro's answer:
If you can't tell what encoding the file uses, the solution above does not work (it's not utf8), and you find yourself merely guessing, there are online tools that you can use to identify the encoding. They aren't perfect but usually work just fine. Once you have figured out the encoding, you should be able to use the solution above.
EDIT: (Copied from comment)
The quite popular text editor Sublime Text has a command to display the encoding, if it has been set. Go to View -> Show Console (or Ctrl+`), type view.encoding() and hope for the best (I was unable to get anything but Undefined, but maybe you will have better luck...).
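If you would rather do the guessing in Python itself, one rough approach is to try a short list of candidate encodings and keep the first that decodes without errors. The function name guess_encoding and the candidate list are assumptions made for this sketch, not part of any library. A clean decode does not prove the guess is right: latin-1 accepts every byte sequence, so it only makes sense as a last resort.

```python
def guess_encoding(data, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate that decodes data without errors, else None."""
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # this candidate cannot represent the bytes; try the next
        return name
    return None


# Hypothetical samples, not taken from the asker's file.
utf8_bytes = "\u0410 test".encode("utf-8")   # valid UTF-8 containing 0x90
cp1252_bytes = b"caf\xe9"                    # invalid UTF-8, valid cp1252
odd_bytes = b"\x90"                          # invalid in UTF-8 and cp1252

print(guess_encoding(utf8_bytes))
print(guess_encoding(cp1252_bytes))
print(guess_encoding(odd_bytes))
```

To apply it to a real file, read the raw bytes first with open(filename, "rb").read() and pass them in, then reopen the file in text mode with the encoding that was returned.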