UnicodeDecodeError: 'charmap' codec can

Posted 2018-12-31 17:49

Question:

I'm trying to get a Python 3 program to do some manipulations with a text file filled with information. However, when trying to read the file, I get the following error:

    Traceback (most recent call last):
      File "SCRIPT LOCATION", line NUMBER, in <module>
        text = file.read()
      File "C:\Python31\lib\encodings\cp1252.py", line 23, in decode
        return codecs.charmap_decode(input,self.errors,decoding_table)[0]
    UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 2907500: character maps to <undefined>

If anyone could give me any help to try and get past this problem I would be most grateful.

Answer 1:

The file in question is not using the CP1252 encoding; it's using another one, which you will have to figure out yourself. Common ones are Latin-1 and UTF-8. Since 0x90 doesn't actually mean anything in Latin-1 (it is an unprintable control code there), UTF-8 (where 0x90 is a continuation byte) is more likely.

You specify the encoding when you open the file:

    file = open(filename, encoding="utf8")
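
To make the failure and the fix concrete, here is a minimal, self-contained sketch. The sample text and the temporary file are assumptions made up for illustration (they are not the asker's data); the Cyrillic letter U+0410 is chosen only because its UTF-8 encoding happens to contain the byte 0x90.

```python
import os
import tempfile

# Hypothetical sample: "café" plus U+0410, which UTF-8 encodes as 0xD0 0x90.
sample = "caf\u00e9 \u0410"
raw = sample.encode("utf-8")

# Write the raw bytes to a temporary file so we can reopen it in text mode.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(raw)

# Reading with the wrong codec: cp1252 has no character at 0x90.
try:
    with open(path, encoding="cp1252") as f:
        f.read()
    cp1252_failed = False
except UnicodeDecodeError as exc:
    cp1252_failed = True
    print("cp1252 failed:", exc)

# Reading with the encoding the file was actually written in.
with open(path, encoding="utf-8") as f:
    text = f.read()

os.remove(path)
print(text == sample)
```

Using `with` also closes the file automatically, which the one-line `open()` call above leaves to the caller.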


Answer 2:

As an extension to @LennartRegebro's answer:

If you can't tell what encoding the file uses, the solution above does not work (it's not utf8), and you find yourself merely guessing, there are online tools you can use to identify the encoding. They aren't perfect, but they usually work just fine. Once you have figured out the encoding, you should be able to use the solution above.
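
If you would rather script the guessing, one rough approach is to try a short list of candidate encodings and keep the first that decodes cleanly. The function name `guess_encoding` and the candidate list below are hypothetical choices for this sketch, not a standard API. Keep in mind that a clean decode does not prove the guess is right: Latin-1 accepts every byte, so it has to go last and acts only as a fallback.

```python
def guess_encoding(data, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate that decodes `data` without error, else None."""
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # this codec rejects the bytes, try the next one
        return name
    return None


# Illustrative byte strings (assumptions, not real file contents).
utf8_bytes = "\u0410".encode("utf-8")        # b"\xd0\x90"
cp1252_bytes = "caf\u00e9".encode("cp1252")  # b"caf\xe9"

print(guess_encoding(utf8_bytes))
print(guess_encoding(cp1252_bytes))
```

For a real file, read it in binary mode first (`open(filename, "rb").read()`) and pass the resulting bytes to the function.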

EDIT: (Copied from comment)

Sublime Text, a quite popular text editor, has a command to display a file's encoding, if it has been set:

  1. Go to View -> Show Console (or Ctrl+`)

  2. Type view.encoding() into the field at the bottom and hope for the best (I was unable to get anything but Undefined, but maybe you will have better luck...)



Answer 3:

Just to add: in case file = open(filename, encoding="utf8") does not work, try file = open(filename, errors='ignore'). Note that this silently drops any bytes that cannot be decoded.
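
To see what the `errors` argument actually does to the data, here is a small sketch using `bytes.decode`, which accepts the same `errors` values as `open()`. The byte string is a made-up example containing the 0x90 byte from the question.

```python
# Hypothetical input: ASCII text with one byte (0x90) that cp1252 cannot map.
raw = b"abc\x90def"

# errors="ignore" drops the undecodable byte without any warning.
ignored = raw.decode("cp1252", errors="ignore")

# errors="replace" substitutes U+FFFD, so the damage stays visible.
replaced = raw.decode("cp1252", errors="replace")

print(repr(ignored))
print(repr(replaced))
```

If losing data silently is a concern, `errors="replace"` may be the safer choice, since the replacement character marks where something went wrong.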

All IS WELL