I am using Python to parse a JSON file. I know it is because of this ¥ that I got this error when I was using json.loads:
    UnicodeDecodeError: 'utf8' codec can't decode byte 0xa5 in position 106: invalid start byte
But how do I get around it? Do I decode and encode again?
¥ is the Chinese currency sign, but I am not sure which encoding it belongs to.
Thanks!
update:
====================
I think my question should be: if you see this symbol, how do you guess the encoding?
An answer to this question may be: if you see ¥, then "utf-8" won't work; try "latin-1" instead. Is this understanding correct?
The real answer is that, in the general case, you cannot determine the encoding of an unknown piece of data. Given context, such as English text, you can sometimes guess, e.g., that

    c?rrupted

has had "o" replaced by "?", but without that sort of context you can't even tell which bytes are wrong.

For your specific example, you are asking the question the wrong way around. If you see a yen sign, which encoding are you using to look at the data? If it's Latin-1, then you are looking at a byte value of 0xA5. That value can be looked up in other encodings' tables; you could be looking at any of v, ¥, ¸, Ë, Í, Ñ, Ą, ą, ċ, Ĩ, Ľ, ź, Β, Ξ, ξ, Ѕ, Ц, е, Ґ, Ҙ, ح, ٪, ۴, ฅ, „, •, ₯, ╔, ﺄ, or a fragment of a multi-byte encoding.
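You can do the lookup directly in Python; here is a sketch (the particular encodings chosen are arbitrary examples):

    # Decode the single byte 0xA5 under a handful of encodings: each
    # single-byte encoding maps it to some character; in UTF-8 it is an
    # invalid start byte, and in a multi-byte encoding such as GBK it is
    # only a fragment of a character, so decoding fails.
    for enc in ("latin-1", "cp437", "iso8859_7", "koi8_r", "utf-8", "gbk"):
        try:
            print(enc, "->", b"\xa5".decode(enc))
        except UnicodeDecodeError as exc:
            print(enc, "->", exc)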
If the program or organization which produced the unknown data is available, you can talk to people and/or experiment with the software; but if an authoritative answer can't be found, you end up just guessing, or giving up.
There is a reason modern formats require a known encoding and reject input that clearly violates it.
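Python 3's json module behaves this way itself: given bytes, json.loads assumes a Unicode encoding as the JSON spec requires, so a stray 0xA5 reproduces exactly the error above. A sketch (the sample document is made up):

    import json

    # json.loads accepts bytes but insists they decode as UTF-8
    # (or UTF-16/32, detected per the JSON spec); the lone 0xA5 fails.
    try:
        json.loads(b'{"price": "\xa5100"}')
    except UnicodeDecodeError as exc:
        print(exc)  # 'utf-8' codec can't decode byte 0xa5 ...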
The problem was solved by decoding the data explicitly before handing it to json.loads.
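A minimal sketch of that fix, assuming the JSON is read from a file (data.json is a placeholder name) and that the bytes are Latin-1 encoded, as discussed above:

    import json

    # Read the raw bytes and decode them explicitly as Latin-1,
    # since the data is not valid UTF-8, then parse the text.
    with open("data.json", "rb") as f:
        text = f.read().decode("latin-1")

    data = json.loads(text)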
I was confused about the encoding; the source did not specify it clearly.