Where to fix character encoding issues: before db

Posted 2019-05-31 01:25

Question:

I'm working on a web project with an MSSQL backend that is not under my control, apart from reading from it. A lot of the data is littered with garbage characters in place of trademark symbols, registered-trademark symbols, and quotes.

The HTML I'm rendering is set to UTF-8.

Is this encoding problem something that should be taken care of when the data is inserted into the database?

Answer 1:

The problem should be fixed at the point where it goes wrong. If you try to handle it anywhere else, you are not fixing the problem; you are only trying to recover data that has already been lost.

When you handle encoded text, you have to do it correctly all the way. If you encode it or decode it incorrectly at one point in the process, you can't reliably fix it at any other point.
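To illustrate why one wrong step cannot be reliably undone later, here is a minimal Python sketch. The sample string and the choice of Windows-1252 (`cp1252`) as the mismatched encoding are assumptions for illustration only; the encodings actually involved in the database are not known from the question.

```python
# Hypothetical sample text containing a trademark and a registered symbol.
original = "Acme\u2122 Widget\u00ae"

# Correct handling: encode and decode with the same encoding.
utf8_bytes = original.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# Wrong decode: UTF-8 bytes read as Windows-1252 turn each symbol
# into two or three unrelated characters (mojibake).
mojibake = utf8_bytes.decode("cp1252")

# Lossy step: Windows-1252 bytes read as UTF-8 with replacement.
# Both symbols become U+FFFD, so the original characters are gone.
lossy = original.encode("cp1252").decode("utf-8", errors="replace")

print(round_trip == original)  # True
print(repr(mojibake))
print(repr(lossy))
```

In the `mojibake` case the original bytes still exist and could in principle be recovered, but in the `lossy` case both symbols have collapsed into the same replacement character and nothing downstream can tell them apart.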

You have to find out where the text gets encoded or decoded incorrectly by examining what's happening to the data, and apply the fix there.
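One way to examine the data is to look at the exact code points in a value at each stage (as stored, as returned by the driver, as written to the page). The sketch below assumes Python; `describe` and `looks_like_utf8_read_as_cp1252` are hypothetical helper names, and the UTF-8/Windows-1252 mix-up they check for is only one common possibility, not a diagnosis of this particular database.

```python
import unicodedata


def describe(text):
    """List each character's code point and Unicode name."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


def looks_like_utf8_read_as_cp1252(text):
    """Return True if the text is consistent with UTF-8 bytes decoded as cp1252.

    The check reverses the suspected mistake: encode back to cp1252 and
    decode as UTF-8. If that succeeds and changes the text, the suspicion
    is plausible. It is a heuristic, not proof.
    """
    try:
        repaired = text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return False
    return repaired != text


# Hypothetical garbled value: three characters where one symbol was expected.
garbled = "\u00e2\u201e\u00a2"

for code_point, name in describe(garbled):
    print(code_point, name)

print(looks_like_utf8_read_as_cp1252(garbled))
```

If the value is already garbled when it comes out of the database, the mistake happened at or before insertion; if it is clean there and garbled in the page, the mistake is in the read or render path.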