Encoding error with polish charset during transfer

2019-03-06 07:56发布

问题:

I am trying to transfer one of my databases from one host (home.pl) to another (my newly set server). The script that I am trying to transfer is wordpress. Unluckily irrespective of the method used I am struggling with encoding problems.

New host configuration

In my new server I am using the following directives in my.cnf:

[mysql]
default-character-set=utf8

[mysqld]
collation-server = utf8_general_ci
character-set-server = utf8
init_connect='SET collation_connection = utf8_general_ci' 
init_connect='SET NAMES utf8' 

[client]
default-character-set=utf8

My mySQL vars:

character_set_client    utf8
character_set_connection    utf8
character_set_database  utf8
character_set_filesystem    binary
character_set_results   utf8
character_set_server    utf8
character_set_system    utf8
collation_connection    utf8_general_ci
collation_database  utf8_general_ci
collation_server    utf8_general_ci

Php.ini on new server:

; PHP's default character set is set to UTF-8.
; http://php.net/default-charset
default_charset = "UTF-8"

Old host configuration

I have runned SHOW VARIABLES in my old host from which I am trying to transfer database and I got the following:

character_set_client    utf8
character_set_connection    utf8mb4
character_set_database  utf8
character_set_results   utf8
character_set_server    latin2
character_set_system    utf8
/usr/local/pssql55/share/charsets/
collation_connection    utf8mb4_general_ci
collation_database  utf8_polish_ci
collation_server    latin2_general_ci

Transfer methods tried out

1) Transfer via phpmyadmin

I have tried using PHPMYADMIN export/import. In particular I have pointed out UTF-8 as file character set both during export and import via phpmyadmin.

What is strange both in phpmyadmin on source server and new host I don't see polish chars (the output is the same without polish chars).

2) Export / Import via mysql dump

I have tried also to use:

 mysqldump -h OLD_HOST -u OLD_USER -p DB | mysql -h localhost -u root NEW DATABASE

but the encoding also fails.

Tried to use also encoding variables but it also failed:

 mysqldump --default-character-set=latin1  | mysql --default-character-set=utf8 

Dump file

In my dump file using Programers Notepad with UTF-8 encoding set, charcters look like this:

"Ä" instead of "ę"

Opening them in microsoft word I see

Ä™ instead of "ę"

The encoding converter (gżegżółka) recognises that the file is in: C:\Users\mkondej001\Desktop\14271425_mk.sql

Kodowanie: Unicode UTF-8
EOL: LF (Unix) 

Any clues how to transfer DB / set server variables correctly ?

回答1:

At the end I have founded out that the problem was related to the fact that the data was written to SQL incorrectly in my original server.

I ended up with transferring DB using:

mysqldump --default-character-set=utf8 [ORYGINAL_DB] | mysql [TARGET_DB] --default-character-set=utf8

and the executing:

UPDATE [table name] SET [field] = CONVERT(BINARY CONVERT([field] USING latin2) USING utf8)

as it was advices here:

strange character encoding of stored data , old script is showing them fine new one doesn't

Hope that the above solution will be helpful for others too.



回答2:

SET NAMES utf8;

(The default is latin11, which leads to Ä™.)

Note: init_connect is not executed for root (or any SUPER) user. So this failed you:

init_connect='SET NAMES utf8'