I am using RMySQL to connect to an aws MySQL server. It works, except character values are deformed. This question has been asked before but the fixes don't seem to work for me. Here's what I'm doing:
Make sure no connections are open:
dbListConnections(MySQL()) list()
Make sure my connection is set to use UTF-8:
dbGetQuery(credentials, "show variables like 'character_set%'")
Variable_name Value
1 character_set_client utf8
2 character_set_connection utf8
3 character_set_database utf8
4 character_set_filesystem utf8
5 character_set_results utf8
6 character_set_server utf8
7 character_set_system utf8
8 character_sets_dir /rdsdbbin/mysql-5.5.40.R1/share/charsets/
Fetch data:
data <- dbGetQuery(credentials, Query) head(data) keyword_ja 1 \036 2 \036蜀ャ 3 \036螟\x8f 4 \036譌・譛ャ莠コ 5 \037繧、繝ゥ繧ケ繝\x88 6 \037蜿守ゥォ
When I write this data to disk Excel shows the same deformed characters, but notepad++ can somehow show the japanese as it's intended:
"keyword_ja"
"冬" "夏" "日本人" "イラスト" "収穫"
I've been trying to use functions like Encoding() and enc2utf8() in R to get it to display the characters correctly as notepad++ does, with no success.
Encoding(head(data$keyword_ja))
[1] "unknown" "unknown" "unknown" "unknown" "unknown" "unknown"
enc2utf8(head(data$keyword_ja))
[1] "\036" "\036蜀ャ" "\036螟<8f>" "\036譌・譛ャ莠コ" "\037繧、繝ゥ繧ケ繝<88>" "\037蜿守ゥォ"
I can normally type japanese characters and R has no problem displaying them
Sys.getlocale() [1] "LC_COLLATE=Japanese_Japan.932;LC_CTYPE=Japanese_Japan.932;LC_MONETARY=Japanese_Japan.932;LC_NUMERIC=C;LC_TIME=Japanese_Japan.932" mystring <- "日本語入力できる" mystring [1] "日本語入力できる" Encoding(mystring) [1] "unknown"
I'm pretty desperate to figure this out so any help is very much appreciated. Please let me know if I can provide additional information.