I have this scenario.
A movie title:
$ title = "La leyenda de Osaín"
With this encoding:
$ title.encoding.name
>> UTF-8
I then save it to the database.
$ movie = Movie.create!(:title => title)
Then I try to get the movie.
$ Movie.find(movie.id).title.encoding.name
>> "ASCII-8BIT"
$ Movie.find(movie.id).title
>> "La leyenda de Osa\xC3\xADn"
All other movies work, as long as their titles do not contain special characters like í and û.
This is my database.yml file:
development:
  adapter: mysql
  database: development
  username: linus
  password: my_password
  socket: /tmp/mysql.sock
  encoding: UTF8
I get the correct data when I use force_encoding.
$ Movie.find(movie.id).title.force_encoding("UTF-8")
>> "La leyenda de Osaín"
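The behaviour above can be reproduced without a database. This is a minimal sketch, assuming the old adapter returned the stored UTF-8 bytes unchanged and merely labelled them as binary (ASCII-8BIT); the variable names are made up for illustration. It shows that force_encoding only changes the label and never converts the bytes.

```ruby
# encoding: utf-8
# Sketch: simulate a UTF-8 string that comes back tagged as binary.

title = "La leyenda de Osaín"

# Assumption: this mimics what the old adapter returned, i.e. the same
# bytes carrying the ASCII-8BIT (binary) label.
raw = title.dup.force_encoding(Encoding::BINARY)

same_bytes = (raw.bytes == title.bytes) # relabelling does not touch the bytes
i_acute    = "í".bytes                  # the two bytes displayed as \xC3\xAD

# force_encoding only sets the label back to UTF-8; no conversion happens.
fixed = raw.dup.force_encoding("UTF-8")

puts raw.encoding == Encoding::BINARY
puts same_bytes
puts fixed == title
puts fixed.valid_encoding?
```

Because nothing is converted, force_encoding is only a workaround here: it has to be applied to every string read from the database.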
I'm using Rails 3.0.5.rc1 with MySQL 14.14.
Does anyone know what the problem might be?
I found a solution to my problem.
Now I'm using the newer mysql2 gem.
I replaced gem "mysql" with gem "mysql2" inside the Gemfile.
Then I changed the database adapter inside the database.yml file.
From:
development:
  adapter: mysql
  database: development
  username: linus
  password: my_password
  socket: /tmp/mysql.sock
  encoding: UTF8
To:
development:
  adapter: mysql2
  database: development
  username: linus
  password: my_password
  socket: /tmp/mysql.sock
  encoding: UTF8
I think this is what made the difference in my case:
Taken from the mysql2 README on GitHub:
[...] It also forces the use of UTF-8 [or binary] for the connection [and all strings in 1.9 [...]]
According to this link, Rails scaffolding creates varchar(255) columns in MySQL. The MySQL documentation says the following about VARCHAR(255):
For example, a VARCHAR(255) column can
hold a string with a maximum length of
255 characters. Assuming that the
column uses the latin1 character set
(one byte per character), the actual
storage required is the length of the
string (L), plus one byte to record
the length of the string.
My guess is that the column's character set in the database doesn't support characters that are represented by more than one byte. This link has more information about common pitfalls in Rails when dealing with Unicode strings. More specifically, it says you need to create your database as utf8, like so:
CREATE DATABASE my_web_two_zero_development DEFAULT CHARSET utf8;
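The one-byte-per-character point from the MySQL documentation quote can be checked in plain Ruby. This is only a sketch: it uses the title from the question and Ruby's ISO-8859-1 encoding as a stand-in for MySQL's latin1 (an assumption, since the two are close but not identical), and the variable names are illustrative.

```ruby
# encoding: utf-8
# Sketch: compare character count and byte count for the same title.

title = "La leyenda de Osaín"

char_count   = title.length                          # characters
utf8_bytes   = title.bytesize                        # bytes when stored as UTF-8
latin1_bytes = title.encode("ISO-8859-1").bytesize   # bytes in a one-byte-per-character set

puts "characters:   #{char_count}"
puts "UTF-8 bytes:  #{utf8_bytes}"
puts "latin1 bytes: #{latin1_bytes}"
```

The í takes two bytes in UTF-8 but one in the single-byte character set, so the two byte counts differ by exactly one for this title.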