I have a Rails 3.0.9 application running both locally in my dev environment and remotely as a Heroku app. I have a method that imports a CSV file into a model, and this file can contain non-English characters such as °, á, é, í, etc. (it's in Spanish).
I am currently able to import the complete file (75k records) without any problems into my local dev (SQLite) database; but when I upload the database to Heroku with heroku db:push, it fails with the error in the title:
!!! Caught Server Exception
HTTP CODE: 500
Taps Server Error: PGError: ERROR: invalid byte sequence for encoding "UTF8": 0xba
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
Apparently, Heroku has issues inserting the '°' character. (At the moment the file doesn't have any á, é, í, etc. characters, but I suspect these might fail too.)
I have set the default encoding in my application.rb file, as follows:
#.../application.rb
config.encoding = "utf-8"
What else can I do to set the 'client encoding' and solve this problem?
The ordinal indicator, º, is 0xBA in ISO-8859-1, not UTF-8. So your CSV file is encoded as Latin-1, but you're trying to store it in your database as UTF-8 without fixing the encoding. You can try telling your CSV library that it is dealing with Latin-1 encoded text, and maybe it will take care of converting to UTF-8. If that doesn't work, then you can do it yourself with Iconv:
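Here's a minimal sketch of both options, assuming Ruby 1.9's standard CSV library; the file name products.csv and the variable names are placeholders, not anything from your app:

# Ruby 1.9: the "ISO-8859-1:UTF-8" encoding string tells CSV to read Latin-1
# from disk and hand you UTF-8 strings.
require 'csv'
CSV.foreach("products.csv", :encoding => "ISO-8859-1:UTF-8") do |row|
  # row values are UTF-8 here and safe to insert into PostgreSQL
end

# On Ruby 1.8, or if you'd rather convert by hand, Iconv does the same transcoding:
require 'iconv'
latin1_text = File.read("products.csv")
utf8_text   = Iconv.conv("UTF-8", "ISO-8859-1", latin1_text)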
You're not having trouble with SQLite because SQLite tends to be very forgiving and has a very loose type system. PostgreSQL, OTOH, tends to be rather strict and properly complains if you try to feed it invalid data. I'd recommend that you stop developing on top of SQLite if you're going to be deploying to Heroku and PostgreSQL; there are other differences that will cause problems (the behavior of GROUP BY and LIKE, for example), as sketched below.
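A rough sketch of those two differences, using a hypothetical Product model (the model and column names are placeholders):

# LIKE is case-insensitive in SQLite (for ASCII) but case-sensitive in PostgreSQL:
Product.where("name LIKE ?", "%acero%")   # matches "Acero" on SQLite, not on PostgreSQL
Product.where("name ILIKE ?", "%acero%")  # ILIKE is PostgreSQL's case-insensitive variant

# PostgreSQL rejects selected columns that are neither grouped nor aggregated;
# SQLite silently returns an arbitrary value for them instead of raising an error:
Product.select("category, COUNT(*) AS total").group("category")  # works on both
Product.select("name, category").group("category")               # errors on PostgreSQL only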