It seems like Nokogiri has a problem with UTF-8 conversion of the nbsp (non-breaking space) character. From what I've gathered, this is an issue in libxml2. Nokogiri recommends upgrading to libxml2 2.7.7, but Heroku is running 2.7.6.
Does anyone know how I can use libxml2 2.7.7 (or higher) on Heroku?
The problem is as follows:
doc = Nokogiri::HTML("<html><p>Hi Hello</p></html>")
doc.inner_html
=> "<html><body><p>Hi Hello</p></body></html>"
doc.inner_html = "<p>Hello World</p>"
=> "<p>Hello World</p>"
doc.inner_html
=> "<p>Hello World</p>"
Looks like this is related: https://github.com/sparklemotion/nokogiri/issues/306
This doesn't happen on my local machine. Rails has config.encoding set to 'utf-8', and the rendered page has a UTF-8 charset meta tag.
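For reference, here is a minimal stdlib-only sketch of what the nbsp character looks like at the byte level, and of what happens when its UTF-8 bytes are read as ISO-8859-1. That kind of misreading is a common symptom of an encoding mismatch, though it may not be exactly what libxml2 2.7.6 does. The `escape_nbsp` helper is hypothetical and not part of Nokogiri; whether it avoids the behaviour on Heroku is untested.

```ruby
# U+00A0 (non-breaking space) as a UTF-8 string.
nbsp = "\u00A0"

# In UTF-8 the character is encoded as two bytes.
puts nbsp.bytes.map { |b| format("%02X", b) }.join(" ") # prints "C2 A0"

# Reinterpret the same two bytes as ISO-8859-1, then transcode to UTF-8.
# Each byte becomes its own character, so an extra "Â" (U+00C2) appears.
misread = nbsp.dup.force_encoding("ISO-8859-1").encode("UTF-8")
puts misread.codepoints.map { |c| format("U+%04X", c) }.join(" ") # prints "U+00C2 U+00A0"

# Hypothetical helper: replace the raw character with a numeric character
# reference before handing the markup to the parser, so no non-ASCII bytes
# are involved. This is an assumption, not a confirmed fix.
def escape_nbsp(html)
  html.gsub("\u00A0", "&#160;")
end

puts escape_nbsp("Hi\u00A0Hello") # prints "Hi&#160;Hello"
```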
On my local machine I'm running Nokogiri 1.6 with libxml2 2.8.0, and on Heroku I'm running Nokogiri 1.6 with libxml2 2.7.6.
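A quick way to confirm which libxml2 each environment actually uses is to print Nokogiri's version information, for example from a Rails console on each machine. This is a sketch: `libxml2_at_least?` is a hypothetical helper, and the exact keys inside `Nokogiri::VERSION_INFO["libxml"]` vary between Nokogiri releases, so the entry is printed whole.

```ruby
# Hypothetical helper: true when a libxml2 version string meets the minimum
# that Nokogiri recommends (2.7.7, per the question above).
def libxml2_at_least?(version, minimum = "2.7.7")
  Gem::Version.new(version) >= Gem::Version.new(minimum)
end

begin
  require "nokogiri"
  # VERSION_INFO is a Hash; its "libxml" entry describes the libxml2 build
  # that Nokogiri was compiled against and loaded at runtime.
  puts Nokogiri::VERSION_INFO["libxml"].inspect
rescue LoadError
  puts "nokogiri is not installed in this environment"
end

puts libxml2_at_least?("2.7.6") # Heroku's version: prints false
puts libxml2_at_least?("2.8.0") # local version: prints true
```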
Thanks.
Unfortunately, Heroku doesn't support installing additional libraries or binaries on its stacks. The best workaround is to vendor them into your project. You'll need 64-bit Linux builds for them to work on Heroku; compiling statically can also help ensure that any required dependencies are included. Similarly, for gems that depend on external libraries, we recommend compiling the gem statically and vendoring it into your project.
If you do wish to try to vendor your binary, library, or gem, you can use Heroku as your build environment. One of Heroku's engineers created a build server that allows you to upload source code, run the compilation step, and then download the resulting binary. You can find this project on GitHub under the name "Vulcan".
Here's a link with more instructions: https://devcenter.heroku.com/articles/buildpack-binaries