Nokogiri seems to have a problem with UTF-8 conversion of the nbsp (non-breaking space) character. From what I've gathered, this is a libxml2 issue: Nokogiri recommends upgrading libxml2 to 2.7.7, but Heroku is running 2.7.6.
Does anyone know how I can use libxml2 2.7.7 (or higher) on Heroku?
The problem is as follows:
doc = Nokogiri::HTML("<html><p>Hi Hello</p></html>")
doc.inner_html
=> "<html><body><p>Hi Hello</p></body></html>"
doc.inner_html = "<p>Hello World</p>"
=> "<p>Hello World</p>"
doc.inner_html
=> "<p>Hello World</p>"
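To compare what actually ends up in the string on each machine, a small pure-Ruby check along these lines might help. This is only a sketch: `nbsp_report` is a hypothetical helper of mine, not part of Nokogiri, and the "wrong conversion" below is simulated by hand on the assumption that the bug behaves like a UTF-8 to Latin-1 mix-up, which I have not confirmed.

```ruby
# Hypothetical helper (not a Nokogiri API): reports how a UTF-8 string carries nbsp.
def nbsp_report(str)
  {
    encoding:       str.encoding.name,
    valid:          str.valid_encoding?,
    has_nbsp:       str.include?("\u00A0"),       # a real U+00A0 is present
    double_encoded: str.include?("\u00C2\u00A0")  # "Â" + nbsp, the usual mojibake
  }
end

# A correctly encoded string: nbsp is the two bytes C2 A0 in UTF-8.
good = "Hi\u00A0Hello"

# Assumption: simulate a bad conversion by reading those UTF-8 bytes as
# ISO-8859-1 and transcoding the result back to UTF-8.
bad = good.dup.force_encoding("ISO-8859-1").encode("UTF-8")

p nbsp_report(good)
p nbsp_report(bad)
```

Running the same report on `doc.inner_html` locally and on Heroku should show whether the two environments really produce different bytes.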
Looks like this is related: https://github.com/sparklemotion/nokogiri/issues/306
This doesn't happen on my local machine. Rails has config.encoding set to 'utf-8',
and the rendered page has a utf-8 charset meta tag.
On my local machine I'm running Nokogiri 1.6 with libxml2 2.8.0, and on Heroku I'm running Nokogiri 1.6 with libxml2 2.7.6.
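For what it's worth, here is a minimal sketch of how the two versions compare against the recommended 2.7.7. `libxml2_ok?` and `MIN_LIBXML2` are names I made up for illustration, and the version strings are hard-coded from the numbers above rather than read from Nokogiri.

```ruby
require "rubygems" # provides Gem::Version (already loaded by default in modern Ruby)

# Minimum libxml2 version recommended by Nokogiri, per the numbers in this post.
MIN_LIBXML2 = Gem::Version.new("2.7.7")

# Hypothetical helper: true if the given libxml2 version string is new enough.
def libxml2_ok?(version_string)
  Gem::Version.new(version_string) >= MIN_LIBXML2
end

# In a real console the string would come from Nokogiri itself; as far as I
# can tell that is Nokogiri::VERSION_INFO["libxml"]["loaded"], but it is
# hard-coded here so the snippet runs without the gem.
puts "Heroku (2.7.6): #{libxml2_ok?("2.7.6")}"
puts "Local  (2.8.0): #{libxml2_ok?("2.8.0")}"
```

Gem::Version compares segment by segment, so it avoids the pitfalls of comparing version strings lexically.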
Thanks.