I am trying to decode some HTML entities, such as '&lt;' becoming '<'.
I have an old gem (html_helpers) but it seems to have been abandoned twice.
Any recommendations? I will need to use it in a model.
HTMLEntities can do it:
: jmglov@laurana; sudo gem install htmlentities
Successfully installed htmlentities-4.2.4
: jmglov@laurana; irb
irb(main):001:0> require 'htmlentities'
=> []
irb(main):002:0> HTMLEntities.new.decode "&iexcl;I&#39;m highly annoyed with character references!"
=> "¡I'm highly annoyed with character references!"
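Since the question mentions using this inside a model, here is a minimal sketch of how that might look. The `Comment` class and its `decoded_body` method are hypothetical names for illustration (a plain Ruby object, not a real ActiveRecord model), and the fallback to `CGI.unescapeHTML` when the htmlentities gem is not installed is my own assumption, not part of the answer above.

```ruby
require 'cgi'

# Use the htmlentities gem when it is installed; otherwise fall back to
# the standard library, which only knows a handful of named entities.
begin
  require 'htmlentities'
  ENTITY_DECODER = HTMLEntities.new
rescue LoadError
  ENTITY_DECODER = nil
end

# Hypothetical model-like class, for illustration only.
class Comment
  attr_reader :body

  def initialize(body)
    @body = body
  end

  # Returns the body with HTML entities decoded.
  def decoded_body
    if ENTITY_DECODER
      ENTITY_DECODER.decode(body)
    else
      CGI.unescapeHTML(body)
    end
  end
end

puts Comment.new('1 &lt; 2 &amp;&amp; 3 &gt; 2').decoded_body
```

Building the decoder once and reusing it avoids constructing a new `HTMLEntities` object on every call, which may matter if the method is called for many records.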
To encode the characters, you can use CGI.escapeHTML:
string = CGI.escapeHTML('test "escaping" <characters>')
To decode them, there is CGI.unescapeHTML:
CGI.unescapeHTML("test &quot;unescaping&quot; &lt;characters&gt;")
Of course, before that you need to require the CGI library:
require 'cgi'
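To make the CGI approach concrete, here is a small round-trip sketch using only the standard library. The variable names are mine. It also illustrates a limitation worth knowing: as far as I can tell, `CGI.unescapeHTML` decodes numeric character references but leaves named entities outside the basic set (such as `&auml;`) untouched.

```ruby
require 'cgi'

original = %q(test "escaping" <characters> & 'quotes')

# Escape the five HTML-special characters, then reverse the operation.
escaped  = CGI.escapeHTML(original)
restored = CGI.unescapeHTML(escaped)

puts escaped
puts restored == original

# A numeric character reference is decoded...
numeric = CGI.unescapeHTML('b&#228;r')
# ...but a named entity such as &auml; is returned unchanged.
named = CGI.unescapeHTML('b&auml;r')

puts numeric
puts named
```

If the input may contain named entities like `&nbsp;` or `&auml;`, one of the gem-based answers here is probably the better fit.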
And if you're in Rails, you don't need to use CGI to encode the string. There's the h method.
<%= h 'escaping <html>' %>
To output a string without escaping in Rails, use raw (note that raw does not decode entities; it only skips the automatic escaping):
<%= raw '<html>' %>
So,
<%= raw '<br>' %>
would output
<br>
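Outside a view, the same kind of escaping is available from Ruby's standard library: in plain Ruby, `ERB::Util.h` is an alias of `ERB::Util.html_escape`. The sketch below runs without Rails; the variable names are illustrative only.

```ruby
require 'erb'

# html_escape replaces &, <, >, " and ' with their entity forms.
escaped = ERB::Util.html_escape('escaping <html>')
puts escaped

# h is the short alias for the same method.
short = ERB::Util.h('<br>')
puts short
```

This only covers encoding. There is no matching decode method in `ERB::Util`, so decoding still needs `CGI.unescapeHTML` or one of the gems mentioned in this thread.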
I think the Nokogiri gem is also a good choice. It is very stable and has a large contributor community.
Samples:
a = Nokogiri::HTML.parse "foo&nbsp;b&auml;r"
a.text
=> "foo bär"
or
a = Nokogiri::HTML.parse "&iexcl;I&#39;m highly annoyed with character references!"
a.text
=> "¡I'm highly annoyed with character references!"
If you don't want to add a new dependency just to do this (like HTMLEntities) and you're already using Hpricot, it can both escape and unescape for you. It handles much more than CGI:
Hpricot.uxs "foo&nbsp;b&auml;r"
=> "foo bär"
You can use the htmlascii gem:
Htmlascii.convert string
<% str="<h1> Test </h1>" %>
result: < h1 > Test < /h1 >
<%= CGI.unescapeHTML(str).html_safe %>