How can I encode a string for HTML?

Posted 2020-03-01 04:14

I'm looking for a simple way to HTML encode a string/object in Perl. The fewer additional packages used the better.

3 answers
Rolldiameter
#2 · 2020-03-01 04:42

HTML::Entities is your friend here.

use HTML::Entities;
my $encoded = encode_entities( "foo & bar & <baz>" );
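For reference, here is a small sketch of what that call returns, assuming HTML::Entities is installed. One detail worth knowing is that encode_entities modifies its argument in place when called in void context; the variable name $in_place is chosen here purely for illustration.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Called for its return value: the argument is left untouched.
my $encoded = encode_entities("foo & bar & <baz>");
print "$encoded\n";    # foo &amp; bar &amp; &lt;baz&gt;

# Called in void context: the variable itself is rewritten.
my $in_place = "a < b";
encode_entities($in_place);
print "$in_place\n";   # a &lt; b
```

If you want to keep the original string, always assign the return value instead of calling the function on its own.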
仙女界的扛把子
#3 · 2020-03-01 04:42

When this question was first answered, HTML::Entities was the module most people probably used. Its encode_entities is pure Perl and by default escapes the HTML reserved characters ><'"& along with control characters and high-bit (non-ASCII) characters.
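A minimal sketch of that default behaviour, and of narrowing it with the optional second argument to encode_entities, might look like this. The sample text is an assumption made up for the example.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# "\x{e9}" is e-acute, written as an escape so the example
# does not depend on the encoding of the source file.
my $text = "caf\x{e9} <b>";

# Default set: markup characters plus control and high-bit characters.
my $default = encode_entities($text);                    # caf&eacute; &lt;b&gt;

# Second argument: only the listed characters are treated as unsafe,
# so the e-acute passes through unchanged.
my $markup_only = encode_entities( $text, q{<>&"'} );

binmode STDOUT, ':encoding(UTF-8)';
print "$default\n$markup_only\n";
```

Restricting the set this way is useful when the page is already served as UTF-8 and only the markup characters need escaping.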

Recently, HTML::Escape showed up. It has both XS and pure Perl implementations. If you're using the XS version, it's much faster than HTML::Entities (roughly twenty times in the run below). However, it only escapes ><'"& and has no way to change the defaults. Here's the difference with the XS version:

Benchmark: timing 10000 iterations of html_entities, html_escape...
html_entities: 14 wallclock secs (14.09 usr +  0.01 sys = 14.10 CPU) @ 709.22/s (n=10000)
html_escape:  1 wallclock secs ( 0.68 usr +  0.00 sys =  0.68 CPU) @ 14705.88/s (n=10000)

And here's the fair fight with pure Perl versions on each side:

Benchmark: timing 10000 iterations of html_entities, html_escape...
html_entities: 14 wallclock secs (13.79 usr +  0.01 sys = 13.80 CPU) @ 724.64/s (n=10000)
html_escape:  7 wallclock secs ( 7.57 usr +  0.01 sys =  7.58 CPU) @ 1319.26/s (n=10000)

You can get these benchmarks in Surveyor::Benchmark::HTMLEntities. I explain how I distribute benchmarks using Surveyor::App.
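If you just want to try a comparison like this locally, a rough sketch using the core Benchmark module's timethese (which prints output in the same format as above) could look like the following. This is not the Surveyor benchmark itself: the sample input, the iteration count, and the plain_regex baseline are all assumptions added for illustration, and each module is only benchmarked if it is installed.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Hypothetical sample input; results depend heavily on the text being escaped.
my $html = q{<p class="note">Tom & Jerry's "show"</p>} x 50;

# Plain-Perl baseline (an assumption for comparison, not taken from either module).
my %entity = (
    '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
    '"' => '&quot;', "'" => '&#39;',
);
my %candidates = (
    plain_regex => sub {
        ( my $out = $html ) =~ s/([&<>"'])/$entity{$1}/g;
        return $out;
    },
);

# Add each module only if it loads, so the script still runs without them.
# The results are assigned to a variable because encode_entities would
# otherwise modify $html in place when called in void context.
if ( eval { require HTML::Entities; 1 } ) {
    $candidates{html_entities} = sub {
        my $out = HTML::Entities::encode_entities($html);
        return $out;
    };
}
if ( eval { require HTML::Escape; 1 } ) {
    $candidates{html_escape} = sub {
        my $out = HTML::Escape::escape_html($html);
        return $out;
    };
}

# timethese prints one timing line per candidate and returns the raw results.
my $results = timethese( 1_000, \%candidates );
```

Raise the iteration count if Benchmark warns that there were too few iterations for a reliable result.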

仙女界的扛把子
#4 · 2020-03-01 04:50

Which do you need to encode, a string or an object? If it's just a string, then you should just have to worry about encoding issues such as UTF-8, and CGI::escape will probably do the trick for you. If it's an object, you'll need to serialize it first, which opens up a whole new set of issues, but you might want to consider JSON-encoding it.

PS. Since I can't find any recent documentation on CGI::escape (it's actually imported from CGI::Util and is marked as "internal"), you should probably use escapeHTML() instead, as daxim points out in his comment: http://search.cpan.org/perldoc?CGI#AUTOESCAPING_HTML. Note too that CGI::escape URL-encodes (percent-escapes) its input rather than producing HTML entities.
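As a hedged sketch of both cases, assuming CGI.pm is installed (it is no longer bundled with recent Perl releases) and using the core JSON::PP module for serialization, something like this should work. The $record hash is a hypothetical stand-in for "an object".

```perl
use strict;
use warnings;
use CGI qw(escapeHTML);
use JSON::PP ();

# A plain string: escape it directly.
my $string_html = escapeHTML('<b>Tom & Jerry</b>');
print "$string_html\n";    # &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;

# An object (here a hypothetical hash ref): serialize to JSON first, then escape.
my $record = { name => 'Tom & Jerry', tag => '<b>' };
my $json   = JSON::PP->new->canonical->encode($record);    # canonical = sorted keys
my $object_html = escapeHTML($json);
print "$object_html\n";
# {&quot;name&quot;:&quot;Tom &amp; Jerry&quot;,&quot;tag&quot;:&quot;&lt;b&gt;&quot;}
```

Sorting the keys with canonical is optional; it is used here only so the output is the same on every run.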
