Is anyone familiar with the Sanitize RubyGem who could provide an example of building a "Transformer" to convert
"<ul><li>a</li><li>b</li><li>c</li></ul>"
into
"a,b, and c"
?
IMO transformers are not for pulling out data like this: they exist to filter and alter nodes while the document is being sanitized.
That is not what you're trying to do; you're trying to pull data out of nodes and transform it. In your example, you're not doing the same thing to each element: sometimes you append a comma, and sometimes a comma and the word "and".
In order to do that, you either need to save state and post-process, or look ahead in the node stream to see if you're visiting the last node. I don't know of a trivial way to do that with Sanitize's transformers, so this example saves state and post-processes.
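A minimal sketch of the save-state-and-post-process idea, written so it runs without the gem installed. `FakeTextNode` and `to_sentence` are hypothetical names introduced only for illustration; the stand-in mimics the `text?` and `content` methods of the Nokogiri nodes a transformer receives in `env[:node]`.

```ruby
# Hypothetical stand-in for a Nokogiri text node, so the sketch runs on its own.
FakeTextNode = Struct.new(:content) do
  def text?
    true
  end
end

li_text = []

# Transformer-shaped lambda: it receives an env hash whose :node is the node
# being visited. It only records text as a side effect and changes nothing.
save_li = lambda do |env|
  node = env[:node]
  li_text << node.content if node.text?
  nil
end

# Post-processing step: the "and" can only be placed once all items are known.
def to_sentence(items)
  case items.length
  when 0 then ""
  when 1 then items.first
  when 2 then items.join(" and ")
  else "#{items[0..-2].join(', ')}, and #{items.last}"
  end
end

# Simulate the three text nodes of <ul><li>a</li><li>b</li><li>c</li></ul>.
%w[a b c].each { |t| save_li.call({ node: FakeTextNode.new(t) }) }

puts to_sentence(li_text) # prints "a, b, and c"
```

With the gem installed, the same lambda would be handed over through the `:transformers` option, for example `Sanitize.fragment(html, :transformers => save_li)` (`Sanitize.clean` in older releases), rather than being called by hand.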
IMO this example is an abuse of transformers because it's run only for its side effect: it does nothing other than look for text nodes.
If one of the list items has embedded HTML, the naive approach no longer works, and you need to start knowing more Nokogiri anyway:
This approach relies on the default Sanitize behavior of nothing being whitelisted. The `<b>` tags are still visited by the `save_li` lambda, but they're stripped. This has the potential to cause issues under a variety of circumstances.
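To make the embedded-HTML problem concrete, here is a rough sketch of one alternative that does not depend on which tags get stripped: collect once per `<li>` element instead of once per text node. `FakeNode`, `text_node`, `element`, and `walk` are hypothetical stand-ins so the code runs without Nokogiri; they only imitate `name`, `text?`, `children`, and `content` (which, on a real Nokogiri element, returns all descendant text concatenated).

```ruby
# Hypothetical stand-in for a Nokogiri node, just enough to run the sketch.
FakeNode = Struct.new(:name, :value, :children) do
  def text?
    name == "text"
  end

  # Like Nokogiri's Node#content: all descendant text, concatenated.
  def content
    text? ? value : children.map(&:content).join
  end
end

def text_node(str)
  FakeNode.new("text", str, [])
end

def element(name, *children)
  FakeNode.new(name, nil, children)
end

# <ul><li>a</li><li>b <b>bold</b></li><li>c</li></ul>
list = element("ul",
               element("li", text_node("a")),
               element("li", text_node("b "), element("b", text_node("bold"))),
               element("li", text_node("c")))

# Call the visitor once per node, the way a transformer is invoked.
def walk(node, &visitor)
  visitor.call({ node: node })
  node.children.each { |child| walk(child, &visitor) }
end

# Naive approach: one entry per text node, so the <b> text is split off.
per_text_node = []
walk(list) { |env| per_text_node << env[:node].content if env[:node].text? }

# Alternative: one entry per <li>, using the element's full text.
per_li = []
walk(list) { |env| per_li << env[:node].content if env[:node].name == "li" }

p per_text_node # => ["a", "b ", "bold", "c"]
p per_li        # => ["a", "b bold", "c"]
```

This is only an illustration of the difference between the two collection strategies; a real transformer would run the same checks against the nodes Sanitize passes in.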