Is there a library or pre-written code to remove CSS attributes from HTML?
The requirement is that the Java code parses the input HTML document, removes the CSS attributes (class and style), and produces the output HTML document.
For example, if the input HTML document has this element,
<p class="abc" style="xyz" > some text </p>
the output should be
<p > some text </p>
Use jsoup with a NodeTraversor to remove the class and style attributes from every element.
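A minimal sketch of the jsoup approach, assuming jsoup 1.11.1 or later is on the classpath (that is where the static `NodeTraversor.traverse` method is available). The class name `CssAttributeStripper` and the method `stripCssAttributes` are made up for illustration and are not part of jsoup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.select.NodeTraversor;
import org.jsoup.select.NodeVisitor;

// Hypothetical helper class; only the org.jsoup types come from the library.
class CssAttributeStripper {

    // Parses the HTML, drops class/style from every element, returns the document as a string.
    static String stripCssAttributes(String html) {
        Document doc = Jsoup.parse(html);

        NodeTraversor.traverse(new NodeVisitor() {
            @Override
            public void head(Node node, int depth) {
                // Only elements carry attributes; text nodes and comments are skipped.
                if (node instanceof Element) {
                    node.removeAttr("class");
                    node.removeAttr("style");
                }
            }

            @Override
            public void tail(Node node, int depth) {
                // Nothing to do when leaving a node.
            }
        }, doc);

        return doc.outerHtml();
    }

    public static void main(String[] args) {
        System.out.println(stripCssAttributes("<p class=\"abc\" style=\"xyz\" > some text </p>"));
    }
}
```

Note that `Jsoup.parse` wraps a fragment in `html`, `head` and `body` elements and normalises whitespace, so the output will not match the input byte for byte. If you only want the fragment back, `doc.body().html()` may be a better return value.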
You could use Cyberneko to parse the document and add a simple filter that looks something like this:
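A rough sketch of what such a filter might look like, assuming NekoHTML (the CyberNeko HTML parser) and Xerces are on the classpath. The class names `CssAttributeFilter` and `NekoCssStripper` are hypothetical; the filter extends NekoHTML's `DefaultFilter` and the result is serialised with the `Writer` filter that ships with NekoHTML.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;

import org.apache.xerces.xni.Augmentations;
import org.apache.xerces.xni.QName;
import org.apache.xerces.xni.XMLAttributes;
import org.apache.xerces.xni.XNIException;
import org.apache.xerces.xni.parser.XMLDocumentFilter;
import org.apache.xerces.xni.parser.XMLInputSource;
import org.apache.xerces.xni.parser.XMLParserConfiguration;
import org.cyberneko.html.HTMLConfiguration;
import org.cyberneko.html.filters.DefaultFilter;

// Hypothetical filter: strips class and style attributes as elements stream past.
class CssAttributeFilter extends DefaultFilter {

    @Override
    public void startElement(QName element, XMLAttributes attributes, Augmentations augs)
            throws XNIException {
        removeCssAttributes(attributes);
        super.startElement(element, attributes, augs);
    }

    @Override
    public void emptyElement(QName element, XMLAttributes attributes, Augmentations augs)
            throws XNIException {
        removeCssAttributes(attributes);
        super.emptyElement(element, attributes, augs);
    }

    private static void removeCssAttributes(XMLAttributes attributes) {
        // Walk backwards so removing an entry does not shift the indices still to visit.
        for (int i = attributes.getLength() - 1; i >= 0; i--) {
            String name = attributes.getQName(i);
            if ("class".equalsIgnoreCase(name) || "style".equalsIgnoreCase(name)) {
                attributes.removeAttributeAt(i);
            }
        }
    }
}

// Hypothetical driver that wires the filter into a NekoHTML parser configuration.
class NekoCssStripper {

    static String strip(String html) throws IOException {
        StringWriter out = new StringWriter();
        XMLDocumentFilter[] filters = {
            new CssAttributeFilter(),
            new org.cyberneko.html.filters.Writer(out, "UTF-8")
        };
        XMLParserConfiguration parser = new HTMLConfiguration();
        parser.setProperty("http://cyberneko.org/html/properties/filters", filters);
        parser.parse(new XMLInputSource(null, null, null, new StringReader(html), null));
        return out.toString();
    }
}
```

Calling `NekoCssStripper.strip(html)` should return the serialised document without the two attributes. NekoHTML balances tags and may change element-name case, so expect the output to be a full document rather than the bare fragment.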