Java - Removing tag from xml

Posted 2020-05-10 08:09

Question:

I have an xml, as follows:

<Row ss:Index="76" ss:AutoFitHeight="0" ss:Height="25">
   <Cell ss:Index="1" ss:MergeAcross="9" ss:StyleID="s38">
      <ss:Data ss:Type="String" xmlns="http://www.w3.org/TR/REC-html40">
          <Font html:Size="15" html:Face="Times New Roman" x:Family="Roman" html:Color="#000000">
            <B> ABCD </B>
          </Font>
       </ss:Data>
   </Cell>
</Row>

Now I want to remove the <B> tag but keep its content ("ABCD" here). Is there a way to remove every <B> tag from the whole XML file using Java? Please help. Thanks.

Answer 1:

  1. Parse the document with dom4j or a SAX parser

  2. Get the value of the Font tag

<Font html:Size="15" html:Face="Times New Roman" x:Family="Roman" html:Color="#000000"> <B> ABCD </B> </Font>

  3. Remove all HTML tags from the value

The Jsoup way:

Jsoup.parse(html).text();

The String.replaceAll way:

replaceAll("</?B>", "")

  4. Set the new value on the XML Font tag
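
The four steps above can also be done in one pass on the parsed tree, without treating the markup as a string. Below is a minimal sketch that uses the JDK's built-in DOM parser instead of dom4j or Jsoup (a swapped-in technique, not what the answer names). The class name UnwrapTag and the method unwrap are hypothetical names chosen for illustration. It assumes the tag to remove is not the root element, and it relies on the parser's default non-namespace-aware mode, so prefixed names are treated as plain names; it has not been run against the full spreadsheet document.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: removes a tag everywhere but keeps its content.
class UnwrapTag {

    static String unwrap(String xml, String tagName) throws Exception {
        // Step 1: parse the XML into a DOM tree (not namespace-aware by default).
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Steps 2-4: find each matching element and replace it with its children.
        // The NodeList is live, so it shrinks as elements are removed;
        // always take item(0) until nothing is left.
        NodeList matches = doc.getElementsByTagName(tagName);
        while (matches.getLength() > 0) {
            Node tag = matches.item(0);
            Node parent = tag.getParentNode(); // assumed to be an element, not the document
            while (tag.getFirstChild() != null) {
                // insertBefore moves the child out of the tag and into the parent.
                parent.insertBefore(tag.getFirstChild(), tag);
            }
            parent.removeChild(tag);
        }

        // Serialize the modified tree back to a string, without the XML declaration.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

Calling UnwrapTag.unwrap(xml, "B") removes every <B> element in the input while leaving text such as " ABCD " in place under the parent. Unlike the replaceAll approach, this only touches real elements, so a literal "<B>" inside a CDATA section or comment is left alone.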


Tags: java xml xpath