Removing empty tags from XML via XSLT

Posted 2019-01-23 16:35

I have an XML document with the following pattern:

<?xml version="1.0" encoding="UTF-8"?>
<Person>
    <FirstName>Ahmed</FirstName>
    <MiddleName/>
    <LastName>Aboulnaga</LastName>
    <CompanyInfo>
        <CompanyName>IPN Web</CompanyName>
        <Title/>
        <Role></Role>
        <Department>
        </Department>
    </CompanyInfo>
</Person>

I used the following XSLT (found on a forum) in an attempt to remove the empty tags:

<xsl:template match="@*|node()">
  <xsl:if test=". != '' or ./@* != ''">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:if>
</xsl:template>

This XSLT successfully removes tags like

<Title/>
    <Role></Role>

...but fails when an empty tag spans two lines, e.g.:

<Department>
    </Department>

Is there any fix for this?

Tags: xslt
6 answers
疯言疯语
#2 · 2019-01-23 16:52
<xsl:template match="@*|node()">
  <xsl:if test="normalize-space(.) != '' or ./@* != ''">
    <xsl:copy>
       <xsl:copy-of select = "@*"/>
       <xsl:apply-templates/>
    </xsl:copy>
  </xsl:if>
</xsl:template>
#3 · 2019-01-23 16:52

You can use the following XSLT to remove empty tags and attributes:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

    <xsl:template match="node()">
        <xsl:if test="normalize-space(string(.)) != ''
                        or count(@*[normalize-space(string(.)) != '']) > 0
                        or count(descendant::*[normalize-space(string(.)) != '']) > 0
                        or count(descendant::*/@*[normalize-space(string(.)) != '']) > 0">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()" />
        </xsl:copy>
        </xsl:if>
    </xsl:template>

    <xsl:template match="@*">
        <xsl:if test="normalize-space(string(.)) != ''">
            <!-- xsl:copy on an attribute node copies its name and value -->
            <xsl:copy/>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>
Viruses.
#4 · 2019-01-23 16:54

Your question is underspecified. What does empty mean? Is <outer> empty here?

<outer><inner/></outer>

Anyway, here's one approach that might fit the bill:

<xsl:template match="*[not(.//@*) and not( normalize-space() )]" priority="3"/>

Note you might have to tweak the priority to fit your needs.
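
To make that definition of "empty" concrete, here is a rough sketch in Python using the standard xml.etree.ElementTree module (an illustration only, not part of the answer). The is_empty helper is hypothetical; it mirrors the match pattern above: no attributes on the element or any of its descendants, and no non-whitespace text anywhere inside it.

```python
import xml.etree.ElementTree as ET

XML_WS = " \t\r\n"  # the four characters XPath treats as whitespace


def is_empty(elem):
    """Hypothetical helper mirroring *[not(.//@*) and not(normalize-space())]."""
    # .//@* : any attribute on the element itself or on a descendant
    has_attrs = any(e.attrib for e in elem.iter())
    # normalize-space() : the concatenated text is non-blank after trimming
    has_text = "".join(elem.itertext()).strip(XML_WS) != ""
    return not has_attrs and not has_text


print(is_empty(ET.fromstring("<outer><inner/></outer>")))
print(is_empty(ET.fromstring('<outer><inner id="1"/></outer>')))
print(is_empty(ET.fromstring("<outer><inner>x</inner></outer>")))
```

Under this definition, <outer> from the example above counts as empty, because an empty child contributes neither text nor attributes.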

老娘就宠你
#5 · 2019-01-23 16:55

This transformation doesn't need any conditional XSLT instructions at all and uses no explicit priorities:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match=
    "*[not(@*|*|comment()|processing-instruction()) 
     and normalize-space()=''
      ]"/>
</xsl:stylesheet>

When applied on the provided XML document:

<Person>
    <FirstName>Ahmed</FirstName>
    <MiddleName/>
    <LastName>Aboulnaga</LastName>
    <CompanyInfo>
        <CompanyName>IPN Web</CompanyName>
        <Title/>
        <Role></Role>
        <Department>
        </Department>
    </CompanyInfo>
</Person>

it produces the wanted, correct result:

<Person>
   <FirstName>Ahmed</FirstName>
   <LastName>Aboulnaga</LastName>
   <CompanyInfo>
      <CompanyName>IPN Web</CompanyName>
   </CompanyInfo>
</Person>
乱世女痞
#6 · 2019-01-23 16:56

(..) Is there any fix for this?

The tag that spans two lines is not an empty tag. It contains whitespace (a line break and possibly other whitespace characters). The XPath 1.0 function normalize-space() strips leading and trailing whitespace and collapses internal runs of whitespace into a single space, so whitespace-only content becomes the empty string.

Once you have applied the function to the tag content, you can check for the empty string. A good way to do this is to pass the result to the XPath 1.0 boolean() function: for a zero-length string it returns false.
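
The same distinction can be seen outside XSLT. Below is a minimal sketch using Python's standard xml.etree.ElementTree module, chosen only for illustration. The is_blank helper is hypothetical; trimming the four XML whitespace characters gives the same empty/non-empty verdict that boolean(normalize-space()) would.

```python
import xml.etree.ElementTree as ET

XML_WS = " \t\r\n"  # the four characters XPath treats as whitespace


def is_blank(text):
    """Hypothetical helper: True if text is empty after trimming XML whitespace."""
    return (text or "").strip(XML_WS) == ""


doc = ET.fromstring(
    "<CompanyInfo>"
    "<CompanyName>IPN Web</CompanyName>"
    "<Role></Role>"
    "<Department>\n    </Department>"
    "</CompanyInfo>"
)

for child in doc:
    raw = child.text or ""
    # raw != "" plays the role of the test . != '' from the question;
    # not is_blank(raw) plays the role of boolean(normalize-space()).
    print(child.tag, repr(raw), raw != "", not is_blank(raw))
```

Department passes the plain comparison because its text is a line break plus indentation, but it is blank once the whitespace is trimmed.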

Finally, you can build all of this into a slightly modified identity transform. You do not need xsl:if instructions or any additional template.

The final transform will look like this:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
             <xsl:apply-templates 
                  select="node()[boolean(normalize-space())]
                         |@*"/>
     </xsl:copy>
 </xsl:template>

</xsl:stylesheet>

Additional note

Your xsl:if instruction also checks for empty attributes. As a result, you are also removing tags that have no content and only empty attributes, which is more than just "Removing empty tags". So be careful: either your question is missing some detail, or you are using unsafe code.
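
As a rough illustration of that pitfall, the sketch below simulates the question's test in Python with the standard xml.etree.ElementTree module. The function name kept_by_question_test is hypothetical, and the simulation assumes XPath 1.0 semantics, where comparing a node-set with a string is true if any node in the set satisfies the comparison.

```python
import xml.etree.ElementTree as ET


def kept_by_question_test(elem):
    """Hypothetical simulation of the XPath 1.0 test  . != '' or ./@* != ''."""
    string_value = "".join(elem.itertext())          # value of "."
    any_nonempty_attr = any(v != "" for v in elem.attrib.values())  # ./@* != ''
    return string_value != "" or any_nonempty_attr


for snippet in ('<Title lang=""/>', '<Title lang="en"/>', '<Title lang="">CTO</Title>'):
    print(snippet, kept_by_question_test(ET.fromstring(snippet)))
```

An element whose only attribute is empty is dropped along with the truly empty elements, while the same element with a non-empty attribute value is kept.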

This account has been banned
#7 · 2019-01-23 17:06

From what I have found on the net, this is the most correct answer:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml"/>
    <xsl:template match="/">
        <xsl:apply-templates select="*"/>
    </xsl:template>
    <xsl:template match="*">
            <xsl:if test="normalize-space(.) != ''">
                <xsl:copy>
                  <xsl:copy-of select="@*"/>
                  <xsl:apply-templates/>
                </xsl:copy>
            </xsl:if>
    </xsl:template>
</xsl:stylesheet>