How to extract the child element from nested tags

Posted on 2019-03-06 13:53

Question:

I have the following XML:

    <body>
        <div><i>italic</i>
            <div id ="88">
                <div id="4545">
                    <h3>hey h3</h3>
                    <xyz>XYZ</xyz>
                </div>
            </div>
        </div>
        <div  id="123">
            <h1>Example</h1>
                <div  id="1234">
                    <h1>heading 1</h1>
                    <p>computer</p>
                    <div>
                        <i>italic 2</i>
                        <div>
                            <h3>heading 3</h3>
                        </div>
                    </div>    
                </div>
            <div  id="12345">
                <h1>heading 1</h1>
            </div>
        </div>
    </body>

I need to apply two rules: every div is converted to a section, and for the div whose h1 value is Example, that h1 tag is deleted and the attribute class=<value of that h1> is added to the section tag.

Expected output:

    <body>
        <section>
            <i>italic</i>
        </section>
        <section class="hey h3">

             <xyz>XYZ</xyz>
        </section>
        <section class="example">
            <title>heading 1</title>
            <p>computer</p>
        </section>
        <section>
            <i>italic 2</i>
        </section>
        <section>
        <h3>heading 3</h3>
        </section>
        <section >
            <title>heading 1</title>
        </section>
    </body>

My XSLT:

    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="div[h1='Example']">
        <xsl:apply-templates select="node()[not(self::h1)]"/>
    </xsl:template>

    <xsl:template match="div/h1">
        <title>
            <xsl:apply-templates/>
        </title>
    </xsl:template>

    <xsl:template match="div[h3='hey h3']">
        <xsl:apply-templates select="node()[not(self::h3)]"/>
    </xsl:template>

    <xsl:template match="div/h3">
        <title>
            <xsl:apply-templates/>
        </title>
    </xsl:template>

    <xsl:template match="div[not(h1='Example')]">
        <section>
            <xsl:if test="preceding-sibling::*[1][self::h1[.='Example']]">
                <xsl:attribute name="class">example</xsl:attribute>
            </xsl:if>
            <xsl:apply-templates select="node()[not(self::div)]"/>
        </section>
        <xsl:apply-templates select="node()[self::div]"/>
    </xsl:template>

    <xsl:template match="div[not(h3='hey h3')]">
        <section>
            <xsl:if test="preceding-sibling::*[1][self::h3[.='hey h3']]">
                <xsl:attribute name="class">richi rich</xsl:attribute>
            </xsl:if>
            <xsl:apply-templates select="node()[not(self::div)]"/>
        </section>
        <xsl:apply-templates select="node()[self::div]"/>
    </xsl:template>



Actual output:

    <body>
        <section>
            <i>italic</i>
        </section>
        <section/>
        <section>
            <title>hey h3</title>
            <xyz>XYZ</xyz>
        </section>
        <section>
            <title>Example</title>
        </section>
        <section>
            <title>heading 1</title>
            <p>computer</p>
        </section>
        <section>
            <i>italic 2</i>
        </section>
        <section>
            <title>heading 3</title>
        </section>
        <section>
            <title>heading 1</title>
        </section>
    </body>

Actually, there are two requirements: 1. nested sections should not be present, and 2. for the div whose h1 has the value "heading 1", that tag should be deleted from the div and an attribute with the tag's value added to the div.

Could you please suggest what I should do to get the expected output?

Answer 1:

Start with an identity template:

<xsl:template match="node()|@*">
    <xsl:copy>
        <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
</xsl:template>
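
As an aside, if an XSLT 3.0 processor is available (an assumption; the stylesheet in this answer declares version 2.0), the identity template can be replaced by a single declaration. The following is a minimal sketch of that variant, not part of the original answer:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="3.0">

    <!-- XSLT 3.0 only: shallow-copy any node that no other template matches,
         which gives the same behaviour as the identity template above -->
    <xsl:mode on-no-match="shallow-copy"/>

    <!-- the div and h1 templates from this answer would go here unchanged -->

</xsl:stylesheet>
```

On an XSLT 1.0 or 2.0 processor, keep the explicit identity template instead.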

Next, match the div node that contains the target h1 node:

<xsl:template match="div[h1='Example']">
    <!-- apply child nodes except h1 -->
    <xsl:apply-templates select="node()[not(self::h1)]"/>
</xsl:template>

Then add a template for div nodes that do not contain the target h1 node:

<xsl:template match="div[not(h1='Example')]">
    <section>
        <!-- set the attribute if the immediate preceding-sibling node is h1 -->
        <xsl:if test="preceding-sibling::*[1][self::h1[.='Example']]">
            <xsl:attribute name="class">example</xsl:attribute>
        </xsl:if>
        <xsl:apply-templates select="node()[not(self::div)]"/>
    </section>
    <xsl:apply-templates select="node()[self::div]"/>
</xsl:template>
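
The question asks for class=<value of that h1>, while the template above hard-codes the string example. A possible variant, assuming XSLT 2.0 (for lower-case() and the select attribute on xsl:attribute), reads the value from the h1 itself. The variable name heading is only illustrative, and this template is meant to replace the one above, not to sit beside it:

```xml
<xsl:template match="div[not(h1='Example')]">
    <!-- the element immediately before this div, kept only if it is an h1 -->
    <xsl:variable name="heading" select="preceding-sibling::*[1][self::h1]"/>
    <section>
        <xsl:if test="$heading = 'Example'">
            <!-- take the class from the h1 text instead of a literal -->
            <xsl:attribute name="class" select="lower-case($heading)"/>
        </xsl:if>
        <xsl:apply-templates select="node()[not(self::div)]"/>
    </section>
    <!-- nested divs are emitted after the section, so sections never nest -->
    <xsl:apply-templates select="div"/>
</xsl:template>
```

For the sample input, the div with id 1234 still receives class="example", because lower-case('Example') is example. Dropping the lower-case() call would give class="Example" instead.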

Finally, add a template for the h1 node:

<xsl:template match="div/h1">
    <title>
        <xsl:apply-templates/>
    </title>
</xsl:template>

The whole stylesheet is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">

    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="div[h1='Example']">
        <xsl:apply-templates select="node()[not(self::h1)]"/>
    </xsl:template>

    <xsl:template match="div/h1">
        <title>
            <xsl:apply-templates/>
        </title>
    </xsl:template>

    <xsl:template match="div[not(h1='Example')]">
        <section>
            <xsl:if test="preceding-sibling::*[1][self::h1[.='Example']]">
                <xsl:attribute name="class">example</xsl:attribute>
            </xsl:if>
            <xsl:apply-templates select="node()[not(self::div)]"/>
        </section>
        <xsl:apply-templates select="node()[self::div]"/>
    </xsl:template>

</xsl:stylesheet>

See it in action (https://xsltfiddle.liberty-development.net/pPzifpb/3)
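
The actual output in the question contains an empty <section/>, which comes from the div with id 88: it has no content of its own, only a nested div. If such wrapper divs should produce no section at all (an assumption about the intended result, based on the expected output), one extra template along these lines could be added to the stylesheet above. It is a sketch, not part of the original answer:

```xml
<!-- A div with no element children other than div and no non-whitespace
     text is treated as a pure wrapper: emit no section for it and just
     process the nested divs. priority="1" makes it win over the
     div[not(h1='Example')] template, whose default priority is 0.5. -->
<xsl:template match="div[not(*[not(self::div)])][not(text()[normalize-space()])]"
              priority="1">
    <xsl:apply-templates select="div"/>
</xsl:template>
```

One limitation to be aware of: if such a wrapper div directly follows the h1 with the value Example, the class attribute is not passed on to the divs inside it.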