Removing extra blank lines with XSLT, without usin

Posted 2019-09-10 14:59

I'm transforming an XML document for InDesign import, and the transformation leaves a number of blank lines in the output file. These blank lines translate into empty paragraphs in InDesign, so they must be removed. I've tried using <xsl:strip-space elements="*"/>, but that removes ALL line breaks (resulting in a single paragraph in InDesign). Setting indent="yes" on xsl:output fixes this problem, but the indents carry over during InDesign import and also screw up certain mixed-content nodes that should stay on a single line.

To illustrate, the output file should have only the empty lines removed, with no indents:

<wrapper>
<h1>Header</h1>
<para>text <italic>text</italic> text</para>
<blockquote>
<p>here is a block quote please don't indent me</p>
</blockquote>
<para>more text more text yay</para>
</wrapper>

Any ideas? I've played a little with normalize-space and translate commands to replace line breaks (&#xA;) but I haven't had any luck. I'm a total XSLT amateur so I appreciate any help.
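
One possible direction, sketched against the small <wrapper> example above rather than the real document: instead of trying to rewrite the line breaks with translate, drop every whitespace-only text node (normalize-space() returns an empty string for those) and then write exactly one &#xA; where a line break is wanted. The element names come from the example; which elements count as "containers" and which as "paragraph-level" is an assumption for illustration, not a tested InDesign recipe.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" encoding="UTF-8" indent="no"/>

    <!-- Identity transform: copy every node and attribute unchanged -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Drop text nodes that hold nothing but whitespace;
         normalize-space() yields '' for them, which tests as false -->
    <xsl:template match="text()[not(normalize-space())]"/>

    <!-- Containers (assumed list): one line break after the start tag
         and one after the end tag -->
    <xsl:template match="wrapper|blockquote">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:text>&#xA;</xsl:text>
            <xsl:apply-templates select="node()"/>
        </xsl:copy>
        <xsl:text>&#xA;</xsl:text>
    </xsl:template>

    <!-- Paragraph-level elements (assumed list): content stays on one
         line, followed by a single line break -->
    <xsl:template match="h1|para|blockquote/p">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
        <xsl:text>&#xA;</xsl:text>
    </xsl:template>

</xsl:stylesheet>
```

Because every line break in the result is written explicitly, blank lines cannot appear and nothing is indented. One caveat: the whitespace-only template also removes a lone space sitting between two inline elements, so mixed content may need a narrower match.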

EDIT: I've added additional details and sample code. Please be kind; I am an amateur at best!

Below is a sample XML file before transformation.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE book-part-wrapper PUBLIC "-//NLM//DTD BITS Book Interchange DTD v1.0 20131225//EN" "http://jats.nlm.nih.gov/extensions/bits/1.0/BITS-book1.dtd">
<book-part-wrapper dtd-version="1.0" xml:lang="en" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<book-meta>
</book-meta>
<book-part book-part-type="chapter" seq="10">
<book-part-meta>
<title-group>
<title content-type="bookpart-title">Test Title</title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Smith</surname>
<given-names>John</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Jones</surname>
<given-names>Jane</given-names>
</name>
</contrib>
</contrib-group>
</book-part-meta>
<body>
<sec>
<title content-type="head-a">Header 1</title>
<p content-type="p-first"> Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed felis sem, suscipit at sodales eget, faucibus dignissim tellus. Curabitur dictum pulvinar lectus, sit amet ornare ex ultricies vel.<xref ref-type="fn" rid="fn1">1</xref> Suspendisse turpis sem, blandit ut elit eu, pharetra vehicula sem. Suspendisse tincidunt enim at magna auctor lobortis. Aenean egestas ligula purus, non vulputate ipsum porttitor sed. Quisque a maximus magna, eget pellentesque odio. Vivamus porttitor massa ut posuere euismod. Donec vehicula mi non libero dapibus semper. Fusce nec felis vel nulla auctor volutpat. Fusce pellentesque pellentesque nunc ac blandit. Fusce sed erat feugiat massa blandit vehicula.</p> 
</sec>
</body>
<back>
<fn-group>
<title content-type="head-notes">Notes</title>
<fn id="fn1"><label>1</label><p>Jeff White, &#8220;<italic>Book Title</italic>.&#8221; N.D. Accessed January 27, 2013</p></fn>
</fn-group>
</back>
</book-part>
</book-part-wrapper>

Below is my XSLT:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"    
    xmlns:aid="http://ns.adobe.com/AdobeInDesign/4.0/" 
    xmlns:aid5="http://ns.adobe.com/AdobeInDesign/5.0/" 
    xmlns:math="http://www.w3.org/2005/xpath-functions/math" 
    xmlns:mml="http://www.w3.org/1998/Math/MathML" 
    xmlns:xlink="http://www.w3.org/1999/xlink" 
    xmlns:xi="http://www.w3.org/2001/XInclude"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output encoding="UTF-8" indent="no" method="xml" omit-xml-declaration="no" />        


    <!-- Identity transform -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:apply-templates select="book-part"/>
        </xsl:copy>
    </xsl:template>                       

    <!-- Skip these nodes -->
    <xsl:template match="book-meta|book-part-id"/>    

    <xsl:template match="book-part">
        <xsl:copy>                        
            <xsl:apply-templates select="@*"/>            
            <xsl:apply-templates select=".//book-part-id[@book-part-id-type='doi']"/>
            <xsl:apply-templates select=".//title[@content-type='bookpart-title']"/>            
            <body>                
                <xsl:apply-templates select=".//title[@content-type='bookpart-title']"/>                
                <xsl:apply-templates select=".//body"/>
                <xsl:apply-templates select=".//fn-group"/>
            </body>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="body|label|p|title|fn|italic">                
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:apply-templates select="node()"/>        
        </xsl:copy>
    </xsl:template>

    <!-- Apply all child nodes; don't copy the element itself -->
    <xsl:template match="book-part-meta|title-group|disp-quote|sec|contrib-group|fn-group">
        <xsl:apply-templates/>
    </xsl:template>

</xsl:stylesheet>

Here is my output:

<?xml version="1.0" encoding="UTF-8"?><book-part-wrapper xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xi="http://www.w3.org/2001/XInclude" dtd-version="1.0" xml:lang="en"><book-part book-part-type="chapter" seq="10"><title content-type="bookpart-title">Test Title</title><body xmlns:aid="http://ns.adobe.com/AdobeInDesign/4.0/" xmlns:aid5="http://ns.adobe.com/AdobeInDesign/5.0/" xmlns:math="http://www.w3.org/2005/xpath-functions/math" xmlns:xs="http://www.w3.org/2001/XMLSchema"><title content-type="bookpart-title">Test Title</title><body>

<title content-type="head-a">Header 1</title>
<p content-type="p-first"> Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed felis sem, suscipit at sodales eget, faucibus dignissim tellus. Curabitur dictum pulvinar lectus, sit amet ornare ex ultricies vel.<xref ref-type="fn" rid="fn1"/> Suspendisse turpis sem, blandit ut elit eu, pharetra vehicula sem. Suspendisse tincidunt enim at magna auctor lobortis. Aenean egestas ligula purus, non vulputate ipsum porttitor sed. Quisque a maximus magna, eget pellentesque odio. Vivamus porttitor massa ut posuere euismod. Donec vehicula mi non libero dapibus semper. Fusce nec felis vel nulla auctor volutpat. Fusce pellentesque pellentesque nunc ac blandit. Fusce sed erat feugiat massa blandit vehicula.</p> 

</body>
<title content-type="head-notes">Notes</title>
<fn id="fn1"><label>1</label><p>Jeff White, <italic>Book Title</italic>. N.D. Accessed January 27, 2013</p></fn>
</body></book-part></book-part-wrapper>

If I add <xsl:strip-space elements="*"/> to the XSLT file, my output is returned all on a single line:

<?xml version="1.0" encoding="UTF-8"?><book-part-wrapper xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xi="http://www.w3.org/2001/XInclude" dtd-version="1.0" xml:lang="en"><book-part book-part-type="chapter" seq="10"><title content-type="bookpart-title">Test Title</title><body xmlns:aid="http://ns.adobe.com/AdobeInDesign/4.0/" xmlns:aid5="http://ns.adobe.com/AdobeInDesign/5.0/" xmlns:math="http://www.w3.org/2005/xpath-functions/math" xmlns:xs="http://www.w3.org/2001/XMLSchema"><title content-type="bookpart-title">Test Title</title><body><title content-type="head-a">Header 1</title><p content-type="p-first"> Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed felis sem, suscipit at sodales eget, faucibus dignissim tellus. Curabitur dictum pulvinar lectus, sit amet ornare ex ultricies vel.<xref ref-type="fn" rid="fn1"/> Suspendisse turpis sem, blandit ut elit eu, pharetra vehicula sem. Suspendisse tincidunt enim at magna auctor lobortis. Aenean egestas ligula purus, non vulputate ipsum porttitor sed. Quisque a maximus magna, eget pellentesque odio. Vivamus porttitor massa ut posuere euismod. Donec vehicula mi non libero dapibus semper. Fusce nec felis vel nulla auctor volutpat. Fusce pellentesque pellentesque nunc ac blandit. Fusce sed erat feugiat massa blandit vehicula.</p></body><title content-type="head-notes">Notes</title><fn id="fn1"><label>1</label><p>Jeff White, <italic>Book Title</italic>. N.D. Accessed January 27, 2013</p></fn></body></book-part></book-part-wrapper>
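
The single-line result is expected: the line breaks in the earlier output were the whitespace-only text nodes copied from the input, and with indent="no" the serializer adds none of its own. A rough sketch of one way around this is to keep the stripping and put the line breaks back explicitly. The stylesheet below is written as a hypothetical second pass over the output of the stylesheet above; the choice of which elements get a line break, and the xsl:preserve-space list, are assumptions based only on the sample shown here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" encoding="UTF-8" indent="no"/>

    <!-- Strip whitespace-only text nodes everywhere ... -->
    <xsl:strip-space elements="*"/>
    <!-- ... except inside mixed-content elements, where a space between
         two inline elements is significant (assumed list) -->
    <xsl:preserve-space elements="p title label italic"/>

    <!-- Identity transform: copy every node and attribute unchanged -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- body: line break after the start tag and after the end tag -->
    <xsl:template match="body">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:text>&#xA;</xsl:text>
            <xsl:apply-templates select="node()"/>
        </xsl:copy>
        <xsl:text>&#xA;</xsl:text>
    </xsl:template>

    <!-- One line break after each title, each footnote, and each
         paragraph that is not nested inside a footnote -->
    <xsl:template match="title|fn|p[not(ancestor::fn)]">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
        <xsl:text>&#xA;</xsl:text>
    </xsl:template>

</xsl:stylesheet>
```

The same two templates could probably be folded into the original stylesheet instead of running a second pass, by splitting its body|label|p|title|fn|italic template into block-level and inline matches; that variant is untested here.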

Tags: xml xslt