XML transform for escape sequences

Posted 2019-09-06 09:25

I have the following XML. I want to transform it with XSLT into XSL-FO, adding a line break wherever there is <escape V=".br"/> and making the text between <escape V="H"/> and <escape V="N"/> bold. Any suggestions? Even an XML-to-HTML example would suffice.

<OBX.5>
                Tan<escape V=".br"/>MB BS FRACS PhD<escape
                        V=".br"/>Hospital &amp; Specialist Centre<escape
                        V=".br"/>Address:<escape
                        V=".br"/>XX XX Street<escape
                        V=".br"/>XXX<escape
                        V=".br"/>HUTT 5011<escape
                        V=".br"/>Date of Birth:<escape
                        V=".br"/>15.03.1987<escape
                        V=".br"/>Telephone:<escape
                        V=".br"/>(h) 9888 26846<escape
                        V=".br"/>(m) 0221 632 4590<escape
                        V=".br"/>HI Number:<escape
                        V=".br"/>JAP5065<escape V=".br"/>
    <escape V=".br"/>
                    After the 5 dots is a Bolded Line.....<escape
                        V="H"/> TEST Escape Characters<escape V="N"/>TEST more
    <escape V=".br"/>
</OBX.5>

Tags: xml xslt
1 answer
【Aperson】
Reply #2 · 2019-09-06 10:26

This would work, but explicitly not for nested formatting (e.g. bold + italic):

<!-- the xsl: and fo: prefixes must be declared on the stylesheet element -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

<!-- process only the children of OBX.5 that are outside an H...N range -->
<xsl:template match="OBX.5">
  <fo:block linefeed-treatment="preserve">
    <xsl:apply-templates select="node()[
      count(preceding-sibling::escape[@V = 'H']) 
      - count(preceding-sibling::escape[@V = 'N'])
      = 0
    ]" />
  </fo:block>
</xsl:template>

<!-- add a line break where there is <escape V=".br"/> -->
<xsl:template match="escape[@V = '.br']">
  <xsl:text>&#xA;</xsl:text>
</xsl:template>

<!-- bold the text between <escape V="H"/> and <escape V="N"/> -->
<xsl:template match="escape[@V = 'H']">
  <fo:inline font-weight="bold">
    <xsl:apply-templates select="
      following-sibling::escape[@V = 'N'][1]/preceding-sibling::node()[
        generate-id(preceding-sibling::escape[@V = 'H'][1]) = generate-id(current())
      ]
    " />
  </fo:inline>
</xsl:template>

<!-- all other escapes are ignored -->
<xsl:template match="escape" />

<!-- trim all text nodes before output (optional, remove if unnecessary) -->
<xsl:template match="text()">
  <xsl:value-of select="normalize-space()" />
</xsl:template>

</xsl:stylesheet>

Output for your sample:

<block xmlns="http://www.w3.org/1999/XSL/Format" linefeed-treatment="preserve">Tan
MB BS FRACS PhD
Hospital &amp; Specialist Centre
Address:
XX XX Street
XXX
HUTT 5011
Date of Birth:
15.03.1987
Telephone:
(h) 9888 26846
(m) 0221 632 4590
HI Number:
JAP5065

After the 5 dots is a Bolded Line.....<inline font-weight="bold">TEST Escape Characters</inline>TEST more
</block>

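Note that only escape[@V = '.br'] actually creates a line break. Your sample has no such element between "After the 5 dots is a Bolded Line....." and the following escape[@V = 'H'], so the bold text continues on the same line; what that sentence says cannot be true for your input.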


The XPath expressions I use take some explaining, so here goes.

Imagine the children of <OBX.5>, ignoring whitespace-only text nodes, as this list:

 # Node                                     H   N  H-N 
-------------------------------------------------------
 1 Tan                                      0   0   0
 2 <escape V=".br"/>                        0   0   0
 3 MB BS FRACS PhD                          0   0   0
   ... and so on ...                        .   .   .
29 <escape V=".br"/>                        0   0   0
30 After the 5 dots is a Bolded Line.....   0   0   0
31 <escape V="H"/>                          0   0   0
32 TEST Escape Characters                   1   0   1
33 <escape V="N"/>                          1   0   1
34 TEST more                                1   1   0
35 <escape V=".br"/>                        1   1   0

where

  • H is count(preceding-sibling::escape[@V = 'H'])
  • N is count(preceding-sibling::escape[@V = 'N'])
  • and H-N is the difference of the two, obviously.
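
To check these counts against a real input, a small diagnostic stylesheet along the following lines could print one row per child node. It is only a sketch and not part of the solution: the text output format and the ' | ' separator are arbitrary choices, and xsl:strip-space is an added assumption so that whitespace-only text nodes do not appear as extra rows.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- assumption: drop whitespace-only text nodes so the row numbers
       line up with the list above -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <xsl:apply-templates select="//OBX.5"/>
  </xsl:template>

  <xsl:template match="OBX.5">
    <xsl:for-each select="node()">
      <!-- the same two counts the main stylesheet uses in its predicate -->
      <xsl:variable name="H" select="count(preceding-sibling::escape[@V = 'H'])"/>
      <xsl:variable name="N" select="count(preceding-sibling::escape[@V = 'N'])"/>

      <!-- row number, then a short label for the node -->
      <xsl:value-of select="position()"/>
      <xsl:text> | </xsl:text>
      <xsl:choose>
        <xsl:when test="self::escape">
          <xsl:value-of select="concat('escape ', @V)"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="normalize-space()"/>
        </xsl:otherwise>
      </xsl:choose>

      <!-- H, N and their difference -->
      <xsl:value-of select="concat(' | ', $H, ' | ', $N, ' | ', $H - $N)"/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Every row whose last column is 0 is a node the OBX.5 template hands to xsl:apply-templates directly; rows with a non-zero last column are left to the bold template.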

So with

<xsl:apply-templates select="node()[
  count(preceding-sibling::escape[@V = 'H']) 
  - count(preceding-sibling::escape[@V = 'N'])
  = 0
]" />

we work on all nodes except #32 and #33.

Template #3 gives special treatment to #31, rendering a bold container which contains:

<xsl:apply-templates select="
  following-sibling::escape[@V = 'N'][1]/preceding-sibling::node()[
    generate-id(preceding-sibling::escape[@V = 'H'][1]) = generate-id(current())
  ]
" />

where the XPath translates to

  • go to the first following escape[@V = 'N']
  • from there, look backwards (preceding-sibling) and select all nodes where
  • the ID of the nearest preceding escape[@V = 'H'] is the same as the ID of the current escape[@V = 'H'].

In your example this selects only #32 (#33 is the escape[@V = 'N'] itself, so it is not among its own preceding siblings). Effectively, it slices the list of nodes in a way that prevents rendering more text as bold than required.
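
To see why the generate-id() comparison is needed, consider a made-up input with two bold ranges (it is hypothetical and not taken from the question):

```xml
<!-- hypothetical input, not from the question: two separate H...N ranges -->
<OBX.5>a<escape V="H"/>b<escape V="N"/>c<escape V="H"/>d<escape V="N"/>e</OBX.5>
```

For the second escape[@V = 'H'], the preceding siblings of the second escape[@V = 'N'] are everything from "a" up to "d". Only "d" has that second H as its nearest preceding H, so only "d" ends up in the second bold container, and "b" in the first. Without the predicate, everything before the second N, including "a", "b" and "c", would be selected again.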

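Node #33 itself is never rendered: neither selection picks it up, and template #4 would discard it if it were reached. We only need it for counting purposes and as the end marker of the bold range.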
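
Since the question says an XML-to-HTML example would suffice, the same approach might be adapted roughly as follows. This is a sketch under assumptions: the choice of p, br and b as target elements is mine, and the XPath expressions are copied unchanged from the XSL-FO templates above, so the same limitation regarding nested formatting applies.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" indent="no"/>

  <!-- process only the children that are outside an H...N range -->
  <xsl:template match="OBX.5">
    <p>
      <xsl:apply-templates select="node()[
        count(preceding-sibling::escape[@V = 'H'])
        - count(preceding-sibling::escape[@V = 'N'])
        = 0
      ]"/>
    </p>
  </xsl:template>

  <!-- line break: an HTML br element instead of a literal linefeed -->
  <xsl:template match="escape[@V = '.br']">
    <br/>
  </xsl:template>

  <!-- bold: an HTML b element instead of fo:inline -->
  <xsl:template match="escape[@V = 'H']">
    <b>
      <xsl:apply-templates select="
        following-sibling::escape[@V = 'N'][1]/preceding-sibling::node()[
          generate-id(preceding-sibling::escape[@V = 'H'][1]) = generate-id(current())
        ]
      "/>
    </b>
  </xsl:template>

  <!-- all other escapes are ignored -->
  <xsl:template match="escape"/>

  <!-- trim all text nodes before output (optional, as above) -->
  <xsl:template match="text()">
    <xsl:value-of select="normalize-space()"/>
  </xsl:template>

</xsl:stylesheet>
```

Because line breaks are now real elements, nothing like linefeed-treatment="preserve" is needed on the container.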
