Using XSLT 2.0.
I need to filter out all elements that have an @xml:lang attribute whose value is not in a list of allowed values that I define. Example allowed values: x-default, en, en-US, en-GB.
When @xml:lang is detected on any element and an x-default version exists, any sibling element of the same type whose @xml:lang value is something other than x-default should have its text value compared to that of the x-default element and, if the text is the same, be removed.
To say that another way, any sibling duplicates of @xml:lang="x-default" should be removed, based on a comparison of the elements' text values.
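The basic rule above could be sketched roughly as follows. This is only a minimal illustration, assuming that "same type" means "same element name under the same parent" and that text values are compared exactly, without whitespace normalization. It covers only the plain x-default comparison and not the tiered ranking described further down.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything unless a more specific template matches -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop a non-default element when a sibling with the same name carries
       xml:lang="x-default" and has exactly the same string value.
       Assumption: exact string equality is the intended comparison. -->
  <xsl:template match="*[@xml:lang ne 'x-default']
                        [some $d in ../*[@xml:lang eq 'x-default']
                         satisfies (node-name($d) eq node-name(.)
                                    and string($d) eq string(.))]"/>

</xsl:stylesheet>
```

Inside the `some ... satisfies` clause the context item is still the element being tested, so `.` refers to the candidate and `$d` to its x-default sibling. Elements without @xml:lang never match, because `@xml:lang ne 'x-default'` yields an empty sequence for them.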
Bonus points if it's possible to rank the order of duplicates, such that x-default is always chosen (if it exists), followed by a second tier (en, fr, ru), followed by a third tier (en-EN, en-GB, fr-FR, ru-RU), where second-tier duplicates of the first tier are removed, and the third tier is compared to the second tier (if it exists), or else to the first tier, so that the third tier is also removed if it is a duplicate. This would need to be handled dynamically, as there are many possible languages.
A special case that should also be considered is a situation where the first tier (x-default) has "some value", the second tier (en) has "some valuation", and the third tier (en-US) has "some value". In this situation there is no duplicate to remove, because the second tier exists and the third tier does not match it.
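One way the tiered comparison might be expressed is sketched below, under the assumption that the tiers can be derived from the language tag itself: stripping the last hyphenated subtag gives the next tier up (en-US falls back to en), and a tag without a hyphen falls back to x-default. The `f:` namespace URI and the function names `f:fallback` and `f:reference` are hypothetical, introduced only for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:lang-tiers"
    exclude-result-prefixes="xs f">

  <!-- Identity template: copy everything unless a more specific template matches -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Next more general tag: en-US gives en, en gives x-default,
       x-default gives the empty sequence (hypothetical helper) -->
  <xsl:function name="f:fallback" as="xs:string?">
    <xsl:param name="lang" as="xs:string?"/>
    <xsl:sequence select="
      if (empty($lang) or $lang eq 'x-default') then ()
      else if (contains($lang, '-')) then replace($lang, '-[^-]*$', '')
      else 'x-default'"/>
  </xsl:function>

  <!-- Nearest existing sibling of the same name, starting at tier $lang
       and walking up the fallback chain (hypothetical helper) -->
  <xsl:function name="f:reference" as="element()?">
    <xsl:param name="e" as="element()"/>
    <xsl:param name="lang" as="xs:string?"/>
    <xsl:variable name="hit"
        select="$e/../*[node-name(.) eq node-name($e)][@xml:lang eq $lang][1]"/>
    <xsl:sequence select="
      if (empty($lang)) then ()
      else if ($hit) then $hit
      else f:reference($e, f:fallback($lang))"/>
  </xsl:function>

  <!-- Drop an element whose text equals that of its nearest existing
       more general sibling; x-default itself has no reference and is kept -->
  <xsl:template match="*[@xml:lang]
                        [string(.) eq f:reference(., f:fallback(@xml:lang))/string(.)]"/>

</xsl:stylesheet>
```

In the special case above, en-US is compared with en because en exists, the two texts differ, and so nothing is removed. The comparison is made against the source siblings rather than the filtered output; that should be equivalent here, since a removed element always has the same text as the element it was compared with.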
My current XSLT is below. It doesn't attempt to remove duplicates, as I haven't found a sure-fire solution yet and my attempts so far have failed miserably. This is not my ideal XSLT, just the best I know how to build currently, and it is able to filter down the data set. The programmer in me would like to see all the or's changed to an array-value check so the values can be managed more cleanly, but I'm not sure if that's doable in XSLT:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xpath-default-namespace="http://some.namespace/uri">
<xsl:strip-space elements="*"/>
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<!-- Select everything except comment/processing instruction nodes -->
<xsl:template match="attribute()|element()|text()">
<xsl:copy>
<xsl:apply-templates select="attribute()|element()|text()"/>
</xsl:copy>
</xsl:template>
<!-- Remove categories & assignments not in our whitelist -->
<xsl:template match="//*[@category-id and not(@category-id='root'
or @category-id='men' or @category-id='men_clothing' or @category-id='men_clothing_tshirts'
or @category-id='sales' or @category-id='sales_men' or @category-id='sales_men_tees'
or @category-id='sales' or @category-id='sales_women' or @category-id='sales_women_tanks-teeshirts'
or @category-id='clothing' or @category-id='clothing_teeshirts'
or @category-id='kids' or @category-id='kids_0816'
or @category-id='men' or @category-id='men_shoes' or @category-id='men_shoes_skate'
or @category-id='sales' or @category-id='sales_men' or @category-id='sales _men_shoes'
)]"/>
<!-- Remove locales not default or in whitelist -->
<xsl:template match="//*[@xml:lang and not(@xml:lang='x-default' or @xml:lang='en' or @xml:lang='en-US' or @xml:lang='en-CA' or @xml:lang='en-GB' or @xml:lang='fr' or @xml:lang='fr-FR' or @xml:lang='ru' or @xml:lang='ru-RU')]"/>
<!-- Remove empty nodes -->
<xsl:template match="*[not(normalize-space()) and not(.//@*)]"/>
</xsl:stylesheet>
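As for the array-value check, XSLT 2.0 sequences combined with the general comparison operator `=` may be close to what is wanted: `=` is true if any item on the left equals any item on the right. The following is a minimal sketch for the locale whitelist only; the variable name `allowed-langs` is an assumption chosen for illustration, and the same idea would presumably apply to the @category-id list.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Whitelist kept in one place as a sequence of strings
       (hypothetical variable name) -->
  <xsl:variable name="allowed-langs" as="xs:string+"
      select="('x-default', 'en', 'en-US', 'en-CA', 'en-GB',
               'fr', 'fr-FR', 'ru', 'ru-RU')"/>

  <!-- Identity template: copy everything unless a more specific template matches -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Remove any element whose xml:lang is not one of the allowed values -->
  <xsl:template match="*[@xml:lang][not(@xml:lang = $allowed-langs)]"/>

</xsl:stylesheet>
```

Global variables may be referenced in match patterns in XSLT 2.0, so the list can be edited without touching the template. Declaring it as an `xsl:param` instead would allow the list to be supplied from outside the stylesheet.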
Example dataset below. In a real dataset there are many more elements, so again the logic to remove duplicate @xml:lang entries must not be hard-coded to the XPath you might deduce from this sample, but should instead operate on groups of same-type elements that are adjacent to each other.
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="http://some.namespace/uri" catalog-id="catalog-products">
<category category-id="sales_women">
<display-name xml:lang="x-default"><![CDATA[Women's Sales]]></display-name>
<display-name xml:lang="en"><![CDATA[Sales for Women]]></display-name>
<display-name xml:lang="en-US"><![CDATA[Women's Sales]]></display-name>
</category>
<product product-id="111111111">
<display-name xml:lang="x-default"><![CDATA[Aurora]]></display-name>
<display-name xml:lang="de"><![CDATA[Aurora]]></display-name>
<display-name xml:lang="en"><![CDATA[Aurora]]></display-name>
<display-name xml:lang="en-US"><![CDATA[Aurora Fleece]]></display-name>
<display-name xml:lang="es"><![CDATA[Aurora]]></display-name>
<display-name xml:lang="fr"><![CDATA[Aurora]]></display-name>
<display-name xml:lang="ru"><![CDATA[Aurora]]></display-name>
<short-description xml:lang="de"><![CDATA[Aurora - Fleece-Top für Damen]]></short-description>
<short-description xml:lang="en"><![CDATA[Aurora - Sweatshirt for women]]></short-description>
<short-description xml:lang="x-default"><![CDATA[Aurora - Sweatshirt for women]]></short-description>
<short-description xml:lang="en-US"><![CDATA[Snow Fleece & Softshells - Aurora Fleece]]></short-description>
<short-description xml:lang="es"><![CDATA[Aurora - Top polar de mujer]]></short-description>
<short-description xml:lang="fr"><![CDATA[Aurora - haut en polaire femme]]></short-description>
<short-description xml:lang="ru"><![CDATA[Свитшот SomeBrand для девушек]]></short-description>
<long-description xml:lang="de"><![CDATA[<p class="productLongDescriptionTitle"></p><p class="productLongDescriptionSubTitle">Composition</p><p>100 % Polyester</p>]]></long-description>
<long-description xml:lang="en"><![CDATA[<p class="productLongDescriptionTitle"></p><p class="productLongDescriptionSubTitle">Composition</p><p>100% Polyester</p>]]></long-description>
<long-description xml:lang="x-default"><![CDATA[<p class="productLongDescriptionTitle"><p class="productLongDescriptionSubTitle">Composition</p><p>100% Polyester</p>]]></long-description>
<long-description xml:lang="en-US"><![CDATA[<p class="productLongDescriptionTitle"></p><p>The stretchy, polar fleece Aurora zip-up shields you from the elements with street-savvy style to have you standing out on the slopes and the sidewalk. Designed with a tailored fit, tech details include zippered hand warmer pockets, a lyrca binding finish, a chest pocket, flatlock seams for smooth comfort, and ergonomic seams for support. Imported. 100% polyester polar fleece.</p><p class="productLongDescriptionSubTitle">Composition</p><p>100% Polyester
Polar Fleece</p>]]></long-description>
<long-description xml:lang="es"><![CDATA[<p class="productLongDescriptionSubTitle">Composition</p><p>100% poliéster</p>]]></long-description>
<long-description xml:lang="fr"><![CDATA[<p class="productLongDescriptionSubTitle">Composition</p><p>100 % polyester</p>]]></long-description>
<long-description xml:lang="ru"><![CDATA[<p>Женский свитшот SomeBrand из зимней коллекции одежды 2014. Характеристики: влаговыводящая технология DRY-FLIGHT, эластичный флис из полиэстера (250 г), теплые карманы на молнии для ладошек.</p><p class="productLongDescriptionSubTitle"></p><p>100% полиэстер</p>]]></long-description>
<!-- === PICTURES === -->
<images>
<image-group view-type="hi-res">
<image path="catalog-products/all/default/hi-res/111111111_aurora,v_kpv0_frt1.jpg">
<alt xml:lang="x-default"><![CDATA[Aurora 111111111]]></alt>
<title xml:lang="x-default"><![CDATA[Aurora 111111111]]></title>
</image>
<image path="catalog-products/all/default/hi-res/111111111_aurora,v_kpv0_frt2.jpg">
<alt xml:lang="x-default"><![CDATA[Aurora 111111111]]></alt>
<title xml:lang="x-default"><![CDATA[Aurora 111111111]]></title>
</image>
<image path="catalog-products/all/default/hi-res/111111111_aurora,v_kpv0_bck1.jpg">
<alt xml:lang="x-default"><![CDATA[Aurora 111111111]]></alt>
<title xml:lang="x-default"><![CDATA[Aurora 111111111]]></title>
</image>
</image-group>
</images>
</product>
</catalog>
Example of the desired output dataset:
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="http://some.namespace/uri" catalog-id="catalog-products">
<category category-id="sales_women">
<display-name xml:lang="x-default"><![CDATA[Women's Sales]]></display-name>
<display-name xml:lang="en"><![CDATA[Sales for Women]]></display-name>
<display-name xml:lang="en-US"><![CDATA[Women's Sales]]></display-name>
</category>
<product product-id="111111111">
<display-name xml:lang="x-default"><![CDATA[Aurora]]></display-name>
<display-name xml:lang="en-US"><![CDATA[Aurora Fleece]]></display-name>
<short-description xml:lang="de"><![CDATA[Aurora - Fleece-Top für Damen]]></short-description>
<short-description xml:lang="x-default"><![CDATA[Aurora - Sweatshirt for women]]></short-description>
<short-description xml:lang="en-US"><![CDATA[Snow Fleece & Softshells - Aurora Fleece]]></short-description>
<short-description xml:lang="es"><![CDATA[Aurora - Top polar de mujer]]></short-description>
<short-description xml:lang="fr"><![CDATA[Aurora - haut en polaire femme]]></short-description>
<short-description xml:lang="ru"><![CDATA[Свитшот SomeBrand для девушек]]></short-description>
<long-description xml:lang="x-default"><![CDATA[<p class="productLongDescriptionTitle"><p class="productLongDescriptionSubTitle">Composition</p><p>100% Polyester</p>]]></long-description>
<long-description xml:lang="en-US"><![CDATA[<p class="productLongDescriptionTitle"></p><p>The stretchy, polar fleece Aurora zip-up shields you from the elements with street-savvy style to have you standing out on the slopes and the sidewalk. Designed with a tailored fit, tech details include zippered hand warmer pockets, a lyrca binding finish, a chest pocket, flatlock seams for smooth comfort, and ergonomic seams for support. Imported. 100% polyester polar fleece.</p><p class="productLongDescriptionSubTitle">Composition</p><p>100% Polyester
Polar Fleece</p>]]></long-description>
<long-description xml:lang="es"><![CDATA[<p class="productLongDescriptionSubTitle">Composition</p><p>100% poliéster</p>]]></long-description>
<long-description xml:lang="ru"><![CDATA[<p>Женский свитшот SomeBrand из зимней коллекции одежды 2014. Характеристики: влаговыводящая технология DRY-FLIGHT, эластичный флис из полиэстера (250 г), теплые карманы на молнии для ладошек.</p><p class="productLongDescriptionSubTitle"></p><p>100% полиэстер</p>]]></long-description>
<!-- === PICTURES === -->
<images>
<image-group view-type="hi-res">
<image path="catalog-products/all/default/hi-res/111111111_aurora,v_kpv0_frt1.jpg">
<alt xml:lang="x-default"><![CDATA[Aurora 111111111]]></alt>
<title xml:lang="x-default"><![CDATA[Aurora 111111111]]></title>
</image>
<image path="catalog-products/all/default/hi-res/111111111_aurora,v_kpv0_frt2.jpg">
<alt xml:lang="x-default"><![CDATA[Aurora 111111111]]></alt>
<title xml:lang="x-default"><![CDATA[Aurora 111111111]]></title>
</image>
<image path="catalog-products/all/default/hi-res/111111111_aurora,v_kpv0_bck1.jpg">
<alt xml:lang="x-default"><![CDATA[Aurora 111111111]]></alt>
<title xml:lang="x-default"><![CDATA[Aurora 111111111]]></title>
</image>
</image-group>
</images>
</product>
</catalog>