Sorting words according to letters of an old Semit

Posted 2019-07-24 17:10

I use XSLT 3.0, Saxon-PE 9.7.

I need to sort orth elements in the alphabetical order of Ugaritic, a language close to Hebrew but with additional characters.

I have tried:

 <xsl:sort select="orth" data-type="text" order="ascending" lang="uga"/>

But the resulting order is not correct, so I think I need to describe the Ugaritic alphabetical order myself. How can I do that?

In advance, thank you very much.

2 answers
爷、活的狠高调
#2 · 2019-07-24 17:26

Saxon allows you to define your own collation in its configuration file. You basically have to set up a configuration file with a section like

 <collations>
      <collation uri="http://example.com/uga-trans"
      rules="&lt; ʾa &lt; b &lt; g &lt; ḫ &lt; d &lt; h &lt; w &lt; z &lt; ḥ &lt; ṭ &lt; y &lt; k &lt; š &lt; l &lt; m &lt; ḏ &lt; n &lt; ẓ &lt; s &lt; ʿ &lt; p &lt; ṣ &lt; q &lt; r &lt; ṯ &lt; ġ &lt; t &lt; ʾi &lt; ʾu &lt; s2"/>
 </collations>

where the uri attribute defines a URI as the name for your collation that you can then use in the collation attribute of an xsl:sort:

            <xsl:perform-sort select="$input-seq">
                <xsl:sort select="string()" collation="http://example.com/uga-trans"/>
            </xsl:perform-sort> 
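For the orth elements from the question, the same xsl:sort can be used inside an xsl:for-each. The following is only a minimal sketch: the element names entries and entry are assumptions about the input document (only orth appears in the question), and the collation URI has to be declared in the Saxon configuration file as shown above, otherwise the processor will not recognise it.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">

  <!-- Assumed input shape: <entries><entry><orth>...</orth></entry>...</entries> -->
  <xsl:template match="entries">
    <xsl:copy>
      <!-- Copy every entry, ordered by its single orth child under the custom collation -->
      <xsl:for-each select="entry">
        <xsl:sort select="orth" collation="http://example.com/uga-trans"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```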

The syntax to be used in the rules attribute is the one defined for the Java class RuleBasedCollator (https://docs.oracle.com/javase/7/docs/api/java/text/RuleBasedCollator.html), which includes an example for Norwegian. The only caveat is that the Java syntax is plain text while the Saxon configuration is XML, so the < that defines the ordering has to be escaped in the rules attribute as &lt;.
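As a small illustration of the escaping, here is a made-up three-letter rule, first as plain RuleBasedCollator text (in the comment) and then as it would appear in the rules attribute. The URI http://example.com/abc-demo is a placeholder invented for this sketch.

```xml
<!-- Java rule text:  < a, A < b, B < c, C
     "<" marks a primary (letter) difference, "," a tertiary (case) difference -->
<collation uri="http://example.com/abc-demo"
           rules="&lt; a, A &lt; b, B &lt; c, C"/>
```

The same applies to the reset operator & of the rule syntax, which would have to be written as &amp;amp; inside the attribute.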

I have set up the rule above based on the transcription sequence presented in the Wikipedia article https://en.wikipedia.org/wiki/Ugaritic_alphabet. I am not sure whether that is the order you are looking for.

You can run Saxon from the command line with -config:yourconfigurationfile.xml to use such a configuration. oXygen has a field in the Saxon-specific transformation scenario dialog to select a configuration file.
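Put together, a complete configuration file might look roughly like the sketch below. The file name uga-config.xml, the jar name and the input/output file names in the comments are assumptions made for illustration; the rules value is the one given above.

```xml
<!-- uga-config.xml (hypothetical file name) -->
<configuration xmlns="http://saxon.sf.net/ns/configuration" edition="PE">
  <collations>
    <!-- The uri value is the name that xsl:sort refers to in its collation attribute -->
    <collation uri="http://example.com/uga-trans"
               rules="&lt; ʾa &lt; b &lt; g &lt; ḫ &lt; d &lt; h &lt; w &lt; z &lt; ḥ &lt; ṭ &lt; y &lt; k &lt; š &lt; l &lt; m &lt; ḏ &lt; n &lt; ẓ &lt; s &lt; ʿ &lt; p &lt; ṣ &lt; q &lt; r &lt; ṯ &lt; ġ &lt; t &lt; ʾi &lt; ʾu &lt; s2"/>
  </collations>
</configuration>
<!-- Possible invocation (jar and file names are assumptions):
     java -cp saxon9pe.jar net.sf.saxon.Transform -config:uga-config.xml -s:input.xml -xsl:sort.xsl -o:sorted.xml -->
```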

Root(大扎)
#3 · 2019-07-24 17:33

I'm not sure this is the best solution, but it's the one I know.

The code you are searching for is:

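      <xsl:sort select="number(orth='character1') * 1 + number(orth='character2') * 2 + number(orth='character3') * 3 ..." data-type="number" order="ascending"/>

You need to add one term for every character of the alphabet. The lower the multiplier, the earlier the value appears in the result, so you are basically defining your own order for specified values. The sort key is a number, so data-type has to be "number" (as text, 10 would sort before 2), and in XSLT 2.0 and later each comparison has to be wrapped in number() because a boolean cannot be multiplied directly. Note that each comparison tests the whole value of orth, so this ranks complete values rather than the letters inside a word.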
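In XSLT 2.0 or later, the same idea can be sketched more compactly with index-of, which returns the position of a value in a sequence. This is an alternative formulation, not the code from this answer: the variable name order and the element names entries and entry are hypothetical, and the placeholder strings have to be replaced with the real values in the desired order.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="xs" version="3.0">

  <!-- Hypothetical ranking list: the values in the order they should appear -->
  <xsl:variable name="order" as="xs:string*"
                select="('character1', 'character2', 'character3')"/>

  <!-- Assumed input shape: <entries><entry><orth>...</orth></entry>...</entries> -->
  <xsl:template match="entries">
    <xsl:copy>
      <xsl:for-each select="entry">
        <!-- 1-based position of the orth value in $order; the key is an xs:integer,
             so it is compared numerically. Values missing from the list yield an
             empty key and sort first. -->
        <xsl:sort select="index-of($order, string(orth))[1]"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Like the expression above, this ranks whole orth values only. Ordering multi-letter words letter by letter is what a collation is for.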
