
XSD restriction that negates a matching string

Posted 2019-08-19 12:32

Question:

I want my XSD to validate the contents of a string. To be specific, I want to validate that a certain string does not occur.

Consider this rule, which verifies that my string does occur: every Link element must start with the string /site/example.com

<xs:element name="Link" minOccurs="0">
  <xs:simpleType>
    <xs:restriction base="xs:token">
      <xs:pattern value="(/site/example\.com).*"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

In other words, the expression above verifies that all Link elements start with /site/example.com. How do you invert it, so that it verifies that no Link element starts with /site/example.com?

I tried the following regexp with no luck: /[^(site/example\.com)].*

Not-working strategy 1 (negation of a single character)

I am aware that this approach would probably work for negating a single character, since this SO question does exactly that: XML schema restriction pattern for not allowing empty strings

The pattern suggested in that question is <xs:pattern value=".*[^\s].*" />

But negating a single character does not work in this case, because a bracket expression excludes a set of characters rather than a string. It would correctly fail:

/site/example.com

but it would also incorrectly fail

/solutions
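
The failure of strategy 1 is easy to reproduce outside a schema. The sketch below uses Python's re module as a stand-in for the XSD pattern facet; re.fullmatch is used on the assumption that it mimics the implicit anchoring of XSD patterns, and the sample path /blog is made up for illustration.

```python
import re

# The bracket expression is a character class: it excludes the individual
# characters ( s i t e / x a m p l . c o ), not the string "site/example.com".
ATTEMPT = r"/[^(site/example\.com)].*"

def accepted(value: str) -> bool:
    # An XSD pattern has to match the whole value, hence fullmatch.
    return re.fullmatch(ATTEMPT, value) is not None

for path in ["/site/example.com", "/solutions", "/blog"]:
    print(path, accepted(path))
```

Any path whose second character happens to be one of the listed characters is rejected, which is why /solutions fails while /blog passes.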

Not-working strategy 2 (advanced regexp lookahead)

According to this SO question (Regular expression to match a line that doesn't contain a word?), you could solve this with a negative lookahead, (?!expr).

So this will work in ordinary regexp:

^((?!/site/example.com).)*$
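
In an engine that does support lookahead, the idea can be checked directly. This is a minimal Python sketch with made-up sample values. Note that the pattern above rejects any value that contains the string anywhere, while a single lookahead at the start is enough when only the prefix matters. The dot in example.com is escaped here, which the pattern above leaves out.

```python
import re

FORBIDDEN = r"/site/example\.com"

# Lookahead before every character: the string may not occur anywhere.
NOT_CONTAINING = re.compile(r"^((?!" + FORBIDDEN + r").)*$")
# One lookahead at the start: only the prefix is forbidden.
NOT_STARTING = re.compile(r"^(?!" + FORBIDDEN + r").*$")

for value in ["/solutions", "/site/example.com/a", "/x/site/example.com"]:
    print(value, bool(NOT_CONTAINING.match(value)), bool(NOT_STARTING.match(value)))
```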

Now, unfortunately, XSD validation supports only a limited regexp dialect. According to this site, lookaheads are not supported: regular-expressions.info -- xsd

This pretty much describes what I have tried so far.

My question is: how do I negate a regular expression in an XSD schema?

Answer 1:

This is simpler to do in XSD 1.1, where you can use assertions to ensure that the value does not begin with the string you specify. But conceptually, it's simple enough even in XSD 1.0 and simple regular expressions: you want to ensure that the string does not begin with "/site/example.com". If it did begin that way, you'd have a logical conjunction of a series of facts about the string:

  • substring(., 1, 1) = '/'
  • substring(., 2, 1) = 's'
  • substring(., 3, 1) = 'i'
  • ...
  • substring(., 17, 1) = 'm'

You want to negate this conjunction of facts. Now, by De Morgan's Laws, ~(a and b and ... and z) is equivalent to (~a or ~b or ... or ~z). So you can do what you need by writing a disjunction of the following terms:

    ([^/].*)?
    |.([^s].*)?
    |.{2}([^i].*)?
    |.{3}([^t].*)?
    |.{4}([^e].*)?
    |.{5}([^/].*)?
    |.{6}([^e].*)?
    |.{7}([^x].*)?
    |.{8}([^a].*)?
    |.{9}([^m].*)?
    |.{10}([^p].*)?
    |.{11}([^l].*)?
    |.{12}([^e].*)?
    |.{13}([^\.].*)?
    |.{14}([^c].*)?
    |.{15}([^o].*)?
    |.{16}([^m].*)?

In each term above the subexpression of the form [^s].* has been wrapped in (...)? -- the term .{2}([^i].*)? means any string beginning with two characters is OK if the third character is not an i or if there is no third character at all. This ensures that strings shorter than 17 characters in length are not excluded, even if they happen to be prefixes of the forbidden string.

Of course, to use this in an XSD schema document, you will need to remove all the whitespace, which makes the regex harder to read.
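
Typing the whitespace-free version by hand is error-prone, so it may be easier to generate it. The following Python sketch builds the disjunction for an arbitrary prefix and checks it with re.fullmatch, on the assumption that Python's re behaves like the XSD pattern facet for this small subset of syntax. The function name not_starting_with and the sample values are hypothetical. The first term is wrapped in (...)? as well, so that an empty value is accepted too.

```python
import re

def not_starting_with(prefix: str) -> str:
    """Build a lookahead-free pattern for 'does not start with prefix'."""
    def negated(ch: str) -> str:
        # Escape the few characters that are special inside a character class.
        return "[^" + ("\\" + ch if ch in "\\[]^-." else ch) + "]"

    terms = []
    for i, ch in enumerate(prefix):
        # Skip the first i characters, then require a mismatch or end of string.
        skip = "" if i == 0 else "." if i == 1 else ".{%d}" % i
        terms.append(skip + "(" + negated(ch) + ".*)?")
    return "|".join(terms)

PREFIX = "/site/example.com"
pattern = not_starting_with(PREFIX)
print(pattern)  # single line, no whitespace

# Sanity check against the plain string test.
for value in ["", "/", "/solutions", "/site/example.co",
              "/site/example.com", "/site/example.com/a"]:
    ok = re.fullmatch(pattern, value) is not None
    assert ok == (not value.startswith(PREFIX))
    print(repr(value), ok)
```

The printed string can be pasted into the value attribute of xs:pattern.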

[Addition, June 2016] See also this related and more general question.



Answer 2:

You don't mention whether you are bound to XML Schema 1.0 and XPath 1.0. If you are not, XSD 1.1 lets you accomplish your goal with an xs:assertion facet, along these lines (written from memory, so it may need some work):

<xs:element name="Link" minOccurs="0">
  <xs:simpleType>
    <xs:restriction base="xs:token">
      <xs:assertion test="not(starts-with($value, '/site/example.com'))"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

Some links of possible interest:

http://www.ibm.com/developerworks/library/x-xml11pt2/

http://www.w3.org/TR/xpath-functions/#func-starts-with

Cheers,