(Note: I cannot change structure of the XML I receive. I am only able to change how I validate it.)
Let's say I can get XML like this:
<Address Field="Street" Value="123 Main"/>
<Address Field="StreetPartTwo" Value="Unit B"/>
<Address Field="State" Value="CO"/>
<Address Field="Zip" Value="80020"/>
<Address Field="SomeOtherCrazyValue" Value="Foo"/>
I need to create an XSD schema that validates that "Street", "State" and "Zip" are all present. I don't care whether "StreetPartTwo" and/or "SomeOtherCrazyValue" happen to be present too.
If I knew that only the three I care about could be included (and that each would only be included once), I could do something like this:
<xs:element name="Address" type="addressType" maxOccurs="unbounded" minOccurs="3"/>
<xs:complexType name="addressType">
  <xs:attribute name="Field" use="required">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:enumeration value="Street"/>
        <xs:enumeration value="State"/>
        <xs:enumeration value="Zip"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:attribute>
</xs:complexType>
But this won't work with my case because I may also receive those other Address elements (that also have "Field" attributes) that I don't care about.
Any ideas how I can ensure the stuff I care about is present but let the other stuff in too?
TIA! Sean
I had the same problem as you, but got around it with a trick.
Request XML:
Now use an XSLT to convert it.
XSLT code:
The XML becomes:
Now apply the XSD.
You cannot do the validation you seek, with just XML Schema.
According to the "XML Schema Part 1: Structures" specification ...
It's not to say that you cannot build a schema that will validate a correct document. What it means is, you cannot build a schema that will fail to validate on some incorrect documents. And when I say "incorrect", I mean documents that violate the constraints you stated in English.
For example, suppose you have a document that includes three Street elements, like this:
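A document of that kind might look like the following sketch (the street values are made up for illustration):

```xml
<!-- Three Address elements, all with Field="Street"; State and Zip are missing -->
<Address Field="Street" Value="123 Main"/>
<Address Field="Street" Value="456 Oak"/>
<Address Field="Street" Value="789 Elm"/>
```

Each element carries a Field value from the enumeration and there are at least three of them, so nothing in the schema from the question rejects it.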
According to your schema, that document is a valid address. It's possible to add an xs:unique constraint to your schema so that it would reject such broken documents. But even with an xs:unique, validating against such a schema would declare that some other incorrect documents are valid: for example, a document with three <Address> elements, each of which has a unique Field attribute, but none of which has Field="Zip".
In fact it is not possible to produce a W3C XML Schema that formally codifies your stated constraints. The <xs:all> element almost gets you there, but it applies only to elements, not to attributes. Also, it cannot be used with an extension, so you can't say, in W3C XML Schema, "all these elements in any order, plus any other ones".
In order to perform the validation you seek, your options are:
For the first option, I think you could use Relax NG to do it. The downside is that it is not a W3C standard (it is standardized separately, as ISO/IEC 19757-2), and as far as I can tell it is neither widely supported nor growing. It would be like learning Gaelic in order to express a thought. There's nothing wrong with Gaelic, but it's sort of a linguistic cul-de-sac, and I think Relax NG is, too.
For the second option, an approach would be to validate against your schema as the first step, and then, as the second step:
A. Apply an XSL transform that converts <Address> elements into elements named for the value of their Field attribute. The output of that transform would look like this:
B. Validate the output of that transform against a different schema, which looks something like this:
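A minimal sketch of such a second-step schema follows. It assumes the transform wraps its output in a single root element, here hypothetically named AddressFields, and writes each Value as element text (for example <Zip>80020</Zip>). Neither of those choices comes from the original XML.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the root name "AddressFields" is an assumption -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="AddressFields">
    <xs:complexType>
      <!-- xs:all: each child must appear exactly once, in any order -->
      <xs:all>
        <xs:element name="Street" type="xs:string"/>
        <xs:element name="State" type="xs:string"/>
        <xs:element name="Zip" type="xs:string"/>
      </xs:all>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Because xs:all defaults to minOccurs="1" and maxOccurs="1" for each child, a missing Zip or a duplicated Street fails validation.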
You would need to extend that schema to handle other elements like <SomeOtherCrazyValue> in the output of the transform. Or you could structure the XSL transform to just not emit elements that are not one of {State, Street, Zip}.
Just to be clear, I understand that you cannot change the XML that you receive. This approach wouldn't require that. It just uses a funky 2-step validation approach. Once the 2nd validation step completes, you could discard the result of the transform.
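The transform in step A, restricted to the three required fields, could be sketched in XSLT 1.0 roughly as below. It assumes the <Address> elements are in no namespace and sit somewhere under a single document root. The wrapper name AddressFields is made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Wrap the result in one root element; "AddressFields" is a hypothetical name -->
  <xsl:template match="/">
    <AddressFields>
      <!-- Only the three required fields are emitted; all other Address elements are skipped -->
      <xsl:apply-templates select="//Address[@Field='Street' or @Field='State' or @Field='Zip']"/>
    </AddressFields>
  </xsl:template>

  <!-- Turn <Address Field="Zip" Value="80020"/> into <Zip>80020</Zip> -->
  <xsl:template match="Address">
    <xsl:element name="{@Field}">
      <xsl:value-of select="@Value"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

Since unrecognized fields never reach the output, the second-step schema only has to describe Street, State and Zip.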
EDIT - Actually, Sean, thinking about this again, you could just use step B. Suppose your XSL transform just removes from the document those <Address> elements that do not have State, Street or Zip for the Field attribute value. In other words, there would be no <Address Field="SomeOtherCrazyValue"...> in the output. The result of that transform could be validated with your schema, using maxOccurs="3", minOccurs="3", and an xs:unique.
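A filtering transform of that kind can be sketched as an XSLT 1.0 identity transform with one extra empty template. This assumes the <Address> elements are in no namespace.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Identity template: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Empty template: drop any Address whose Field is not Street, State or Zip -->
  <xsl:template match="Address[not(@Field='Street' or @Field='State' or @Field='Zip')]"/>
</xsl:stylesheet>
```

The more specific Address pattern takes priority over the generic identity template, so everything else in the document passes through untouched.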