How to ignore the validation of Unknown tags?

Posted 2019-01-27 22:07

One more challenge to XSD's capabilities.

My clients send me XML files that contain zero or more undefined (call them unexpected) tags, which may appear anywhere in the hierarchy. These tags are redundant for me, so I have to ignore their presence, but alongside them there is a set of tags that must be validated.

This is a sample XML:

<root>
  <undefined_1>one</undefined_1>
  <undefined_2>two</undefined_2>
  <node>to_be_validated</node>
  <undefined_3>two</undefined_3>
  <undefined_4>two</undefined_4>
</root>

And the XSD I tried with:

  <xs:element name="root" type="root"></xs:element>
  <xs:complexType name="root">
    <xs:sequence>
      <xs:any maxOccurs="2" minOccurs="0"/>
      <xs:element name="node" type="xs:string"/>
      <xs:any maxOccurs="2" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

XSD doesn't allow this, for certain reasons.
The example above is just a sample; the real XML comes with a complex hierarchy of tags.

Kindly let me know if you know of a workaround.

By the way, the alternative solution is to insert an XSL transformation before the validation process. I am avoiding it because I would need to change the .NET code that triggers the validation, and my company is very reluctant to support that.
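
For reference, the XSL transformation mentioned as the alternative could look roughly like the sketch below. It assumes XSLT 1.0 and the sample document from the question, where `node` is the only known child of `root`; a real schema with a deeper hierarchy would need a similar drop rule for each parent element, and the list of known names here is purely illustrative.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy every node and attribute unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop every child element of root that is not a known tag.
       Assumption: "node" is the only known tag, as in the sample XML. -->
  <xsl:template match="/root/*[not(self::node)]"/>

</xsl:stylesheet>
```

Applied to the sample XML, this leaves only `<root>` with its `<node>` child, so a strict schema that declares just `node` can then be used for validation.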

5 answers
forever°为你锁心
Reply #2 · 2019-01-27 23:02

I faced the same problem.

Since I called the validation from .NET, I decided to suppress the specific validation error in the ValidationEventHandler as a workaround. It worked for me.

    private void ValidationEventHandler(object sender, ValidationEventArgs e)
    {
        switch (e.Severity)
        {
            case XmlSeverityType.Warning:
                // Processing warnings
                break;
            case XmlSeverityType.Error:
                if (IgnoreUnknownTags
                    && e.Exception is XmlSchemaValidationException
                    && new Regex(
                        @"The element '.*' has invalid child element '.*'\."
                        + @" List of possible elements expected:'.*'\.")
                       .IsMatch(e.Exception.Message))
                {
                    return;
                }
                // Processing errors
                break;
            default:
                throw new InvalidEnumArgumentException("Severity should be one of the valid values");
        }
    }

For this to work, Thread.CurrentUICulture must be set to English or CultureInfo.InvariantCulture for the current thread, because the regular expression matches the English text of the error message.

小情绪 Triste *
Reply #3 · 2019-01-27 23:06

Conclusion:

This is not possible with XSD. Every approach I tried in order to meet the requirement was reported as "ambiguous" by the validation tools, accompanied by a bunch of errors.

Lonely孤独者°
Reply #4 · 2019-01-27 23:06

You could make use of a new feature in XSD 1.1 (XML Schema 1.1) called "Open Content". In short, it allows you to specify that additional "unknown" elements can be added to a complex type in various positions, and what the parser should do if it hits any of those elements.

Using XSD 1.1, your complex type would become:

<xs:element name="root" type="root" />
<xs:complexType name="root"> 
  <xs:openContent mode="interleave">
    <xs:any namespace="##any" processContents="skip"/>
  </xs:openContent>

  <xs:sequence> 
    <xs:element name="node" type="xs:string"/> 
  </xs:sequence> 
</xs:complexType>

If you have a lot of complex types, you can also set a "default" open content mode at the top of your schema:

<xs:schema ...>
  <xs:defaultOpenContent mode="interleave">
    <xs:any namespace="##any" processContents="skip"/>
  </xs:defaultOpenContent>

  ...
</xs:schema>

The W3C spec for Open Content can be found at http://www.w3.org/TR/xmlschema11-1/#oc and there's a good writeup of this at http://www.ibm.com/developerworks/library/x-xml11pt3/#N102BA.

Unfortunately, .NET doesn't support XSD 1.1 as of yet, and I can't find any free XSD 1.1 processors, but a couple of paid-for options are:

Juvenile、少年°
Reply #5 · 2019-01-27 23:12

In case you're not already done with this, you might try the following:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root" type="root"></xs:element>
  <xs:complexType name="root">
    <xs:sequence>
      <xs:any maxOccurs="2" minOccurs="0" processContents="skip"/>
      <xs:element name="node" type="xs:string"/>
      <xs:any maxOccurs="2" minOccurs="0" processContents="skip"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

Under Linux this works fine with xmllint using libxml version 20706.

Reply #6 · 2019-01-27 23:14

Maybe it is possible to use namespaces:

<xs:element name="root" type="root"></xs:element> 
  <xs:complexType name="root"> 
    <xs:sequence> 
      <xs:any maxOccurs="2" minOccurs="0" namespace="http://ns1.com" /> 
      <xs:element name="node" type="xs:string"/> 
      <xs:any maxOccurs="2" minOccurs="0" namespace="http://ns2.com"/> 
    </xs:sequence> 
  </xs:complexType>

This will probably validate.
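
To illustrate, an instance document shaped for this schema might look like the sketch below. It assumes the clients can be asked to put their extra tags into the two namespaces named in the schema; the prefixes `a` and `b` are arbitrary. Note also that `xs:any` defaults to `processContents="strict"`, so for undeclared elements to pass, the wildcards would most likely also need `processContents="skip"` (or `"lax"`).

```xml
<!-- Hypothetical instance: unknown tags before "node" live in http://ns1.com,
     unknown tags after it live in http://ns2.com; "node" itself is unqualified. -->
<root xmlns:a="http://ns1.com" xmlns:b="http://ns2.com">
  <a:undefined_1>one</a:undefined_1>
  <a:undefined_2>two</a:undefined_2>
  <node>to_be_validated</node>
  <b:undefined_3>two</b:undefined_3>
  <b:undefined_4>two</b:undefined_4>
</root>
```

Because the wildcards are restricted to namespaces that `node` does not belong to, a validator can always tell whether an element matches a wildcard or the `node` declaration.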
