schema validation with msxml in delphi

2019-01-22 10:08发布

问题:

I'm trying to validate an XML file against the schemas it references. (Using Delphi and MSXML2_TLB.) The (relevant part of the) code looks something like this:

procedure TfrmMain.ValidateXMLFile;
var
    xml: IXMLDOMDocument2;
    err: IXMLDOMParseError;
    schemas: IXMLDOMSchemaCollection;
begin
    xml := ComsDOMDocument.Create;
    if xml.load('Data/file.xml') then
    begin
        schemas := xml.namespaces;
        if schemas.length > 0 then
        begin
            xml.schemas := schemas;
            err := xml.validate;
        end;
    end;
end;

This has the result that cache is loaded (schemas.length > 0), but then the next assignment raises an exception: "only XMLSchemaCache-schemacollections can be used."

How should I go about this?

回答1:

I've come up with an approach that seems to work. I first load the schema's explicitly, then add themn to the schemacollection. Next I load the xml-file and assign the schemacollection to its schemas property. The solution now looks like this:

uses MSXML2_TLB  
That is:  
// Type Lib: C:\Windows\system32\msxml4.dll  
// LIBID: {F5078F18-C551-11D3-89B9-0000F81FE221}

function TfrmMain.ValidXML(
    const xmlFile: String; 
    out err: IXMLDOMParseError): Boolean;
var
    xml, xsd: IXMLDOMDocument2;
    cache: IXMLDOMSchemaCollection;
begin
    xsd := CoDOMDocument40.Create;
    xsd.Async := False;
    xsd.load('http://the.uri.com/schemalocation/schema.xsd');

    cache := CoXMLSchemaCache40.Create;
    cache.add('http://the.uri.com/schemalocation', xsd);

    xml := CoDOMDocument40.Create;
    xml.async := False;
    xml.schemas := cache;

    Result := xml.load(xmlFile);
    if not Result then
      err := xml.parseError
    else
      err := nil;
end;

It is important to use XMLSchemaCache40 or later. Earlier versions don't follow the W3C XML Schema standard, but only validate against XDR Schema, a MicroSoft specification.

The disadvantage of this solution is that I need to load the schema's explicitly. It seems to me that it should be possible to retrieve them automatically.



回答2:

While BennyBechDk might be on the right track, I have a few problems with his code that I'm going to correct below:

uses Classes, XMLIntf, xmlDoc, SysUtils;

function IsValidXMLDoc(aXmlDoc: IXMLDocument): boolean;
var
  validateDoc: IXMLDocument;
begin
  result := false;  // eliminate any sense of doubt, it starts false period.
  validateDoc := TXMLDocument.Create(nil);
  try   
    validateDoc.ParseOptions := [poResolveExternals, poValidateOnParse];
    validateDoc.XML := aXmlDoc.XML;
    validateDoc.Active := true;
    Result := True;
  except
    // for this example, I am going to eat the exception, normally this
    // exception should be handled and the message saved to display to 
    // the user.
  end;
end;

If you wanted the system to just raise the exception, then there is no reason to make it a function in the first place.

uses Classes, XMLIntf, XMLDoc, SysUtils;

procedure ValidateXMLDoc(aXmlDoc: IXMLDocument);
var
  validateDoc: IXMLDocument;
begin
  validateDoc := TXMLDocument.Create(nil);
  validateDoc.ParseOptions := [poResolveExternals, poValidateOnParse];
  validateDoc.XML := aXmlDoc.XML;
  validateDoc.Active := true;
end;

Because validateDoc is an interface, it will be disposed of properly as the function/procedure exits, there is no need to perform the disposal yourself. If you call ValidateXmlDoc and don't get an exception then it is valid. Personally I like the first call, IsValidXMLDoc which returns true if valid or false if not (and does not raise exceptions outside of itself).



回答3:

I worked on the Miel´s solution to solve the disadventage. I open the xml twice, once to get the namespaces, and the other, after create the schema collection, to validate the file. It works for me. It seems like the IXMLDOMDocument2, once open, don´t accepts set the schemas property.

function TForm1.ValidXML2(const xmlFile: String;
  out err: IXMLDOMParseError): Boolean;
var
  xml, xml2, xsd: IXMLDOMDocument2;
  schemas, cache: IXMLDOMSchemaCollection;
begin
  xml := CoDOMDocument.Create;
  if xml.load(xmlFile) then
    begin
    schemas := xml.namespaces;
    if schemas.length > 0 then
      begin
      xsd := CoDOMDocument40.Create;
      xsd.Async := False;
      xsd.load(schemas.namespaceURI[0]);
      cache := CoXMLSchemaCache40.Create;
      cache.add(schemas.namespaceURI[1], xsd);
      xml2 := CoDOMDocument40.Create;
      xml2.async := False;
      xml2.schemas := cache;
      Result := xml2.load(xmlFile);
      //err := xml.validate;
      if not Result then
        err := xml2.parseError
      else
        err := nil;
      end;
    end;


回答4:

I have previously validated XML documents using the following code:

Uses
  Classes, 
  XMLIntf, 
  SysUtils;

Function ValidateXMLDoc(aXmlDoc: IXMLDocument): boolean;
var
  validateDoc: IXMLDocument;
begin
  validateDoc := TXMLDocument.Create(nil);

  validateDoc.ParseOptions := [poResolveExternals, poValidateOnParse];
  validateDoc.XML := aXmlDoc.XML;

  validateDoc.Active := true;
  Result := True;
end;