How to load an XHTML file into an XElement using a

Posted 2019-07-14 22:20

Question:

I am trying to load an XHTML file into a LINQ XElement. However, I am running into problems with the resolver. The problem has to do with the following DOCTYPE declaration:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

I have a custom XmlUrlResolver with an overridden GetEntity which converts links such as http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd to a local resource stream. This works fine for almost the entire XHTML DTD. The only one I am unable to actually resolve is the Uri "-//W3C//DTD XHTML 1.0 Transitional//EN" and I am not sure what I should be doing with it.

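    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        // Map the last path segment (e.g. "xhtml1-transitional.dtd") to an embedded resource name.
        var segments = absoluteUri.Segments;
        var resourceName = "ePub.DTD." + segments[segments.Length - 1];
        if (_resources.Contains(resourceName))
        {
            // Serve the DTD or entity file from the assembly instead of the network.
            return Assembly.GetExecutingAssembly().GetManifestResourceStream(resourceName);
        }
        // Anything without a matching resource falls through to the default resolver.
        return base.GetEntity(absoluteUri, role, ofObjectToReturn);
    }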

As the code above shows, anything I cannot resolve is handled by the default XmlUrlResolver. That includes the identifier starting with -//W3C/. The base method, however, throws a DirectoryNotFoundException for it. If I continue past the exception, the XElement loads just fine. If I instead return an empty stream, an error is thrown while the XHTML is being loaded into the XElement.

Does anyone have a clue about how to handle such a PUBLIC identifier with a custom XmlUrlResolver?

Answer 1:

Answer stolen from Microsoft boards, somewhere:

This behavior is by design. When both the public ID and the system ID are specified in the DOCTYPE declaration, the XmlReader first checks whether XmlResolver.GetEntity understands the public identifier ("-//W3C//DTD XHTML 1.1//EN"). So it calls GetEntity with the public ID, and if the resolver does not understand it (as is the case with XmlUrlResolver), the resolver throws an exception. The XmlReader catches the exception and calls GetEntity again, this time with the system identifier ("http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd").

Thanks, -Helena Kotas, System.Xml Developer

Posted by Microsoft on 10-5-2006 at 17:34
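
Building on the explanation above, one way to avoid the first-chance exception might be to teach the resolver about the public identifier. The following is only a sketch, not something from the original thread: it assumes the reader passes the public ID through ResolveUri before calling GetEntity, and the names PublicIdAwareResolver, KnownPublicIds and XhtmlLoader are made up for illustration. DtdProcessing.Parse requires .NET 4 or later.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical resolver: maps known public identifiers to their system URLs
// so that the default file-path resolution (and its exception) is skipped.
public class PublicIdAwareResolver : XmlUrlResolver
{
    // Only the identifier from the question is listed; add others as needed.
    private static readonly Dictionary<string, Uri> KnownPublicIds =
        new Dictionary<string, Uri>(StringComparer.Ordinal)
        {
            { "-//W3C//DTD XHTML 1.0 Transitional//EN",
              new Uri("http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd") },
        };

    public override Uri ResolveUri(Uri baseUri, string relativeUri)
    {
        Uri systemUri;
        // Assumption: the reader hands the public ID to this method as relativeUri.
        if (relativeUri != null && KnownPublicIds.TryGetValue(relativeUri, out systemUri))
        {
            return systemUri;
        }
        // Everything else is resolved the default way.
        return base.ResolveUri(baseUri, relativeUri);
    }
}

// Hypothetical helper showing how a resolver can be wired into XElement loading.
public static class XhtmlLoader
{
    public static XElement Load(TextReader input, XmlResolver resolver)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse, // allow the DOCTYPE to be processed
            XmlResolver = resolver
        };
        using (var reader = XmlReader.Create(input, settings))
        {
            return XElement.Load(reader);
        }
    }
}
```

As written, the sketch still fetches the system URL over the network. Combining the ResolveUri override with the GetEntity override from the question would let the mapped URL be served from the embedded resource, since its last segment is then xhtml1-transitional.dtd.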