Comments inside HTML/SGML/XML/DTD declarations

Posted 2019-09-12 03:06

Question:

In the W3C HTML 4.01 DTDs and earlier, inline comments are frequently used within declarations.

For example, the HTML 2.0 Strict DTD has:

<!ENTITY % HTML.Version
    "-//IETF//DTD HTML 2.0 Strict//EN"

        -- Typical usage:

            <!DOCTYPE HTML PUBLIC
        "-//IETF//DTD HTML Strict//EN">
        <html>
        ...
        </html>
    --
    >

where the HTML.Version parameter entity declaration contains a comment delimited by a pair of double hyphens (--).

However, DTD validators seem to flat out reject these sorts of internal comments and throw an error.

Are the validators wrong, or are the W3C DTDs not well-formed?


Answer:

After looking into it further, this appears to come down to differences between the SGML and XML specifications.

Essentially, SGML lets a comment, delimited by -- on each side, appear inside a markup declaration <! ... >, whereas XML only allows comments as independent constructs that begin with <!-- and end with -->.
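To make the difference concrete, here is a minimal sketch using Python's standard-library expat bindings, which implement an XML parser (not an SGML one). The one-element document, the entity name v, and the helper parses_as_xml are made up for illustration; only the placement of the comment differs between the two inputs.

```python
import xml.parsers.expat


def parses_as_xml(document: str) -> bool:
    """Return True if expat accepts the document, False on a parse error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(document, True)
        return True
    except xml.parsers.expat.ExpatError:
        return False


# SGML style: the comment sits inside the declaration, between -- and --.
sgml_style = '<!DOCTYPE r [ <!ENTITY % v "x" -- a comment -- > ]><r/>'

# XML style: the comment is a separate <!-- ... --> construct.
xml_style = '<!DOCTYPE r [ <!-- a comment --> <!ENTITY % v "x"> ]><r/>'

if __name__ == "__main__":
    print("SGML-style accepted:", parses_as_xml(sgml_style))
    print("XML-style accepted:", parses_as_xml(xml_style))
```

An XML parser should reject the first form and accept the second, which matches the behaviour described in the question.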

Because HTML up to version 4.01 was based on SGML, comments within declarations were allowed and were used by the official DTDs.

However, most DTD validators seem to check only for compliance with the simpler XML specification and therefore choke on intra-declaration comments, reporting errors.
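If the goal is just to get an XML-oriented tool past these comments, one possible workaround is to strip them before feeding the DTD in. The function below is a rough sketch under several assumptions, not a real SGML parser: the name strip_declaration_comments is hypothetical, it assumes quoted literals and comments are properly terminated, it leaves marked sections (<![ ... ]]>) untouched, and it does nothing about other SGML-only syntax such as tag-omission flags, so the result is not guaranteed to be a valid XML DTD.

```python
def strip_declaration_comments(dtd: str) -> str:
    """Remove SGML-style '-- ... --' comments that appear inside <! ... >
    declarations, leaving standalone <!-- ... --> comments as they are."""
    out = []
    i, n = 0, len(dtd)
    in_decl = False
    while i < n:
        if not in_decl:
            if dtd.startswith("<!--", i):
                # Standalone comment: copy it through unchanged.
                end = dtd.find("-->", i + 4)
                end = n if end == -1 else end + 3
                out.append(dtd[i:end])
                i = end
            elif dtd.startswith("<!", i) and not dtd.startswith("<![", i):
                # Start of an ordinary markup declaration.
                in_decl = True
                out.append("<!")
                i += 2
            else:
                out.append(dtd[i])
                i += 1
        else:
            ch = dtd[i]
            if ch in "\"'":
                # Quoted literal: copy verbatim so a '--' inside it survives.
                end = dtd.find(ch, i + 1)
                end = n if end == -1 else end + 1
                out.append(dtd[i:end])
                i = end
            elif dtd.startswith("--", i):
                # In-declaration comment: skip up to and including the closing '--'.
                end = dtd.find("--", i + 2)
                i = n if end == -1 else end + 2
            else:
                if ch == ">":
                    in_decl = False
                out.append(ch)
                i += 1
    return "".join(out)


sample = (
    '<!ENTITY % HTML.Version\n'
    '    "-//IETF//DTD HTML 2.0 Strict//EN"\n'
    '    -- Typical usage: <html> ... </html> --\n'
    '    >'
)
cleaned = strip_declaration_comments(sample)
```

Here cleaned keeps the entity name and its literal value but no longer contains the text between the two -- delimiters.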
