Best practice for metadata in an HTML document?

Posted 2019-03-09 06:04

Question:

I work on a large-scale, high-volume, public-facing web application. The successful operation of the application is very important to the business, so a number of MI tools run against it.

One of these MI tools essentially looks at the HTML that is sent to the browser for each page request. (I've simplified it quite a lot, but for the purpose of this question, it's a tool that does some analysis on the HTML.)

For this MI tool to get the data it needs, we put metadata in the head element. Currently we do it as HTML comments:

<!doctype html>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" class="">
<head>
    <!-- details = 52:AS6[rxSdsMd4RgYXJgeabsRAVBZ:0406139009] -->
    <!-- policy id = 1234567890 -->
    <!-- party id = 0987654321 -->
    <!-- email address = user@email.com -->
    <!-- error = 49 -->
    <!-- subsessionid = bffd5bc0-a03e-42e5-a531-50529dae57e3-->
    ...

The tool simply looks for a given metadata comment with a regex.

As this data is metadata, I'd like to change it to HTML meta tags because that feels semantically correct. Something like this:
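As a rough illustration of the kind of regex lookup described above (the real MI tool's code is not shown, so this is only a sketch), a minimal Python version might look like this. The pattern, the function name `extract_comment_metadata`, and the assumption that every comment has the shape `<!-- key = value -->` are all hypothetical.

```python
import re

# Assumed comment shape: <!-- key = value -->, as in the sample markup.
# Keys are limited to word characters and spaces so that ordinary
# comments without "key = value" content are ignored.
COMMENT_RE = re.compile(r"<!--\s*(?P<key>[\w ]+?)\s*=\s*(?P<value>.*?)\s*-->")


def extract_comment_metadata(html: str) -> dict:
    """Return {key: value} for every '<!-- key = value -->' comment."""
    return {m.group("key"): m.group("value") for m in COMMENT_RE.finditer(html)}


sample = """<head>
    <!-- details = 52:AS6[rxSdsMd4RgYXJgeabsRAVBZ:0406139009] -->
    <!-- policy id = 1234567890 -->
    <!-- error = 49 -->
    <!-- subsessionid = bffd5bc0-a03e-42e5-a531-50529dae57e3-->
</head>"""

metadata = extract_comment_metadata(sample)
print(metadata)
```

Note that the last sample comment has no space before `-->`; the trailing `\s*` in the pattern is there to tolerate that.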

<!doctype html>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" class="">
<head>
    <meta name="details" content="52:AS6[rxSdsMd4RgYXJgeabsRAVBZ:0406139009]" />
    <meta name="policyId" content="1234567890" />
    <meta name="partyId" content="0987654321" />
    <meta name="emailAddress" content="user@email.com" />
    <meta name="error" content="49" />
    <meta name="subsessionid" content="bffd5bc0-a03e-42e5-a531-50529dae57e3" />
    ...

This feels more semantic, and I can get the MI tool to work with it without any problem; it's just a case of changing the regexes. However, it now gives me a problem with the W3C validator. It won't validate because the meta names I'm using are not recognised. I get the error "Bad value details for attribute name on element meta: Keyword details is not registered." and it suggests I register these name values on the WHATWG wiki.
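For the tool side of this change, one alternative to rewriting the regexes is to read the meta elements with an HTML parser. The following is a minimal sketch using Python's standard `html.parser` module; the class name `MetaCollector` is made up for illustration, and nothing here is taken from the actual MI tool.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect name/content pairs from <meta> elements."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also arrive here, because
        # the default handle_startendtag delegates to handle_starttag.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name is not None and content is not None:
            self.metadata[name] = content


sample = """<head>
    <meta name="policyId" content="1234567890" />
    <meta name="emailAddress" content="user@email.com" />
    <meta name="error" content="49" />
</head>"""

collector = MetaCollector()
collector.feed(sample)
print(collector.metadata)
```

A parser-based reader does not depend on attribute order or quoting style, which a hand-written regex would have to account for.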

Whilst I could do this, it doesn't feel right. Some of my meta tags are 'generic' (such as error and emailAddress), so I could probably find an already registered name value and use that. However, most of them are industry- or organisation-specific. It feels wrong to register a public name value called subsessionid or partyId, as these are specific to my organisation and the application.

So, the question is: what is considered best practice in this case? Should I leave them as HTML comments? Should I use meta tags as above and not worry that W3C validation fails (though validation is increasingly important to the organisation)? Should I attempt to register my meta name values on the WHATWG wiki, knowing they are not very generic? Or is there another solution?

Appreciate your thoughts, cheers

Nathan


Edited to show the final solution:

The full answer I'm going with is as follows. It's based on Rich Bradshaw's answer, so his is the accepted one, but I'm including the complete markup here for completeness:

<!doctype html>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" class="">
<head>
    <meta name="application-name" content="Our app name" 
        data-details="52:AS6[rxSdsMd4RgYXJgeabsRAVBZ:0406139009]" 
        data-policyId="1234567890"
        data-partyId="0987654321"
        data-emailAddress="user@email.com"
        data-error="49"
        data-subsessionid="bffd5bc0-a03e-42e5-a531-50529dae57e3"
    />
    ...

This validates, so all boxes ticked :)
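One detail worth keeping in mind when reading these attributes back: HTML parsers lowercase attribute names, so `data-policyId` is seen as `data-policyid`. The sketch below shows this with Python's standard `html.parser`; the class name `DataAttributeCollector` is hypothetical and this is not the MI tool's actual code. A regex-based reader working on the raw markup would still see the original casing, so matching case-insensitively is probably the safer choice there.

```python
from html.parser import HTMLParser


class DataAttributeCollector(HTMLParser):
    """Collect data-* attributes from <meta name="application-name">."""

    def __init__(self):
        super().__init__()
        self.data = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and attributes.get("name") == "application-name":
            for key, value in attributes.items():
                if key.startswith("data-"):
                    # Attribute names arrive lowercased from the parser.
                    self.data[key[len("data-"):]] = value


sample = """<meta name="application-name" content="Our app name"
    data-policyId="1234567890"
    data-emailAddress="user@email.com"
    data-error="49"
/>"""

collector = DataAttributeCollector()
collector.feed(sample)
print(collector.data)
```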

Answer 1:

W3C validation is meaningless. HTML != XML, so there isn't any schema to validate it against. No browser will choke because you added a meta element with an unregistered name. If you really are worried, you could use data-* attributes on a meta element, like:

<meta data-details="52:AS6[rxSdsMd4RgYXJgeabsRAVBZ:0406139009]" data-policyId="0123456789" />

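At least then you know no future spec will give meaning to your data. Note that a meta element carrying only data-* attributes will likely still fail validation, since the spec requires one of name, http-equiv, charset, or itemprop; the final solution in the question pairs the data-* attributes with name="application-name".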

For more info read: http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#custom-data-attribute



Answer 2:

While your example may work, note that the keyword application-name is for web applications only.

For ordinary web pages that are not web applications, or if you don't want to provide an application-name, here are some alternatives:

Using data-* attributes in the head

No need for a meta element.

<!DOCTYPE html>
<html>
<head
    data-details="52:AS6[rxSdsMd4RgYXJgeabsRAVBZ:0406139009]" 
    data-policyId="1234567890"
    data-partyId="0987654321"
    data-emailAddress="user@email.com"
    data-error="49"
    data-subsessionid="bffd5bc0-a03e-42e5-a531-50529dae57e3">
</head>

Using Microdata

You could create a vocabulary, but that’s not required for local use.

<!DOCTYPE html>
<html>
<head itemscope>
  <meta itemprop="details" content="52:AS6[rxSdsMd4RgYXJgeabsRAVBZ:0406139009]" />
  <meta itemprop="policyId" content="1234567890" />
  <meta itemprop="partyId" content="0987654321" />
  <link itemprop="emailAddress" href="mailto:user@email.com" /> <!-- or use a meta element if you don’t want to provide a full URI with "mailto:" scheme -->
  <meta itemprop="error" content="49" />
  <meta itemprop="subsessionid" content="bffd5bc0-a03e-42e5-a531-50529dae57e3" />
</head>

Using data in a script

The script element can be used for data blocks. You can choose any format that suits your needs. Example with plain text:

<!DOCTYPE html>
<html>
<head>
  <script type="text/plain">
    details = 52:AS6[rxSdsMd4RgYXJgeabsRAVBZ:0406139009]
    policyId = 1234567890
    partyId = 0987654321
    emailAddress = user@email.com
    error = 49
    subsessionid = bffd5bc0-a03e-42e5-a531-50529dae57e3
  </script>
</head>
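To show how a tool might read such a plain-text data block back, here is a minimal Python sketch using the standard `html.parser` module. The class name `PlainDataBlockParser` and the `key = value` line format are assumptions taken from the example above, not a fixed convention.

```python
from html.parser import HTMLParser


class PlainDataBlockParser(HTMLParser):
    """Read 'key = value' lines from <script type="text/plain"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_block = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "text/plain":
            self._in_block = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_block = False

    def handle_data(self, data):
        # Script content may arrive in more than one chunk, so accumulate it.
        if self._in_block:
            self._chunks.append(data)

    def metadata(self):
        result = {}
        for line in "".join(self._chunks).splitlines():
            # Split on the first "=" only, so values may contain "=".
            key, separator, value = line.partition("=")
            if separator:
                result[key.strip()] = value.strip()
        return result


sample = """<head>
  <script type="text/plain">
    policyId = 1234567890
    emailAddress = user@email.com
    error = 49
  </script>
</head>"""

parser = PlainDataBlockParser()
parser.feed(sample)
print(parser.metadata())
```

If the data grows more structured, a JSON data block parsed with a JSON library would be a natural variation on the same idea.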


Answer 3:

What if you use the data-* format to add a custom attribute, something like data-type or data-name, and either omit the real name attribute or set it to the same value, such as "abstract", on every element? (I don't know whether the validator will complain about repeated meta names.)

<meta data-name="details" content="52:AS6[rxSdsMd4RgYXJgeabsRAVBZ:0406139009]" />

You could then refer to that data-name attribute to work with your metadata.

http://html5doctor.com/html5-custom-data-attributes/



Answer 4:

Either option would technically work, although the solution could come down to how your organisation feels about page validation.

As you say, adding information into custom metadata tags will invalidate your markup.

For my organisation, page validation is part of technical accessibility and is considered very important. Doing anything that would prevent pages from validating would not be allowed.

I wouldn't attempt to register new metadata names and values, as these are specific to your organisation and not for public use.

I would probably leave this information as HTML comments if that is already working for your organisation.