XML Attributes vs Elements [duplicate]

One of the better thought-out element vs attribute arguments comes from the UK GovTalk guidelines. These define the modelling techniques used for government-related XML exchanges, but they stand on their own merits and are worth considering.

Schemas MUST be designed so that elements are the main holders of information content in the XML instances. Attributes are more suited to holding ancillary metadata – simple items providing more information about the element content. Attributes MUST NOT be used to qualify other attributes where this could cause ambiguity.

Unlike elements, attributes cannot hold structured data. For this reason, elements are preferred as the principal holders of information content. However, allowing the use of attributes to hold metadata about an element's content (for example, the format of a date, a unit of measure or the identification of a value set) can make an instance document simpler and easier to understand.

A date of birth might be represented in a message as:

 <DateOfBirth>1975-06-03</DateOfBirth>

However, more information might be required, such as how that date of birth has been verified. This could be defined as an attribute, making the element in a message look like:

<DateOfBirth VerifiedBy="View of Birth Certificate">1975-06-03</DateOfBirth>

The following would be inappropriate:

<DateOfBirth VerifiedBy="View of Birth Certificate" ValueSet="ISO 8601" Code="2">1975-06-03</DateOfBirth>

It is not clear here whether the Code is qualifying the VerifiedBy or the ValueSet attribute. A more appropriate rendition would be:

 <DateOfBirth>    
   <VerifiedBy Code="2">View of Birth Certificate</VerifiedBy>     
   <Value ValueSet="ISO 8601">1975-06-03</Value>
 </DateOfBirth>

Answer 3:

Personally I like using attributes for simple single-valued properties. Elements are (obviously) more suitable for complex types or repeated values.

For single-valued properties, attributes lead to more compact XML and simpler addressing in most APIs.
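
As a rough illustration of the addressing point, here is a minimal sketch using Python's standard xml.etree.ElementTree. The two sample documents and the variable names are invented for the comparison and are not taken from any real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same person modelled both ways.
attr_doc = ET.fromstring('<person name="Rory" age="30" />')
elem_doc = ET.fromstring('<person><name>Rory</name><age>30</age></person>')

# Attribute style: a direct, dictionary-like lookup on the element itself.
age_from_attr = attr_doc.get("age")

# Element style: look up a child by path, then read its text.
age_from_elem = elem_doc.findtext("age")

print(age_from_attr, age_from_elem)
```

Both lookups are short in this API; the difference tends to show up more in APIs where child elements must be located and then unwrapped in separate steps.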

Answer 4:

It's largely a matter of preference. I use elements for grouping and attributes for data where possible, as I find this more compact than the alternative.

For example, I prefer...

<?xml version="1.0" encoding="utf-8"?>
<data>
    <people>
        <person name="Rory" surname="Becker" age="30" />
        <person name="Travis" surname="Illig" age="32" />
        <person name="Scott" surname="Hanselman" age="34" />
    </people>
</data>

...instead of...

<?xml version="1.0" encoding="utf-8"?>
<data>
    <people>
        <person>
            <name>Rory</name>
            <surname>Becker</surname>
            <age>30</age>
        </person>
        <person>
            <name>Travis</name>
            <surname>Illig</surname>
            <age>32</age>
        </person>
        <person>
            <name>Scott</name>
            <surname>Hanselman</surname>
            <age>34</age>
        </person>
    </people>
</data>

However, if I have data that does not fit easily within, say, 20-30 characters, or that contains many quotes or other characters that need escaping, then I'd say it's time to break out the elements, possibly with CDATA blocks.

<?xml version="1.0" encoding="utf-8"?>
<data>
    <people>
        <person name="Rory" surname="Becker" age="30" >
            <comment>A programmer who's interested in all sorts of misc stuff. His blog can be found at http://rorybecker.blogspot.com and he's on Twitter as @RoryBecker</comment>
        </person>
        <person name="Travis" surname="Illig" age="32" >
            <comment>A cool guy who has helped me out with all sorts of SVN information</comment>
        </person>
        <person name="Scott" surname="Hanselman" age="34" >
            <comment>Scott works for MS and has a great podcast available at http://www.hanselminutes.com </comment>
        </person>
    </people>
</data>
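
To make the escaping concern concrete, here is a small sketch, again using Python's xml.etree.ElementTree. The comment string is a made-up example, and the sketch shows escaping only, not CDATA sections.

```python
import xml.etree.ElementTree as ET

# Hypothetical comment text containing quotes and an ampersand.
comment = 'He said "hello" & left'

# Stored as an attribute: the quotes and the ampersand are both escaped.
as_attr = ET.Element("person", {"comment": comment})
attr_xml = ET.tostring(as_attr, encoding="unicode")

# Stored as element text: the quotes can stay literal, the ampersand is escaped.
as_elem = ET.Element("person")
ET.SubElement(as_elem, "comment").text = comment
elem_xml = ET.tostring(as_elem, encoding="unicode")

print(attr_xml)
print(elem_xml)

# Both forms parse back to the original string.
attr_roundtrip = ET.fromstring(attr_xml).get("comment")
elem_roundtrip = ET.fromstring(elem_xml).findtext("comment")
```

Either form survives a round trip; the attribute version is simply harder to read in the raw file once the text contains quotes.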

Answer 5:

As a general rule, I avoid attributes altogether. Yes, attributes are more compact, but elements are more flexible, and flexibility is one of the most important advantages of using a data format like XML. What is a single value today can become a list of values tomorrow.

Also, if everything's an element, you never have to remember how you modeled any particular bit of information. Not using attributes means you have one less thing to think about.
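
The "single value today, list tomorrow" point can be sketched as follows, assuming a hypothetical person document with an email child element. The element name, the addresses and the helper function are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: version 1 has one email, version 2 has two.
v1 = ET.fromstring("<person><email>a@example.com</email></person>")
v2 = ET.fromstring(
    "<person><email>a@example.com</email><email>b@example.com</email></person>"
)

def emails(person):
    # findall returns every matching child, so one value or many both work.
    return [e.text for e in person.findall("email")]

print(emails(v1))
print(emails(v2))
```

The same reader code handles both versions. Had the email been an attribute, moving to several values would have meant changing the format, for example by packing a delimited list into one attribute value.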

Answer 6:

Check out Elements vs. attributes by Ned Batchelder.

Nice explanation and a good list of the benefits and disadvantages of Elements and Attributes.

He boils it down to:

Recommendation: Use elements for data that will be produced or consumed by a business application, and attributes for metadata.

Important: Please see @maryisdead's comment below for further clarification.

Answer 7:

The limitations on attributes tell you where you can and can't use them: the attribute names must be unique, their order cannot be significant, and both the name and the value can contain only text. Elements, by contrast, can have non-unique names, have significant ordering, and can have mixed content.

Attributes are usable in domains where they map onto data structures that follow those rules: the names and values of properties on an object, of columns in a row of a table, of entries in a dictionary. (But not if the properties aren't all value types, or the entries in the dictionary aren't strings.)
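
These constraints can be checked directly with a parser. The following is a minimal sketch using Python's xml.etree.ElementTree; the person and phone names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Repeating an attribute name makes the document ill-formed.
try:
    ET.fromstring('<person phone="111" phone="222" />')
    duplicate_attr_ok = True
except ET.ParseError:
    duplicate_attr_ok = False

# Repeating an element name is fine, and document order is preserved.
person = ET.fromstring("<person><phone>111</phone><phone>222</phone></person>")
phones = [p.text for p in person.findall("phone")]

# Attributes come back as a plain string-to-string mapping.
row = ET.fromstring('<person name="Rory" age="30" />')
as_dict = dict(row.attrib)

print(duplicate_attr_ok, phones, as_dict)
```

The attribute mapping is the dictionary-like shape described above: unique string keys and string values.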

Answer 8:

My personal rule of thumb: if an element can contain only one of that thing, and it's atomic data (id, name, age, type, etc.), it should be an attribute; otherwise, an element.

Answer 9:

I tend to use elements when it's data that a human reader would need to know and attributes when it's only for processing (e.g. IDs). This means that I rarely use attributes, as the majority of the data is relevant to the domain being modeled.

Answer 10:

Here is another strategy that can help distinguish elements from attributes: think of objects and keep MVC in mind.

Objects can have members (object variables) and properties (members with setters and getters). Properties are highly useful in an MVC design because they allow a change-notification mechanism.

If this is the direction taken, attributes will be used for internal application data that cannot be changed by the user; classic examples would be ID or DATE_MODIFIED. Elements will therefore be used for data that can be modified by users.

So the following would make sense, considering that the librarian first adds a book item (or a magazine) and can then edit its name, author, ISBN, etc.:

<?xml version="1.0" encoding="utf-8"?>
<item id="69" type="book">
    <authors count="1">
        <author>
            <name>John Smith</name>
        </author>
    </authors>
    <ISBN>123456790</ISBN>
</item>