I have a pedantic argument that needs resolution.
As a proper Kool-Aid drinker when it comes to HTML, I'm all about semantic markup, so of course I hate to see tables where they don't belong. The rule of thumb is that you should only use tables for "tabular data", but it has come to my attention that this phrase is really poorly defined. I wanted to mark up the following data as a table, but others at my office disagreed that a table would be semantically correct in this case (as opposed to a dl or ul, etc.):
------------------
|  SomeEmployee  |
|----------------|
| Field  | val   |
| Field  | val   |
| Field  | val   |
| Field  | val   |
------------------
Asking around the office (and the interwebs), I got some of the following answers on what makes data "tabular":
- "Anything that you would put in a spreadsheet" (I've seen entire design mockups created in spreadsheets, so this seems somewhat lacking to me)
- Data that would map well to a database table (e.g. row and column data, specifically)
- "Text, preformatted text, images, links, forms, form fields, other tables, etc." (thanks W3C, that's really helpful)
And so on and so forth. None of these seem to be canonical definitions, and none of them provides a clear dividing line to base decisions on. So, I ask you, my clever compatriots: how should we define tabular data?
If at all possible, please cite the sources for your answers to prevent a string of "well I think" answers.
Thanks!
Joe
Wikipedia has some down-to-earth rules in its internal article-writing guidelines. They're far from an exhaustive definition but work well in real-world use IMO.
- When tables are appropriate
- When tables may not be appropriate
The whole definition is worth reading, but one paragraph strikes me as especially nice:
Before you format a list in table form, consider whether the information will be more clearly conveyed by virtue of having rows and columns. If so, then a table is probably a good choice. If there is no obvious benefit to having rows and columns, then a table is probably not the best choice.
The problem with your example is that it is in two columns.
At that limit, the distinction between using a table tag and a dl tag blurs somewhat. The only way to distinguish them is that in a dl, the dt should be a label and the dd should hold "data". If the "data" you put in the dt is NOT a label (aka metadata) for the dd, then you should use a table. Your example is a sort of dictionary, so I would definitely use a dl tag. I would use a table where the data in each of the two columns represents INDEPENDENT attributes of one thing. But like I say, the distinction blurs.
In your "table", the Field column looks to me more like metadata. So the rule there would be: always use the most semantically specific tag. In this case, a DL.
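As a rough sketch of what that could look like for the example in the question: the heading element and the placeholder Field/val text are assumptions carried over from the ASCII mockup, not a prescribed structure.

```html
<!-- Hypothetical markup for the employee record from the question. -->
<!-- Each dt is the label (metadata); each dd holds the value it describes. -->
<h3>SomeEmployee</h3>
<dl>
  <dt>Field</dt>
  <dd>val</dd>

  <dt>Field</dt>
  <dd>val</dd>

  <dt>Field</dt>
  <dd>val</dd>

  <dt>Field</dt>
  <dd>val</dd>
</dl>
```

The label/value pairing is carried by the dt/dd elements themselves, which is exactly what a ul or ol cannot express.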
What I would NEVER do is use a ul or ol tag for this. Quite simply, they are for single column lists. Also, there is no requirement for each row of the list to be data, ie, attributes that represent a given thing. Content that goes in the UL or OL tags does not have metadata associated with it: the UL and OL tags don't provide any markup for metadata, unlike DL tags and TABLE tags.
Moving on to the more generic aspect of your question, the following applies:
As with true love, you know tabular data when you see it.
And as with anything that you know when you see it, the precise definition would fill a volume of indigestible philosophy. For a start, you would need to define what you mean by data before you even ask the question.
With those caveats in mind, and just for the mental exercise, here we go:
A.- TABULAR DATA MUST BE STRUCTURED DATA: The data must be either hierarchical or relational. By this I mean that it should be POSSIBLE to cast the data into one or the other structure (or both).
This allows the following rules to be derived. They pretty much lock down what CAN go in a table of data, thereby answering your question and complying with W3C requirements for use of the table tag:
ATOMICITY: Each ROW of data must represent an individual UNIT of the same thing. Ie, the data in each CELL of the table, MUST be an attribute of what each ROW of data represents. This quickly tells you why some things should go in UL tags: they are UNSTRUCTURED LISTS, ie, each row can refer to very different things, as opposed to STRUCTURED DATA, where each row ALWAYS represents a different instance of a class of thing.
CELLS: It should be viable to put each attribute of the thing in a td tag (aka cell). The content of each cell must be data; layout elements are not allowed. Note that this excludes accessory things like column headers, which should not use td tags and should use the functionally more appropriate th tag.
ROW DEFINITION OR 'FORMAT' OF THE DATA: The data in each row (tr) must map to a predefined 'format'. By format, I mean a list of attributes that describe a given thing. In what follows, the terms attribute and column are used interchangeably.
COLUMN ORDER: This format must be strictly ordered. Ie, the order of the columns must not change from row to row.
ROW INDEPENDENCE: The data in each row must not depend on the data in other rows. Nor should it depend on the existence of any other row. The data in each row is only dependent on the 'format'.
ROW INTEGRITY: The data in each row must comply with the constraints defined by the format.
RELATED DATA: An extension to the above rules is required to account for related and hierarchical data. This is easily done by extending the format to allow a column type that is itself a table.
Nowhere in the above rules does it state that the "columns" must be organised horizontally. This allows the rows to have more "structure" and still comply with the preceding rules. For example, a sub-table of "child" data is allowed to appear BELOW the data corresponding to its "parent" record. Or not. Or a large Memo field could be displayed below the other cells. Just because it is tabular data does not mean it cannot have layout.
IMAGE DATA: A product image in a catalogue is data. However, in the case of images, the column constraint is that the image MUST relate to the other "columns" of the "format". This knocks out, for example, a transparent gif that just sets the physical height and/or width of a row. As it has no intrinsic relationship to the other data in the row, it does not fulfil the preceding rules.
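To make the rules a little more concrete, here is a minimal sketch of a catalogue table that tries to follow them. The product names, prices, and image file names are invented purely for illustration.

```html
<!-- Hypothetical catalogue: every row is one instance of the same kind of thing. -->
<table>
  <tr>
    <!-- Column headers describe the format, so they use th rather than td. -->
    <th scope="col">Photo</th>
    <th scope="col">Product</th>
    <th scope="col">Price</th>
  </tr>
  <tr>
    <!-- The image is data: it depicts the product described by this row. -->
    <td><img src="widget.png" alt="Photo of the widget"></td>
    <td>Widget</td>
    <td>9.99</td>
  </tr>
  <tr>
    <!-- Same attributes, same order, no dependence on the row above. -->
    <td><img src="gadget.png" alt="Photo of the gadget"></td>
    <td>Gadget</td>
    <td>14.50</td>
  </tr>
</table>
```

A spacer gif in any of these cells would break the pattern, because it would not be an attribute of the product the row represents.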
B.- TABULAR DATA DOES NOT INCLUDE UNSTRUCTURED DATA. This is more of a corollary, but addresses the "evils" of the table tag:
- Anything that is layout, eg, related to layout of content and the page, as opposed to the content itself, is NOT tabular data.
The above does little more than use bullshit to polish the following rule that you yourself stated in your question:
Anything that goes in a spreadsheet or a database table, whilst excluding uses of spreadsheets that do not relate to data.
What isn't bullshit is the obvious, yet potentially vague statement that tabular data is nothing other than structured data. By structured, I mean that it has to constitute a "strongly typed" collection.
Again, I note that this definition excludes the text for column headers, captions and footers. In other words, tabular data is what goes within the tbody tag. Everything else within the table tag is metadata that refers to, but is not part of, the tabular data.
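One way to picture that split, as a sketch only: the caption text, column names, and values below are made up, and the element order shown is the one current HTML accepts.

```html
<table>
  <!-- Metadata: refers to the data but is not part of it. -->
  <caption>Employees by office (hypothetical example)</caption>
  <thead>
    <tr>
      <th scope="col">Name</th>
      <th scope="col">Office</th>
    </tr>
  </thead>

  <!-- The tabular data itself, by the definition above. -->
  <tbody>
    <tr>
      <td>SomeEmployee</td>
      <td>val</td>
    </tr>
    <tr>
      <td>OtherEmployee</td>
      <td>val</td>
    </tr>
  </tbody>

  <!-- Metadata again: a footer about the rows, not another row. -->
  <tfoot>
    <tr>
      <td colspan="2">2 employees listed</td>
    </tr>
  </tfoot>
</table>
```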
The above rules define data as something that complies with a column constraint.
So I guess now you need a definition of a column constraint. But then again, you know one when you see one...