Many database systems don't allow comments or descriptions of tables and fields, so how do you go about documenting the purpose of a table/field apart from the obvious of having good naming conventions?
(Let's assume for now that "excellent" table and field names are not enough to document the full meaning of every table, field and relationship in the database.)
I know many people use UML diagrams to visualize the database, but I have rarely—if ever—seen a UML diagram including field comments. However, I have good experience with using comments inside .sql
files. The downside to this approach is that it requires the .sql
files to be manually kept up-to-date as the database structure changes over time—but if you do, you can also have it under version control.
Some other techniques I have seen are separate document describing database structure and relationships and manually maintained comments inside ORM code or other database-mapping code.
How have you solved this problem in the past? What methods exists and what are the various pros and cons associated with them? How you would you like this solved in "a perfect world"?
Update
As others have pointed out, most of the popular SQL engines do in fact allow comments, which is great. Oddly enough, people don't seem to be using these features much. At least not on the projects I have been involved with in the past.
It's a really simplistic approach, but I use a pair of wiki pages: one with the mysqldump of the database, and one written in a slightly more English-like format.
For the projects I've worked on, that's been sufficient (through the dozens of tables level). I don't know how well it might scale to larger projects (say in the hundreds of tables), but it's been good so far.
I've recently turned to writing markdown documentation, which includes linking to individual table
.sql
files (where tables and fields are hopefully intuitively named with plenty of comments).I keep the individual table schema's in version control, using the following command:
mysqldump --no-data --tab=./tables dbname
The schema for a single table allows you to see comments, indexes, unique keys etc. so is fairly self explanatory (well that is the idea at least).
The master markdown documentation has hyperlinks such as user table peppered throughout, so the reader can easily go to the different tables.
I comment my databases as I comment my programs. By writing good (I hope) comments in the source code (the SQL file containing the DDL instructions).
Using SQL COMMENT is another possibility. The good thing with them is that they are always with your objects, are backed up with them, etc. The bad thing is that they are more limited (for instance in length).
Since we use Rational Software Architect, we use its data discovery features to document our databases and then annotate them from there.
SQL Server has extended properties that can take care of this.
This article describes how do set them up in SQL Sever http://www.developer.com/db/article.php/3677766
MSDN Reference
It can be used in conjunction with RedGate SQL Doc to create a nice Data dictionary.
At one point I wrote a basic SQL parser that would parse CREATE TABLE statements and strip out specially formatted comments. These were then post-processed into LaTeX source and rendered to PDF. This was inspired by Javadoc and was used to create the documentation for This product. Subsequently a data dictionary feature was built into the warehouse manager and a modified version of the LaTeX generator was used to render the data dictionary from the warehouse manager.
On another project I used Visio - the version that comes with Visual Studio Enterprise Architect will forward engineer a database. The SQL so generated had the table and column comments rendered in comment strings that were fairly straightforward to parse. The tool I wrote generated MIF files that were be included into a spec document built with FrameMaker.
If you have a repository tool such as Powerdesigner you can maintain data models in it and get repository reports out that include the documentation you have entered. If you need deeper integration of your data dictionary with functional specifications (Quite useful for data warehouse systems where the ETL is complex and involves significant computation of derived values) you can still extract the metadata and write a utility to generate something that will integrate the data dictionary into a specification document. This also allows cross-referencing between data dictionary items and other specification documents and generation of indexes that cover the data dictionary definitions and related documentation such as a specification of how something is calculated with examples.