I am creating a botanical database where the plants will be organized by their taxonomy:
Life Domain Kingdom Phylum Class Order Family Genus Species
I was considering using the example put forth by the article Managing Hierarchical Data in MySQL, however it is adding the above list as records inside the table....and I'm not sure if that is the best thing to do since I will be having multiple species per genus and multiple genus per family and so on. What would you suggest is the best way to approach this problem. Thanks in advance.
I worked with similar data, and I made it in 2 parts. In PostgreSQL syntax.
First is taxonomy structure (Family, Genus, Species, ...):
Second is description and storing of botanical data:
I'm not sure I really buy into that article. Graph structures would be needed when the categories itself are mutable. Such as, all the sudden taxonomists decided to add three new levels between genus and species, and so on.
From the article:
Actually, its exactly what it is intended for:
http://en.wikipedia.org/wiki/Hierarchical_database_model
I would first write a view that joined all of your tables so that you would have these as your columns:
Now you can query that view any way you like and not have to worry about any joins. Easy
:)
There are several ways of representing hierarchical data in a relational database, albeit a NoSQL solution might be easier to work with as @duffymo mentioned. So assuming an RDBMS, see my question on the topic for an enumeration of a half dozen possibilities. For your situation, I would lead with a materialized path to make seeing the family tree easy. If the hierarchy changes regularly I would probably also model as an adjacency list and update the materialized path using a trigger.
You can download complete taxonomy data from http://itis.gov and the data is updated more or less monthly. The data they provide includes a Materialized Path -- every species in the database has a string of all the levels above it, like a breadcrumbs string or a filesystem path.
I used this data to design a demo in my presentation Models for Hierarchical Data. I converted the materialized path data into Closure Table.
It sounds more like a graph. I'd wonder if NEO4J would be a better choice.