I thought it was about time to have a look at OO databases and decided to use db4o for my next little project - a small library.
Consider the following objects: Book, Category.
A Book can be in 0-n categories and a Category can be applied to 0-m Books.
My first thought is to have a joining object such as BookCatecory but after a bit of Googling I see that this is not appropriate for 'Real OO'.
So another approach (recommended by many) is to have a list in both objects: Book.categories and Category.books. One side handles the relationship: Book.addCategory adds Category to Book.categories and Book to Category.books. How to handle commits and rollbacks when 2 objects are been altered within one method call?
What are your thoughts? The second approach has obvious advantages but, for me at least, the first 'feels' right (better normed).
If you use object database you don't need to care how relations are stored in database. You define classes and relationships between them. Please read the reference guided to your database. Examples of relationships:
n:n attribute, referencing from the parent
------------------------------------------------------------------
class Person {
List addresses;
}
class Address {
}
n:n attribute, referencing from the child
------------------------------------------------------------------
class Person {
}
class Address {
List persons
}
n:n attribute, bidirectional references
------------------------------------------------------------------
class Person {
List addresses;
}
class Address {
List persons
}
There are really only two ways I can think of to solve this problem, both of which you've mentioned. Personally, I would go with the first approach (creating a mapping object as an OO entity). This prevents you from keeping redundant information around and having to synchronize; it also means that if the association ends up having fields of its own (the date that the book was assigned to that category, let's say), they can be incorporated easily. We use this approach for a variety of associations in our system.
The OO entities would look like:
BookCategory {
Book book
Category category
}
Book {
Collection <BookCategory> categories
}
Category {
Collection <BookCategory> categories
}
Here you have to keep the relation object and the two collections in synch; however, the collections are optional in this case. Typically you could get the same information with an ORM query, something like:
select b.book from BookCategory b where b.category = MyCategory
The alternative is to have a setup like:
Book {
Collection<Category> categories
}
Category {
Collection<Books> books
}
If your ORM/DB tool automatically maintains the associations, this is fine; otherwise, you are stuck updating both collections. (In Hibernate, one side will have the property: inverse=true on the mapping; this side is not updated, so strictly speaking it doesn't need to be maintained. This seems to me like bad practice, though.)
If you typically only access the relation one way (for example, getting all of the books in a category), you could eliminate the collection on other side; then I think you would have to work around the ORM tool and use a native query in order to access the relationship from the other direction.
We use Hibernate (a java-based Object Relational Mapping tool) on our project; the Hibernate docs are a good reference for OO/relational design problems, though you may have to spend a little time learning Hibernate to make them useful:
http://docs.jboss.org/hibernate/stable/core/reference/en/html_single/#collections-ofvalues
HTH!
I think you're just a little hung up on the relational db way of thinking. Lists in each object is the right OO thing to do. Commits and rollbacks are no problem, they happen in a transaction that commits everything or rolls back everything.
In a pure OO database such as GemStone the objects themselves have collections of references to other objects. When the object is referenced from the application the OODBMS generates a proxy that wraps the object. The schema for this is just the persisted object and its collection of references to the objects it refers to. The OODBMS does not necessarily need a link entity.
With an O/R mapping layer (assuming it is clever enough to do M:M relationships) the M:M relationship is manifested as a collection of subsidiary references on the object itself which the O/R mapper resolves to the link entity behind the scenes. Not all O/R mappers do this, so you may have a separate link object.
Do you have any particular reason you wanted to use an ODBMS? For simple data structures (such as categorizing books) you generally won't find any advantage in ODBMS over RDBMS, and in fact will have an easier time working in the much-more-standardized world of RDBMS. ODBMS has very tangible advantages when you are working with complex data types or literal persistence/storage of dynamic objects. ODBMS also is cited as being much faster and more scalable than RDBMS, though I can offer little insight into this myself. Here are a couple pages that discuss RDBMS vs. ODBMS, however:
Whatever Happened to Object-Oriented Databases
Object-Oriented Database vs. Object-Rleational Database (SO)
I would avoid data duplication because then you run into all kinds of problems with merging the differences.
the trick to this is references.
the result is that I would have each object contain a collection of references to the other object type as well as having an independent collection of the other objects.
The matching table is a relational concept, unless that intermediary connecting class may have properties that are not attributable to either of the objects. It is there as it enables queries to be written in a powerful manner as it reduces the relation to 2 one to many relations and greatly reduces data duplication. If you did this in a relation database without the matching table then things would get evil very quickly - how would an update operate? Personally i find the attraction of oo databases to be stepping away from this
the way that i would tie all the objects together is via events in code to some kind of transaction handler to allow the caching of objects states. so rather than objects manipulating each others properties they request a change through the handler and await the result in a callback.