I have four tables:
CREATE TABLE entities (
    id   INTEGER,
    name VARCHAR(255)
);
CREATE TABLE users (
    id    INTEGER,      -- FK to entities
    email VARCHAR(255)
);
CREATE TABLE groups (
    id INTEGER          -- FK to entities
);
CREATE TABLE group_members (
    group_id  INTEGER,  -- FK to groups
    entity_id INTEGER   -- FK to entities
);
I want to write a query that returns all groups a user belongs to, directly or indirectly. The obvious solution is to recurse at the application level. I'm wondering what changes I can make to my data model to reduce the number of database round trips and, as a result, get better performance.
If you want a truly theoretically infinite level of nesting, then recursion is the only option, which precludes any sane version of SQL. If you're willing to limit it, then there are a number of other options.
Check out this question.
I don't think there is a need for recursion here as the solution posted by barry-brown seems adequate. If you need a group to be able to be a member of a group, then the tree traversal method offered by Dems works well. Inserts, deletes and updates are pretty straightforward with this scheme, and retrieving the entire hierarchy is accomplished with a single select.
I would suggest including a parent_id field in your group_members table (assuming that is the point at which your recursive relationship occurs). In a navigation editor I've created a nodes table like so:
My editor creates hierarchically-related objects from a C# node class
The Nodes property contains a list of child nodes. When the business layer loads the hierarchy, it rectifies the parent/child relationships. When the nav editor saves, I recursively set the left and right property values, then save to the database. That lets me get the data out in the correct order, meaning I can set parent/child references during retrieval instead of having to make a second pass. It also means that anything else that needs to display the hierarchy (say, a report) can easily get the node list out in the correct order.
Without a parent_id field, you can retrieve a breadcrumb trail to the current node with
where @id is the id of the node you're interested in.
Pretty obvious stuff, really, but it applies to items such as nested group membership where it might not be obvious, and as others have said it eliminates the need for slow recursive SQL.
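The table, the node class and the breadcrumb query did not survive above, so here is a rough sketch of the idea rather than the original code. It uses Python and SQLite instead of C#, and the `Node` class, the `nodes` table, the `lft`/`rgt` column names and the sample menu are all assumptions made for illustration. It numbers the tree recursively, saves it, reads it back in display order, and fetches a breadcrumb trail (`:id` plays the role of `@id`).

```python
import sqlite3
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:  # hypothetical stand-in for the C# node class
    id: int
    name: str
    nodes: List["Node"] = field(default_factory=list)  # child nodes
    left: int = 0
    right: int = 0

def number(node: Node, counter: int = 1) -> int:
    """Assign left/right values depth-first; return the next unused value."""
    node.left = counter
    counter += 1
    for child in node.nodes:
        counter = number(child, counter)
    node.right = counter
    return counter + 1

def flatten(node: Node):
    """Yield the node and all of its descendants."""
    yield node
    for child in node.nodes:
        yield from flatten(child)

# Sample hierarchy (made up for this sketch).
root = Node(1, "Home", [
    Node(2, "Products", [Node(3, "Widgets"), Node(4, "Gadgets")]),
    Node(5, "About"),
])
number(root)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT, lft INTEGER, rgt INTEGER)")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?, ?)",
    [(n.id, n.name, n.left, n.right) for n in flatten(root)],
)

# Ordering by lft returns the nodes in depth-first (display) order.
ordered = [r[0] for r in conn.execute("SELECT name FROM nodes ORDER BY lft")]

def breadcrumb(node_id: int) -> List[str]:
    """Names from the root down to the given node, no parent_id needed."""
    sql = """
        SELECT p.name
        FROM nodes AS n
        JOIN nodes AS p ON n.lft BETWEEN p.lft AND p.rgt
        WHERE n.id = :id
        ORDER BY p.lft
    """
    return [r[0] for r in conn.execute(sql, {"id": node_id})]

print(ordered)
print(breadcrumb(4))
```

The cost sits on the write side: inserting or moving a node means renumbering, which is why the editor recomputes left and right for the whole tree on save.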
You can do the following:
Can you clarify the difference between an entity and a user? Otherwise, your tables look OK. You are making an assumption that there is a many-to-many relationship between groups and entities.
In any case, with standard SQL use this query:
This will give you a list of names and group_ids, one pair per line. If an entity is a member of multiple groups, the entity will be listed several times.
If you're wondering why there's no JOIN to the groups table, it's because there's no data from the groups table that isn't already in the group_members table. If you included, say, a group name in the groups table, and you wanted that group name to be shown, then you'd have to join with groups, too.
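The query itself is missing above, so this is only a guess at its shape based on the description: a join between entities and group_members that returns one name/group_id pair per row. The sample rows are invented, and the SQL is wrapped in Python with SQLite just to make it runnable. Note that it only covers direct membership.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entities (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE group_members (group_id INTEGER, entity_id INTEGER);
    -- Hypothetical sample data: 1 and 2 are users, 10 and 11 are groups.
    INSERT INTO entities VALUES (1, 'alice'), (2, 'bob'), (10, 'admins'), (11, 'staff');
    INSERT INTO group_members VALUES (10, 1), (11, 1), (11, 2);
""")

# One row per (entity, group) pair; no join to groups is needed because
# group_members already carries the group id.
DIRECT_MEMBERSHIPS = """
    SELECT e.name, gm.group_id
    FROM entities AS e
    JOIN group_members AS gm ON gm.entity_id = e.id
    ORDER BY e.name, gm.group_id
"""

rows = conn.execute(DIRECT_MEMBERSHIPS).fetchall()
for name, group_id in rows:
    print(name, group_id)
```

Here alice appears twice because she is in two groups, which is the repetition described above.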
Some SQL variants have commands related to reporting. They would allow you to list multiple groups on the same line as a single entity. But they're not standard and wouldn't work across all platforms.
There are ways of avoiding recursion in tree hierarchy queries (contrary to what people have said here).
The one I've used most is Nested Sets.
As with all life and technical decisions, however, there are trade-offs to be made. Nested Sets are often slower to update but much faster to query. There are clever and complicated ways of improving the speed of updating the hierarchy, but that brings another trade-off: performance vs code complexity.
A simple example of a nested set...
Tree View:
Nested Set Representation
You'll want to read the article I linked to understand this fully, but I'll try to give a short explanation.
An item is a member of another item if the child's "lft" (left) value is greater than the parent's "lft" value AND the child's "rgt" (right) value is less than the parent's "rgt" value.
"Flash" is therefore a member of "MP3 Players", "Portable Electronics" and "Electronics".
Or, conversely, the members of "Portable Electronics" are:
- MP3 Players
- Flash
- CD Players
- 2 Way Radios
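Since the tree view and the lft/rgt figures are not shown above, here is a small sketch with numbering I made up to match the memberships just described. The `category` table name and the exact lft/rgt values are assumptions. It is written in Python with SQLite so it can be run as-is, and both queries are plain range comparisons with no recursion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (name TEXT PRIMARY KEY, lft INTEGER, rgt INTEGER);
    -- Assumed numbering: each parent's (lft, rgt) range encloses its children.
    INSERT INTO category VALUES
        ('Electronics',          1, 12),
        ('Portable Electronics', 2, 11),
        ('MP3 Players',          3,  6),
        ('Flash',                4,  5),
        ('CD Players',           7,  8),
        ('2 Way Radios',         9, 10);
""")

def ancestors(name):
    """Every item the given item is a member of, outermost first."""
    sql = """
        SELECT parent.name
        FROM category AS child
        JOIN category AS parent
          ON child.lft > parent.lft AND child.rgt < parent.rgt
        WHERE child.name = ?
        ORDER BY parent.lft
    """
    return [row[0] for row in conn.execute(sql, (name,))]

def descendants(name):
    """Every member of the given item, at any depth."""
    sql = """
        SELECT child.name
        FROM category AS parent
        JOIN category AS child
          ON child.lft > parent.lft AND child.rgt < parent.rgt
        WHERE parent.name = ?
        ORDER BY child.lft
    """
    return [row[0] for row in conn.execute(sql, (name,))]

print(ancestors("Flash"))
print(descendants("Portable Electronics"))
```

One caveat for the original question: a nested set models a strict tree, so it would need adapting if an entity can belong to several groups at once.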
Joe Celko has an entire book on "Trees and Hierarchies in SQL". There are more options than you think, but lots of trade-offs to make.
Note: Never say something can't be done; some mofo will turn up to show you that it can.
In Oracle:
In SQL Server:
In PostgreSQL 8.4:
In PostgreSQL 8.3 and below:
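The per-database snippets are not reproduced above. As a stand-in, and not the original code, here is a sketch of one way to let the database do the recursion: a recursive common table expression, which SQL Server and PostgreSQL 8.4 both support. It runs against SQLite from Python, uses the question's group_members table, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE group_members (group_id INTEGER, entity_id INTEGER);
    -- Hypothetical data: user 1 is in group 10, group 10 is in group 11,
    -- group 11 is in group 12; user 2 is only in group 13.
    INSERT INTO group_members VALUES (10, 1), (11, 10), (12, 11), (13, 2);
""")

# Anchor: groups the user is a direct member of.
# Recursive step: groups that contain a group already found.
# UNION (rather than UNION ALL) drops duplicates, so a membership cycle
# cannot make the query loop forever in SQLite.
ALL_GROUPS = """
    WITH RECURSIVE user_groups(group_id) AS (
        SELECT group_id FROM group_members WHERE entity_id = :user_id
        UNION
        SELECT gm.group_id
        FROM group_members AS gm
        JOIN user_groups AS ug ON gm.entity_id = ug.group_id
    )
    SELECT group_id FROM user_groups ORDER BY group_id
"""

def groups_of(user_id):
    """All groups the user belongs to, directly or indirectly, in one query."""
    return [row[0] for row in conn.execute(ALL_GROUPS, {"user_id": user_id})]

print(groups_of(1))
print(groups_of(2))
```

This is a single round trip however deep the nesting goes. The syntax varies by database (SQL Server, for example, writes `WITH` without the `RECURSIVE` keyword), so treat this as the general shape rather than something to paste in.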