I would like to be able to calculate the family relationship between two individuals in a family tree, given the following data schema (simplified from my actual data schema, only showing columns that directly apply to this problem):
individual
----------
id
gender
child
----------
child_id
father_id
mother_id
With this structure, how can one calculate the relationship between two individual id's (i.e. cousin, great grand uncle, etc.).
Also, as there are actually two relationships (i.e. A-B may be nephew whereas B-A is uncle), how can one generate the complement to the other (given uncle, and assuming we know gender, how might we generate nephew?). This is more of a trivial question, the former is what I'm really interested in.
Thanks all!
This might help you, it's a lot of theory and implementation of SQL queries to generate and query tree structures
http://www.artfulsoftware.com/mysqlbook/sampler/mysqled1ch20.html
In particular, look at the adjacency list model which uses a family tree as example.
I solved this problem using adjacency list concept in java. One can have a node for every person and have its child relations associated to it on its node itself. Below is the code to find only Siblings and Cousins. However, you can enhance it according to your requirement. I wrote this code only for demonstration.
Below is the main code to add family people and to find relation among themselves.
}
This might help The Tree Relationship Calculator is an object that accepts an XML representation of a tree and will calculate the relationship of any two members within it. This article describes how relationships are calculated, and what terms like second cousin, or first cousin once removed, mean. This code includes an object for calculating relationships, written in JavaScript, as well as a web UI for rendering and interacting with the tree. The example project is setup as a classic ASP page.
http://www.codeproject.com/Articles/30315/Tree-Relationship-Calculator
You'll first need to calculate the Lowest Common Ancestor of both A and B. Call this Lowest Common Ancestor C.
Then calculate the distance in steps from C to A (CA) and C to B (CB). These values should be indexed into another table that determines the relationship based on these two values. For example:
You might keep the basic relations in this table, and add "great-" for additional distances on certain relations like grandfather, ex.: (0, 3) = great-grandfather.
Hopefully this will point you in the right direction. Best of luck!
UPDATE: (I can't comment below your code, since I don't have the reputation yet.)
Your function aggrandize_relationships is a little off, I think. You can simplify it by prefixing "grand" if the offset is 1 or greater, then prefix "great-" (offset - 1) times. Your version might include the prefix "great grand great grand" for very distant relatives.(Not sure if I have the correct parameter in this explanation, but hopefully you get the gist of it. Also, no idea if your family tree is going that far back, but the point remains valid.)
UPDATE TOO: Sorry, the above is incorrect. I misread the default case, and thought it recursively called the function again. In my defense, I wasn't familiar with the "2nd great grandfather" notation, and always used "great great grandfather" myself. Code onward!!
Below is my PHP implementation of my algorithm to calculate relationship. This is based upon the data schema I outlined in the original question. This only finds the "closest" i.e. shortest-path relationship between the two individuals, it does not resolve compound relationships such as half-siblings or double cousins.
Note that data access functions such as
get_father
andget_gender
are written in the style of a database abstraction layer I always use. It should be fairly straightforward to understand what is going on, basically all dbms-specific functions such asmysql_query
are replaced with generalized functions such asdb_query
; it is not very complicated at all, especially in the examples in this code, but feel free to post questions in comments if it isn't clear.As I had mentioned previously, the algorithm to determine LCA is far less than optimal. I plan to post a separate question to optimize that, and another to address the problem of calculating compound relationships such as double cousins.
Many thanks to everyone who helped prod me in the right direction! With your tips, this turned out to be much easier than I originally thought.
As strange as it might sound PROLOG seems to be the thing you're looking for. Given following ad-hoc program (http://www.pastey.net/117134 better colouring)
You could ask Prolog interpreter something like that:
with the answers:
It's a powerful tool, if you know how and when to use it. This seems exactly like a place for Prolog. I know it's not terribly popular, or easy to embed, but the impressive feature of wolphram alpha shown in one of the comments can be coded using nothing more than constructs used above, and this is Prolog 101.