so basically I `m following a tutorial question which asks me the below. I am not too sure how to join 2 tables which do not contain the others FK, their (i.e. both of their FKs) are located in a 3rd table. Could I get some help/explanation?
My Answer
SELECT Forest.Fo_name, Species.Sp_name, Species.Sp_woodtype
FROM Forest
INNER JOIN Species
ON Tree.Tr_species=Tree.Tr_forest
WHERE Fo_loc='ARTIC'
ORDER BY Fo_name, Sp_name
"For forests found in the regions coded as "ARTIC" list the forest name & species name and species wood type found therein. Eliminate any duplicates and order the output by forest name and species name"
Species table
+--------------+------------------+------+--------------------------------+
| Field | Type | Key | Glossary |
+--------------+------------------+------+--------------------------------+
| sp_name | C(10) | PK | Species Name |
| sp_woodtype | C(10) | | Wood Yielded by tree |
| sp_maxht | I | | Max. Height |
+--------------+------------------+------+--------------------------------+
Forest table
+--------------+------------------+------+--------------------------------+
| Field | Type | Key | Glossary |
+--------------+------------------+------+--------------------------------+
| Fo_name | C(10) | PK | Forest Name |
| Fo_size | I | | Forest Area |
| Fo_loc | C(10) | | Geographical Area |
| Fo_comp | C(10) | | Forest Owner |
+--------------+------------------+------+--------------------------------+
Tree table
+--------------+------------------+------+---------------------------------------------+
| Field | Type | Key | Glossary |
+--------------+------------------+------+---------------------------------------------+
| Tr_species | C(10) | FK | (FK of species.sp_name |
| Tr_forest | C(10) | FK | (FK of forest.fo_name |
| Tr_numb | I | PK | Sequence number |
| Tr_planted | Date | | Date of planting |
| Tr_loc | C(10) | | Forest quadrant |
| Tr_parent | I | FK | (FK of tree.numb) procreating tree reference|
+--------------+------------------+------+---------------------------------------------+
C(10) & I stand for character (10) & Integer respectively
The
Tree
table is the connection between the Forest table and the Species table. Think of it as two steps:1) Starting from the Forest table, join to the
Tree
table (fromForest.Fo_name
toTree.Tr_forest
)2) Now that the Tree is known, join to the
Species
table (fromTree.species
toSpecies.sp_name
)I'd write the final query like this:
You could do multiple joins. Link the tree table to your main table forest, then link the species table:
The
ON
condition should compare columns from different tables.Then you just join each table step by step.
Try the SQL 99 Method
A F S are aliases used to make the SQL shorter and neater.
DISTINCT will remove duplicits.
Foreign keys are not needed to join tables!
So the answer to how to join tables when no FK is between them is to just join them.
The real question is how do we choose what tables to join (or combine in any other way).
Statements & tables
Every base table comes with a predicate--a statement template parameterized by column names. The table value is the rows that make its predicate into a true statement.
A query also has a predicate. Its value also is the rows that make its predicate true. Its predicate is built up according to its
FROM
,WHERE
and other clauses. A table alias names a table value like its base table but with columns prefixed by the alias. So its predicate is its base table's predicate using alias-prefixed columns.holds rows satisfying
(CROSS or INNER) JOIN puts AND between predicates; UNION puts OR between them; EXCEPT inserts AND NOT and ON & WHERE AND in a condition; SELECT renames, adds and drops columns. (Etc for other operators.) So
holds rows where
(No matter what the constraints are!) If you wanted only the rows above having forests named after their wood type then you would just add a condition via
... WHERE f.name=s.woodtype
because that makes the value be rows satisfying... AND f.name=s.woodtype
.That is a big informal predicate that the rows returned are to satisfy. If we try to rewrite it using only the predicates that we have been given plus AND, OR, AND NOT (etc) then we can only do it by
AND
ing all three given predicates (hence,JOIN
of the base table names) and addingAND Forest.loc='ARCTIC'
(hence,ON
orWHERE
that condition).FKs (etc) and querying (not)
PKs and FKs are special cases of integrity constraints. Given the predicates and what situations can arise, only some database values can arise. That's what integrity constraints describe. They let the DBMS keep out database values that shouldn't arise. (Also, optimize query execution.) Because a name is unique in Species we declare it a key. Ditto for Forest name and Tree numb. Because a species in Tree is a name in Species and name is a key of Species we declare FK Tree.species->Species.name. Ditto for forest and parent. Nothing to do with enabling joins. (Although they imply that a query result satisfies certain constraints, too.)
It does not matter to querying what the constraints are. If there were Tree species values that did not appear as any Species name value because the Species predicate was different then there would be no FK Tree.species->Species.name. But each query would continue to return the rows satisfying its predicate as expressed in terms of base table predicates. (Since the species predicate would be different, the query predicate would be different and its rows could be different.)
What determines query SQL
So the answer to how we choose what tables to join (or combine in any other way) is that we arrange base table names, JOIN, UNION, EXCEPT and WHERE (etc) as appropriate to give a query expression whose predicates is the one we want our rows to satisfy. This is usually taught as something informal to be done by feel but now you know what ties the SQL to the natural language. And constraints are irrelevant.
NOTE: The preceding assumes that we return no duplicates from queries. The reason why there are no duplicates in tables in the relational model is so that the above correspondence between table operators and logic connectives holds. However SQL tables can have duplicates. Where SQL differs from the relational model (in its many ways), there querying becomes less (literally) logical.