What is a SQL JOIN
and what are different types?
相关问题
- SQL join to get the cartesian product of 2 columns
- sql execution latency when assign to a variable
- Difference between Types.INTEGER and Types.NULL in
- php PDO::FETCH_ASSOC doesnt detect select after ba
- Bulk update SQL Server C#
An illustration from W3schools:
What is
SQL JOIN
?SQL JOIN
is a method to retrieve data from two or more database tables.What are the different
SQL JOIN
s ?There are a total of five
JOIN
s. They are :1. JOIN or INNER JOIN :
In this kind of a
JOIN
, we get all records that match the condition in both tables, and records in both tables that do not match are not reported.In other words,
INNER JOIN
is based on the single fact that: ONLY the matching entries in BOTH the tables SHOULD be listed.Note that a
JOIN
without any otherJOIN
keywords (likeINNER
,OUTER
,LEFT
, etc) is anINNER JOIN
. In other words,JOIN
is a Syntactic sugar forINNER JOIN
(see: Difference between JOIN and INNER JOIN).2. OUTER JOIN :
OUTER JOIN
retrievesEither, the matched rows from one table and all rows in the other table Or, all rows in all tables (it doesn't matter whether or not there is a match).
There are three kinds of Outer Join :
2.1 LEFT OUTER JOIN or LEFT JOIN
This join returns all the rows from the left table in conjunction with the matching rows from the right table. If there are no columns matching in the right table, it returns
NULL
values.2.2 RIGHT OUTER JOIN or RIGHT JOIN
This
JOIN
returns all the rows from the right table in conjunction with the matching rows from the left table. If there are no columns matching in the left table, it returnsNULL
values.2.3 FULL OUTER JOIN or FULL JOIN
This
JOIN
combinesLEFT OUTER JOIN
andRIGHT OUTER JOIN
. It returns rows from either table when the conditions are met and returnsNULL
value when there is no match.In other words,
OUTER JOIN
is based on the fact that: ONLY the matching entries in ONE OF the tables (RIGHT or LEFT) or BOTH of the tables(FULL) SHOULD be listed.3. NATURAL JOIN :
It is based on the two conditions :
JOIN
is made on all the columns with the same name for equality.This seems to be more of theoretical in nature and as a result (probably) most DBMS don't even bother supporting this.
4. CROSS JOIN :
It is the Cartesian product of the two tables involved. The result of a
CROSS JOIN
will not make sense in most of the situations. Moreover, we won't need this at all (or needs the least, to be precise).5. SELF JOIN :
It is not a different form of
JOIN
, rather it is aJOIN
(INNER
,OUTER
, etc) of a table to itself.JOINs based on Operators
Depending on the operator used for a
JOIN
clause, there can be two types ofJOIN
s. They are1. Equi JOIN :
For whatever
JOIN
type (INNER
,OUTER
, etc), if we use ONLY the equality operator (=), then we say that theJOIN
is anEQUI JOIN
.2. Theta JOIN :
This is same as
EQUI JOIN
but it allows all other operators like >, <, >= etc.I have created an illustration that explains better than words, in my opinion:
Interestingly most other answers suffer from these two problems:
I've recently written an article on the topic: A Probably Incomplete, Comprehensive Guide to the Many Different Ways to JOIN Tables in SQL, which I'll summarise here.
First and foremost: JOINs are cartesian products
This is why Venn diagrams explain them so inaccurately, because a JOIN creates a cartesian product between the two joined tables. Wikipedia illustrates it nicely:
The SQL syntax for cartesian products is
CROSS JOIN
. For example:Which combines all rows from one table with all rows from the other table:
Source:
Result:
If we just write a comma separated list of tables, we'll get the same:
INNER JOIN (Theta-JOIN)
An
INNER JOIN
is just a filteredCROSS JOIN
where the filter predicate is calledTheta
in relational algebra.For instance:
Note that the keyword
INNER
is optional (except in MS Access).(look at the article for result examples)
EQUI JOIN
A special kind of Theta-JOIN is equi JOIN, which we use most. The predicate joins the primary key of one table with the foreign key of another table. If we use the Sakila database for illustration, we can write:
This combines all actors with their films.
Or also, on some databases:
The
USING()
syntax allows for specifying a column that must be present on either side of a JOIN operation's tables and creates an equality predicate on those two columns.NATURAL JOIN
Other answers have listed this "JOIN type" separately, but that doesn't make sense. It's just a syntax sugar form for equi JOIN, which is a special case of Theta-JOIN or INNER JOIN. NATURAL JOIN simply collects all columns that are common to both tables being joined and joins
USING()
those columns. Which is hardly ever useful, because of accidental matches (likeLAST_UPDATE
columns in the Sakila database).Here's the syntax:
OUTER JOIN
Now,
OUTER JOIN
is a bit different fromINNER JOIN
as it creates aUNION
of several cartesian products. We can write:No one wants to write the latter, so we write
OUTER JOIN
(which is usually better optimised by databases).Like
INNER
, the keywordOUTER
is optional, here.OUTER JOIN
comes in three flavours:LEFT [ OUTER ] JOIN
: The left table of theJOIN
expression is added to the union as shown above.RIGHT [ OUTER ] JOIN
: The right table of theJOIN
expression is added to the union as shown above.FULL [ OUTER ] JOIN
: Both tables of theJOIN
expression are added to the union as shown above.All of these can be combined with the keyword
USING()
or withNATURAL
(I've actually had a real world use-case for aNATURAL FULL JOIN
recently)Alternative syntaxes
There are some historic, deprecated syntaxes in Oracle and SQL Server, which supported
OUTER JOIN
already before the SQL standard had a syntax for this:Having said so, don't use this syntax. I just list this here so you can recognise it from old blog posts / legacy code.
Partitioned
OUTER JOIN
Few people know this, but the SQL standard specifies partitioned
OUTER JOIN
(and Oracle implements it). You can write things like this:Parts of the result:
The point here is that all rows from the partitioned side of the join will wind up in the result regardless if the
JOIN
matched anything on the "other side of the JOIN". Long story short: This is to fill up sparse data in reports. Very useful!SEMI JOIN
Seriously? No other answer got this? Of course not, because it doesn't have a native syntax in SQL, unfortunately (just like ANTI JOIN below). But we can use
IN()
andEXISTS()
, e.g. to find all actors who have played in films:The
WHERE a.actor_id = fa.actor_id
predicate acts as the semi join predicate. If you don't believe it, check out execution plans, e.g. in Oracle. You'll see that the database executes a SEMI JOIN operation, not theEXISTS()
predicate.ANTI JOIN
This is just the opposite of SEMI JOIN (be careful not to use
NOT IN
though, as it has an important caveat)Here are all the actors without films:
Some folks (especially MySQL people) also write ANTI JOIN like this:
I think the historic reason is performance.
LATERAL JOIN
OMG, this one is too cool. I'm the only one to mention it? Here's a cool query:
It will find the TOP 5 revenue producing films per actor. Every time you need a TOP-N-per-something query,
LATERAL JOIN
will be your friend. If you're a SQL Server person, then you know thisJOIN
type under the nameAPPLY
OK, perhaps that's cheating, because a
LATERAL JOIN
orAPPLY
expression is really a "correlated subquery" that produces several rows. But if we allow for "correlated subqueries", we can also talk about...MULTISET
This is only really implemented by Oracle and Informix (to my knowledge), but it can be emulated in PostgreSQL using arrays and/or XML and in SQL Server using XML.
MULTISET
produces a correlated subquery and nests the resulting set of rows in the outer query. The below query selects all actors and for each actor collects their films in a nested collection:As you have seen, there are more types of JOIN than just the "boring"
INNER
,OUTER
, andCROSS JOIN
that are usually mentioned. More details in my article. And please, stop using Venn diagrams to illustrate them.I'm going to push my pet peeve: the USING keyword.
If both tables on both sides of the JOIN have their foreign keys properly named (ie, same name, not just "id) then this can be used:
I find this very practical, readable, and not used often enough.
Definition:
JOINS are way to query the data that combined together from multiple tables simultaneously.
Types of JOINS:
Concern to RDBMS there are 5-types of joins:
Equi-Join: Combines common records from two tables based on equality condition. Technically, Join made by using equality-operator (=) to compare values of PrimaryKey of one table and Foriegn Key values of antoher table, hence result set includes common(matched) records from both tables. For implementation see INNER-JOIN.
Natural-Join: It is enhanced version of Equi-Join, in which SELECT operation omits duplicate column. For implementation see INNER-JOIN
Non-Equi-Join: It is reverse of Equi-join where joining condition is uses other than equal operator(=) e.g, !=, <=, >=, >, < or BETWEEN etc. For implementation see INNER-JOIN.
Self-Join:: A customized behavior of join where a table combined with itself; This is typically needed for querying self-referencing tables (or Unary relationship entity). For implementation see INNER-JOINs.
Cartesian Product: It cross combines all records of both tables without any condition. Technically, it returns result set of a query without WHERE-Clause.
As per SQL concern and advancement, there are 3-types of joins and all RDBMS joins can be achvied using these types of joins.
INNER-JOIN: It merges(or combiens) matched rows from two tables. The matching is done based on common columns of tables and their comparing operation. If equaility based condition then: EQUI-JOIN performed, otherwise Non-EQUI-Join.
**OUTER-JOIN:**It merges(or combines) matched rows from two tables and unmatched rows with NULL values. However, can customized selection of un-matched rows e.g, selecting unmatched row from first table or second table by sub-types: LEFT OUTER JOIN and RIGHT OUTER JOIN.
2.1. LEFT Outer JOIN (a.k.a, LEFT-JOIN): Returns matched rows form two tables and unmached from LEFT table(i.e, first table) only.
2.2. RIGHT Outer JOIN (a.k.a, RIGHT-JOIN): Returns matched rows from two tables and unmatched from RIGHT table only.
2.3. FULL OUTER JOIN (a.k.a OUTER JOIN): Returns matched and unmatched from both tables.
CROSS-JOIN: This join does not merges/combiens instead it performs cartisian product.
Note: Self-JOIN can be achived by either INNER-JOIN, OUTER-JOIN and CROSS-JOIN based on requirement but table must join with itself.
For more information:
Examples:
1.1: INNER-JOIN: Equi-join implemetation
1.2: INNER-JOIN: Natural-JOIN implementation
1.3: INNER-JOIN with NON-Eqijoin implementation
1.4: INNER-JOIN with SELF-JOIN
2.1: OUTER JOIN (full outer join)
2.2: LEFT JOIN
2.3: RIGHT JOIN
3.1: CROSS JOIN
3.2: CROSS JOIN-Self JOIN
//OR//