Pros / Cons of MySql JOINS

2020-06-03 05:10发布

问题:

When I'm selecting data from multiple tables I used to use JOINS a lot and recently I started to use another way but I'm unsure of the impact in the long run.

Examples:

SELECT * FROM table_1 LEFT JOIN table_2 ON (table_1.column = table_2.column)

So this is your basic LEFT JOIN across tables but take a look at the query below.

SELECT * FROM table_1,table_2 WHERE table_1.column = table_2.column

Personally if I was joining across lets say 7 tables of data I would prefer to do this over JOINS.

But are there any pros and cons in regards to the 2 methods ?

回答1:

Second method is a shortcut for INNER JOIN.

 SELECT * FROM table_1 INNER JOIN table_2 ON table_1.column = table_2.column

Will only select records that match the condition in both tables (LEFT JOIN will select all records from table on the left, and matching records from table on the right)

Quote from http://dev.mysql.com/doc/refman/5.0/en/join.html

[...] we consider each comma in a list of table_reference items as equivalent to an inner join

And

INNER JOIN and , (comma) are semantically equivalent in the absence of a join condition: both produce a Cartesian product between the specified tables (that is, each and every row in the first table is joined to each and every row in the second table).

However, the precedence of the comma operator is less than of INNER JOIN, CROSS JOIN, LEFT JOIN, and so on. If you mix comma joins with the other join types when there is a join condition, an error of the form Unknown column 'col_name' in 'on clause' may occur. Information about dealing with this problem is given later in this section.

In general there are quite a few things mentioned there, that should make you consider not using commas.



回答2:

The first method is the ANSI/ISO version of the Join. The second method is the older format (pre-89) to produce the equivalent of an Inner Join. It does this by cross joining all the tables you list and then narrowing the Cartesian product in the Where clause to produce the equivalent of an inner join.

I would strongly recommend against the second method.

  1. It is harder for other developers to read
  2. It breaks the rule of least astonishment to other developers who will wonder whether you simply did not know any better or if there was some specific reason for not using the ANSI/ISO format.
  3. It will cause you grief when you start trying to use that format with something other than Inner Joins.
  4. It makes it harder to discern your intent especially in a large query with many tables. Are all of these tables supposed to be inner joins? Did you miss something in the Where clause and create a cross join? Did you intend to make a cross join? Etc.

There is simply no reason to use the second format and in fact many database systems are ending support for that format.



回答3:

ANSI Syntax

Both queries are JOINs, and both use ANSI syntax but one is older than the other.

Joins using with the JOIN keyword means that ANSI-92 syntax is being used. ANSI-89 syntax is when you have tables comma separated in the FROM clause, and the criteria that joins them is found in the WHERE clause. When comparing INNER JOINs, there is no performance difference - this:

SELECT * 
  FROM table_1 t1, table_2 t2
 WHERE t1.column = t2.column

...will produce the same query plan as:

SELECT *
  FROM TABLE_1 t1
  JOIN TABLE_2 t2 ON t2.column = t1.column

Apples to Oranges

Another difference is that the two queries are not identical - a LEFT [OUTER] JOIN will produce all rows from TABLE_1, and references to TABLE_2 in the output will be NULL if there's no match based on the JOIN criteria (specified in the ON clause). The second example is an INNER JOIN, which will only produce rows that have matching records in TABLE_2. Here's a link to a visual representation of JOINs to reinforce the difference...

Pros/Cons

The main reason to use ANSI-92 syntax is because ANSI-89 doesn't have any OUTER JOIN (LEFT, RIGHT, FULL) support. ANSI-92 syntax was specifically introduced to address this shortcoming, because vendors were implementing their own, custom syntax. Oracle used (+); SQL Server used an asterisk on the side of the equals in the join criteria (IE: t1.column =* t2.column).

The next reason to use ANSI-92 syntax is that it's more explicit, more readable, while separating what is being used for joining tables vs actual filteration.



回答4:

I personally feel the explicit join syntax (A JOIN B, A LEFT JOIN B) is preferable. Both because it's more explicit about what you're doing, and because if you use implicit join syntax for inner joins, you still have to use the explicit syntax for outer joins and thus your SQL formatting will be inconsistent.