Left join or select from multiple table using comm

2019-01-25 06:47发布

问题:

This question already has an answer here:

  • SQL left join vs multiple tables on FROM line? 11 answers

I'm curious as to why we need to use LEFT JOIN since we can use commas to select multiple tables.

What are the differences between LEFT JOIN and using commas to select multiple tables.

Which one is faster?

Here is my code:

   SELECT mw.*, 
          nvs.* 
     FROM mst_words mw 
LEFT JOIN (SELECT no as nonvs, 
                  owner, 
                  owner_no, 
                  vocab_no, 
                  correct 
             FROM vocab_stats 
            WHERE owner = 1111) AS nvs ON mw.no = nvs.vocab_no 
    WHERE (nvs.correct > 0 ) 
      AND mw.level = 1

...and:

SELECT * 
  FROM vocab_stats vs, 
       mst_words mw 
 WHERE mw.no = vs.vocab_no 
   AND vs.correct > 0 
   AND mw.level = 1 
   AND vs.owner = 1111

回答1:

First of all, to be completely equivalent, the first query should have been written

   SELECT mw.*, 
          nvs.* 
     FROM mst_words mw 
LEFT JOIN (SELECT *
             FROM vocab_stats 
            WHERE owner = 1111) AS nvs ON mw.no = nvs.vocab_no 
    WHERE (nvs.correct > 0 ) 
      AND mw.level = 1

So that mw.* and nvs.* together produce the same set as the 2nd query's singular *. The query as you have written can use an INNER JOIN, since it includes a filter on nvs.correct.

The general form

TABLEA LEFT JOIN TABLEB ON <CONDITION>

attempts to find TableB records based on the condition. If the fails, the results from TABLEA are kept, with all the columns from TableB set to NULL. In contrast

TABLEA INNER JOIN TABLEB ON <CONDITION>

also attempts to find TableB records based on the condition. However, when fails, the particular record from TableA is removed from the output result set.

The ANSI standard for CROSS JOIN produces a Cartesian product between the two tables.

TABLEA CROSS JOIN TABLEB
  -- # or in older syntax, simply using commas
TABLEA, TABLEB

The intention of the syntax is that EACH row in TABLEA is joined to EACH row in TABLEB. So 4 rows in A and 3 rows in B produces 12 rows of output. When paired with conditions in the WHERE clause, it sometimes produces the same behaviour of the INNER JOIN, since they express the same thing (condition between A and B => keep or not). However, it is a lot clearer when reading as to the intention when you use INNER JOIN instead of commas.

Performance-wise, most DBMS will process a LEFT join faster than an INNER JOIN. The comma notation can cause database systems to misinterpret the intention and produce a bad query plan - so another plus for SQL92 notation.

Why do we need LEFT JOIN? If the explanation of LEFT JOIN above is still not enough (keep records in A without matches in B), then consider that to achieve the same, you would need a complex UNION between two sets using the old comma-notation to achieve the same effect. But as previously stated, this doesn't apply to your example, which is really an INNER JOIN hiding behind a LEFT JOIN.

Notes:

  • The RIGHT JOIN is the same as LEFT, except that it starts with TABLEB (right side) instead of A.
  • RIGHT and LEFT JOINS are both OUTER joins. The word OUTER is optional, i.e. it can be written as LEFT OUTER JOIN.
  • The third type of OUTER join is FULL OUTER join, but that is not discussed here.


回答2:

Separating the JOIN from the WHERE makes it easy to read, as the join logic cannot be confused with the WHERE conditions. It will also generally be faster as the server will not need to conduct two separate queries and combine the results.

The two examples you've given are not really equivalent, as you have included a sub-query in the first example. This is a better example:

SELECT vs.*, mw.*
FROM vocab_stats vs, mst_words mw
LEFT JOIN vocab_stats vs ON mw.no = vs.vocab_no
WHERE vs.correct > 0 
AND mw.level = 1
AND vs.owner = 1111