ANSI vs. non-ANSI SQL JOIN syntax

Posted 2018-12-31 04:30

I have my business logic in ~7000 lines of T-SQL stored procedures, and most of them use the following JOIN syntax:

SELECT A.A, B.B, C.C
FROM aaa AS A, bbb AS B, ccc AS C
WHERE
    A.B = B.ID
AND B.C = C.ID
AND C.ID = @param

Will I gain any performance if I replace such queries with this:

SELECT A.A, B.B, C.C
FROM aaa AS A
JOIN bbb AS B
   ON A.B = B.ID
JOIN ccc AS C
   ON B.C = C.ID
   AND C.ID = @param

Or are they the same?

7 answers
爱死公子算了
#2 · 2018-12-31 05:16

Execute both and check their query plans. They should be equal.
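As a rough illustration of that check, here is a minimal sketch that runs both forms of the question's query and compares the results. It uses SQLite through Python's standard sqlite3 module as a stand-in, not SQL Server; the table contents are invented, and EXPLAIN QUERY PLAN is SQLite's own facility. On SQL Server you would compare the execution plans in your usual tooling instead.

```python
import sqlite3

# Hypothetical tables shaped like the question's aaa/bbb/ccc; the data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ccc (ID INTEGER PRIMARY KEY, C TEXT);
    CREATE TABLE bbb (ID INTEGER PRIMARY KEY, B TEXT, C INTEGER);
    CREATE TABLE aaa (A TEXT, B INTEGER);
    INSERT INTO ccc VALUES (1, 'c1'), (2, 'c2');
    INSERT INTO bbb VALUES (10, 'b10', 1), (20, 'b20', 2);
    INSERT INTO aaa VALUES ('a1', 10), ('a2', 10), ('a3', 20);
""")

OLD_STYLE = """
    SELECT A.A, B.B, C.C
    FROM aaa AS A, bbb AS B, ccc AS C
    WHERE A.B = B.ID AND B.C = C.ID AND C.ID = ?
"""
ANSI_STYLE = """
    SELECT A.A, B.B, C.C
    FROM aaa AS A
    JOIN bbb AS B ON A.B = B.ID
    JOIN ccc AS C ON B.C = C.ID AND C.ID = ?
"""

def rows(sql, param):
    # Sort so the comparison does not depend on row order.
    return sorted(conn.execute(sql, (param,)).fetchall())

def plan(sql, param):
    # Keep only the human-readable detail column of SQLite's plan output.
    return [r[-1] for r in conn.execute("EXPLAIN QUERY PLAN " + sql, (param,))]

old_rows = rows(OLD_STYLE, 1)
ansi_rows = rows(ANSI_STYLE, 1)
print("same rows:", old_rows == ansi_rows)
print("old-style plan:", plan(OLD_STYLE, 1))
print("ANSI-style plan:", plan(ANSI_STYLE, 1))
```

Identical rows only show that the two forms are logically equivalent on this data; the plans are what tell you whether the engine treats them the same way.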

零度萤火
#3 · 2018-12-31 05:17

The second construct is known as the "infixed join syntax" in the SQL community. The first construct, AFAIK, doesn't have a widely accepted name, so let's call it the 'traditional' inner join syntax.

The usual arguments go like this:

Pros of the 'Traditional' syntax: the predicates are physically grouped together in the WHERE clause in whatever order which makes the query generally, and n-ary relationships particularly, easier to read and understand (the ON clauses of the infixed syntax can spread out the predicates so you have to look for the appearance of one table or column over a visual distance).

Cons of the 'Traditional' syntax: There is no parse error when omitting one of the 'join' predicates and the result is a Cartesian product (known as a CROSS JOIN in the infixed syntax) and such an error can be tricky to detect and debug. Also, 'join' predicates and 'filtering' predicates are physically grouped together in the WHERE clause, which can cause them to be confused for one another.
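A small sketch of that failure mode, again using SQLite via Python's sqlite3 module as a stand-in. The customer and invoice tables and their rows are hypothetical, chosen only to make the row counts easy to follow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO invoice VALUES (100, 1, 50.0), (101, 1, 75.0), (102, 2, 20.0);
""")

# Intended query: the join predicate and the filter predicate sit side by side in WHERE.
intended = conn.execute("""
    SELECT c.name, i.id
    FROM customer AS c, invoice AS i
    WHERE c.id = i.customer_id AND i.total > 10
""").fetchall()

# The same query with the join predicate accidentally dropped. It still parses,
# but every customer is now paired with every invoice that passes the filter.
accidental = conn.execute("""
    SELECT c.name, i.id
    FROM customer AS c, invoice AS i
    WHERE i.total > 10
""").fetchall()

print("intended rows:", len(intended))      # 3 invoices, each with its own customer
print("accidental rows:", len(accidental))  # 3 customers x 3 invoices = 9
```

Nothing in the second query signals the mistake; only the inflated row count does.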

低头抚发
#4 · 2018-12-31 05:20

In my mind the FROM clause is where I decide what columns I need in the rows for my SELECT clause to work on. It is where a business rule is expressed that will bring onto the same row, values needed in calculations. The business rule can be customers who have invoices, resulting in rows of invoices including the customer responsible. It could also be venues in the same postcode as clients, resulting in a list of venues and clients that are close together.

It is where I work out the centricity of the rows in my result set. After all, we are simply shown the metaphor of a list in RDBMSs, each list having a topic (the entity) and each row being an instance of the entity. If the row centricity is understood, the entity of the result set is understood.

The WHERE clause, which conceptually executes after the rows are defined in the FROM clause, culls rows not required (or includes rows that are required) for the SELECT clause to work on.

Because join logic can be expressed in both the FROM clause and the WHERE clause, and because the clauses exist to divide and conquer complex logic, I choose to put join logic that involves values in columns in the FROM clause because that is essentially expressing a business rule that is supported by matching values in columns.

i.e. I won't write a WHERE clause like this:

 WHERE Column1 = Column2

I will put that in the FROM clause like this:

 ON Column1 = Column2

Likewise, if a column is to be compared to external values (values that may or may not be in a column) such as comparing a postcode to a specific postcode, I will put that in the WHERE clause because I am essentially saying I only want rows like this.

i.e. I won't write a FROM clause like this:

 ON PostCode = '1234'

I will put that in the WHERE clause like this:

 WHERE PostCode = '1234'
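To make the convention concrete, here is a sketch of the venues-and-clients example with the column-to-column match in ON and the literal comparison in WHERE. It runs on SQLite through Python's sqlite3 module; the venue and client tables and their rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE venue (name TEXT, postcode TEXT);
    CREATE TABLE client (name TEXT, postcode TEXT);
    INSERT INTO venue VALUES ('Town Hall', '1234'), ('Pier', '5678');
    INSERT INTO client VALUES ('Ann', '1234'), ('Bob', '5678'), ('Cy', '9999');
""")

# Preferred placement: the business rule (matching columns) lives in ON,
# the comparison against an external value lives in WHERE.
preferred = conn.execute("""
    SELECT v.name, c.name
    FROM venue AS v
    JOIN client AS c ON c.postcode = v.postcode
    WHERE v.postcode = '1234'
""").fetchall()

# The placement this answer avoids: the literal comparison tucked into ON.
avoided = conn.execute("""
    SELECT v.name, c.name
    FROM venue AS v
    JOIN client AS c ON c.postcode = v.postcode AND v.postcode = '1234'
""").fetchall()

print(preferred)
print(avoided)
```

For an inner join both placements return the same rows, so the choice here is about readability. With an outer join the placement can change the result, so the distinction is not always cosmetic.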
ら面具成の殇う
#5 · 2018-12-31 05:22

The two queries are the same, except the second is ANSI-92 SQL syntax and the first is the older SQL syntax which didn't incorporate the join clause. They should produce exactly the same internal query plan, although you may like to check.

You should use the ANSI-92 syntax for several reasons:

  • The use of the JOIN clause separates the relationship logic from the filter logic (the WHERE) and is thus cleaner and easier to understand.
  • It doesn't matter with this particular query, but there are a few circumstances where the older outer join syntax (such as Oracle's (+) or SQL Server's legacy *= and =* operators) is ambiguous and the query results are hence implementation-dependent, or the query cannot be resolved at all. These do not occur with ANSI-92.
  • It's good practice, as most developers and DBAs use ANSI-92 nowadays and you should follow the standard. Certainly all modern query tools will generate ANSI-92.
  • As pointed out by @gbn, it does tend to avoid accidental cross joins.

I resisted ANSI-92 myself for some time, as there is a slight conceptual advantage to the old syntax: it's easier to envisage the SQL as a mass Cartesian join of all tables used, followed by a filtering operation, a mental technique that can be useful for grasping what a SQL query is doing. However, I decided a few years ago that I needed to move with the times, and after a relatively short adjustment period I now strongly prefer it, predominantly because of the first reason given above. The only place where one should depart from the ANSI-92 syntax, or rather not use the option, is with natural joins, which are implicitly dangerous.
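That "Cartesian join, then filter" mental model can be spelled out in a few lines of plain Python. This is only a sketch of the logical meaning of the question's old-style query, with made-up rows held in lists of dicts; it says nothing about how a real engine executes the join.

```python
from itertools import product

# Made-up rows standing in for the question's aaa, bbb and ccc tables.
aaa = [{"A": "a1", "B": 10}, {"A": "a2", "B": 20}]
bbb = [{"ID": 10, "B": "b10", "C": 1}, {"ID": 20, "B": "b20", "C": 2}]
ccc = [{"ID": 1, "C": "c1"}, {"ID": 2, "C": "c2"}]
param = 1

# Step 1: the "mass Cartesian join" of all tables (2 * 2 * 2 = 8 combinations).
combos = list(product(aaa, bbb, ccc))

# Step 2: the filtering operation, mirroring the WHERE clause predicate by predicate.
result = [
    (a["A"], b["B"], c["C"])
    for a, b, c in combos
    if a["B"] == b["ID"] and b["C"] == c["ID"] and c["ID"] == param
]

print(len(combos))  # 8
print(result)       # [('a1', 'b10', 'c1')]
```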

何处买醉
#6 · 2018-12-31 05:24

ANSI syntax enforces neither predicate placement in the proper clause (be that ON or WHERE) nor the affinity of the ON clause to the adjacent table reference. A developer is free to write a mess like this:

SELECT
   C.FullName,
   C.CustomerCode,
   O.OrderDate,
   O.OrderTotal,
   OD.ExtendedShippingNotes
FROM
   Customer C
   CROSS JOIN Order O
   INNER JOIN OrderDetail OD
      ON C.CustomerID = O.CustomerID
      AND C.CustomerStatus = 'Preferred'
      AND O.OrderTotal > 1000.0
WHERE
   O.OrderID = OD.OrderID;
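For comparison, the sketch below runs the messy query next to one possible tidier arrangement and checks that both return the same rows. It uses SQLite via Python's sqlite3 module as a stand-in; the column types and sample rows are assumptions, and the table name Order is double-quoted because ORDER is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, FullName TEXT,
                           CustomerCode TEXT, CustomerStatus TEXT);
    CREATE TABLE "Order" (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER,
                          OrderDate TEXT, OrderTotal REAL);
    CREATE TABLE OrderDetail (OrderID INTEGER, ExtendedShippingNotes TEXT);
    INSERT INTO Customer VALUES (1, 'Ann Lee', 'AL', 'Preferred'),
                                (2, 'Bob Ray', 'BR', 'Regular');
    INSERT INTO "Order" VALUES (10, 1, '2018-12-01', 1500.0),
                               (11, 1, '2018-12-02', 200.0),
                               (12, 2, '2018-12-03', 3000.0);
    INSERT INTO OrderDetail VALUES (10, 'fragile'), (11, 'none'), (12, 'rush');
""")

COLUMNS = "C.FullName, C.CustomerCode, O.OrderDate, O.OrderTotal, OD.ExtendedShippingNotes"

# The arrangement from the answer: join and filter predicates in the "wrong" clauses.
MESSY = f"""
    SELECT {COLUMNS}
    FROM Customer C
    CROSS JOIN "Order" O
    INNER JOIN OrderDetail OD
        ON C.CustomerID = O.CustomerID
        AND C.CustomerStatus = 'Preferred'
        AND O.OrderTotal > 1000.0
    WHERE O.OrderID = OD.OrderID
"""

# One tidier arrangement: each ON sits next to its table, filters go in WHERE.
TIDY = f"""
    SELECT {COLUMNS}
    FROM Customer C
    INNER JOIN "Order" O ON O.CustomerID = C.CustomerID
    INNER JOIN OrderDetail OD ON OD.OrderID = O.OrderID
    WHERE C.CustomerStatus = 'Preferred'
      AND O.OrderTotal > 1000.0
"""

messy_rows = sorted(conn.execute(MESSY).fetchall())
tidy_rows = sorted(conn.execute(TIDY).fetchall())
print(messy_rows == tidy_rows)
print(tidy_rows)
```

Both versions are accepted and agree on this data, which is the point being made: the syntax permits either, so the discipline has to come from the developer.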

Speaking of query tools that "will generate ANSI-92": I'm commenting here because one of them generated this:

SELECT 1
   FROM DEPARTMENTS C
        JOIN EMPLOYEES A
             JOIN JOBS B
     ON C.DEPARTMENT_ID = A.DEPARTMENT_ID
     ON A.JOB_ID = B.JOB_ID

The only syntax that escapes the conventional "restrict-project-Cartesian product" model is the outer join. This operation is more complicated because it is not associative (neither with itself nor with a normal join). One has to parenthesize a query with outer joins judiciously, at the very least. However, it is an exotic operation; if you are using it too often, I suggest taking a relational database class.
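A minimal sketch of the non-associativity with a normal join, using SQLite via Python's sqlite3 module. The tables a, b and c and their rows are hypothetical, and the second grouping is written as a derived table (a subquery in FROM) rather than with bare parentheses, to keep the example portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER);
    CREATE TABLE c (b_id INTEGER, val TEXT);
    INSERT INTO a VALUES (1), (2);   -- a.id = 2 has no matching row in b
    INSERT INTO b VALUES (10, 1);
    INSERT INTO c VALUES (10, 'x');
""")

# Grouping 1: (a LEFT JOIN b) JOIN c.
# The inner join on b.id discards the NULL-extended row for a.id = 2.
left_then_inner = conn.execute("""
    SELECT a.id, c.val
    FROM a
    LEFT JOIN b ON b.a_id = a.id
    JOIN c ON c.b_id = b.id
    ORDER BY a.id
""").fetchall()

# Grouping 2: a LEFT JOIN (b JOIN c).
# The inner join happens first, so a.id = 2 survives with a NULL value.
left_of_inner = conn.execute("""
    SELECT a.id, bc.val
    FROM a
    LEFT JOIN (
        SELECT b.a_id AS a_id, c.val AS val
        FROM b
        JOIN c ON c.b_id = b.id
    ) AS bc ON bc.a_id = a.id
    ORDER BY a.id
""").fetchall()

print(left_then_inner)  # [(1, 'x')]
print(left_of_inner)    # [(1, 'x'), (2, None)]
```

The same tables and predicates give different results depending only on how the joins are grouped.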

荒废的爱情
#7 · 2018-12-31 05:25

The two queries are equal - the first is using non-ANSI JOIN syntax, the 2nd is ANSI JOIN syntax. I recommend sticking with the ANSI JOIN syntax.

And yes, LEFT OUTER JOINs (which, btw are also ANSI JOIN syntax) are what you want to use when there's a possibility that the table you're joining to might not contain any matching records.
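A short sketch of that difference, run on SQLite through Python's sqlite3 module as a stand-in for SQL Server. The customer and invoice tables are hypothetical; one customer deliberately has no invoices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');  -- Bob has no invoices
    INSERT INTO invoice VALUES (100, 1, 50.0);
""")

# Inner join: customers without a matching invoice disappear from the result.
inner_rows = conn.execute("""
    SELECT c.name, i.total
    FROM customer AS c
    JOIN invoice AS i ON i.customer_id = c.id
    ORDER BY c.id
""").fetchall()

# Left outer join: every customer is kept; missing invoice columns come back as NULL.
left_rows = conn.execute("""
    SELECT c.name, i.total
    FROM customer AS c
    LEFT OUTER JOIN invoice AS i ON i.customer_id = c.id
    ORDER BY c.id
""").fetchall()

print(inner_rows)  # [('Ann', 50.0)]
print(left_rows)   # [('Ann', 50.0), ('Bob', None)]
```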

Reference: Conditional Joins in SQL Server
