Why isn't SQL ANSI-92 standard better adopted

2018-12-31 14:26发布

At every company I have worked at, I have found that people are still writing their SQL queries in the ANSI-89 standard:

select a.id, b.id, b.address_1
from person a, address b
where a.id = b.id

rather than the ANSI-92 standard:

select a.id, b.id, b.address_1
from person a
inner join address b
on a.id = b.id

For an extremely simple query like this, there's not a big difference in readability, but for large queries I find that having my join criteria grouped in with listing out the table makes it much easier to see where I might have issues in my join, and let's me keep all my filtering in my WHERE clause. Not to mention that I feel that outer joins are much intuitive than the (+) syntax in Oracle.

As I try to evangelize ANSI-92 to people, are there any concrete performance benefits in using ANSI-92 over ANSI-89? I would try it on my own, but the Oracle setups we have here don't allow us to use EXPLAIN PLAN - wouldn't want people to try to optimize their code, would ya?

16条回答
看风景的人
2楼-- · 2018-12-31 14:50

I can't speak for all schools but at my university when we were doing the SQL module of our course, they didn't teach ANSI-92, they taught ANSI-89 - on an old VAX system at that! I wasn't exposed to ANSI-92 until I started digging around in Access having built some queries using the query designer and then digging into the SQL code. Realising I had no idea how it was completing the joins, or the implications of the syntax I started digging deeper so I could understand it.

Given that the available documentation isn't exactly intuitive in a lot of cases, and that people tend to stick to what they know and in many cases don't strive to learn any more than they need in order to get their job done, it's easy to see why adoption is taking so long.

Of course, there are those technical evangelists that like to tinker and understand and it tends to be those types that adopt the "newer" principles and try to convert the rest.

Oddly, it seems to me that a lot of programmers come out of school and stop advancing; thinking that because this is what they were taught, this is how it's done. It's not until you take off your blinkers that you realise that school was only meant to teach you the basics and give you enough understanding to learn the rest yourself and that really you barely scratched the surface of what there is to know; now it's your job to continue that path.

Of course, that's just my opinion based on my experience.

查看更多
刘海飞了
3楼-- · 2018-12-31 14:51

I don't know the answer for sure.. this is a religous war (albiet of a lesser degree than Mac-Pc or others)

A guess is that until fairly recently, Oracle, (and maybe other vendors as well) did not adopt the ANSI-92 standard (I think it was in Oracle v9, or thereabouts) and so, for DBAs/Db Developers working at companies which were still using these versions, (or wanted code to be portable across servers that might be using these versions, they had to stick to the old standard...

It's a shame really, because the new join syntax is much more readable, and the old syntax generates wrong (incorrect) results in several well-documented scenarios.

  • Specifically, outer Joins when there are conditional filtering predicates on non-Join related columns from the table on the "outer" side of the join.
查看更多
宁负流年不负卿
4楼-- · 2018-12-31 14:55

Well the ANSI092 standard includes some pretty heinous syntax. Natural Joins are one and the USING Clause is another. IMHO, the addition of a column to a table shouldn't break code but a NATURAL JOIN breaks in a most egregious fashion. The "best" way to break is by compilation error. For example if you SELECT * somewhere, the addition of a column could fail to compile. The next best way to fail would be a run time error. It's worse because your users may see it, but it still gives you a nice warning that you've broken something. If you use ANSI92 and write queries with NATURAL joins, it won't break at compile time and it won't break at run time, the query will just suddenly start producing wrong results. These types of bugs are insidious. Reports go wrong, potentially financial disclosure are incorrect.

For those unfamiliar with NATURAL Joins. They join two tables on every column name that exists in both tables. Which is really cool when you have a 4 column key and you're sick of typing it. The problem comes in when Table1 has a pre-existing column named DESCRIPTION and you add a new column to Table2 named, oh I don't know, something innocuous like, mmm, DESCRIPTION and now you're joining the two tables on a VARCHAR2(1000) field that is free form.

The USING clause can lead to total ambiguity in addition to the problem described above. In another SO post, someone showed this ANSI-92 SQL and asked for help reading it.

SELECT c.* 
FROM companies AS c 
JOIN users AS u USING(companyid) 
JOIN jobs AS j USING(userid) 
JOIN useraccounts AS us USING(userid) 
WHERE j.jobid = 123

This is completely ambiguous. I put a UserID column in both Companies and user tables and there's no complaint. What if the UserID column in companies is the ID of the last person to modify that row?

I'm serious, Can anyone explain why such ambiguity was necessary? Why is it built straight into the standard?

I think Bill is correct that there is a large base of developer who copy/paste there way through coding. In fact, I can admit that I'm kind of one when it comes to ANSI-92. Every example I ever saw showed multiple joins being nested in parentheses. Honesty, that makes picking out the tables in the sql difficult at best. But then an SQL92 evangilist explained that would actually force a join order. JESUS... all those Copy pasters I've seen are now actually forcing a join order - a job that's 95% of the time better left to optimizers especially a copy/paster.

Tomalak got it right when he said,

people don't switch to new syntax just because it is there

It has to give me something and I don't see an upside. And if there is an upside, the negatives are an albatross too big to be ignored.

查看更多
永恒的永恒
5楼-- · 2018-12-31 14:56

I had a query that was originally written for SQL Server 6.5, which did not support the SQL 92 join syntax, i.e.

select foo.baz
from foo
  left outer join bar
  on foo.a = bar.a

was instead written as

select foo.baz
from foo, bar
where foo.a *= bar.a

The query had been around for a while, and the relevant data had accumulated to make the query run too slow, abut 90 seconds to complete. By the time this problem arose, we had upgraded to SQL Server 7.

After mucking about with indexes and other Easter-egging, I changed the join syntax to be SQL 92 compliant. The query time dropped to 3 seconds.

There's a good reason to switch.

Reposted from here.

查看更多
君临天下
6楼-- · 2018-12-31 14:59

1) Standard way to write OUTER JOIN, versus *= or (+)=

2) NATURAL JOIN

3) Depend in the database engine, ANSI-92 trends to be more optimal.

4) Manual optimization :

Let's say that we have the next syntax (ANSI-89):

(1)select * from TABLE_OFFICES to,BIG_TABLE_USERS btu
where to.iduser=tbu.iduser and to.idoffice=1

It could be written as:

(2)select * from TABLE_OFFICES to
inner join BIG_TABLE_USERS btu on to.iduser=tbu.iduser
where to.idoffice=1

But also as :

(3)select * from TABLE_OFFICES to
inner join BIG_TABLE_USERS btu on to.iduser=tbu.iduser and to.idoffice=1

All of them (1),(2),(3) return the same result, however they are optimized differently, it depends in the database engine but most of them do :

  • (1) its up to the database engine decide the optimization.
  • (2) it joins both tables then do the filter per office.
  • (3) it filters the BIG_TABLE_USERS using the idoffice then join both tables.

5) Longer queries are less messy.

查看更多
不再属于我。
7楼-- · 2018-12-31 15:06

In response to the NATURAL JOIN and USING post above.

WHY would you ever see the need to use these - they weren't available in ANSI-89 and were added for ANSI-92 as what I can only see as a shortcut.

I would never leave a join to chance and would always specify the table/alias and id.

For me, the only way to go is ANSI-92. It is more verbose and the syntax isn't liked by ANSI-89 followers but it neatly separates your JOINS from your FILTERING.

查看更多
登录 后发表回答