SQL portability gotchas

2019-03-11 07:48发布

站内文章 / 后端开发

73 0

迷人小祖宗

女 | 书童

私信

可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

My company has me working on finishing a back end for Oracle for a Python ORM. I'm amazed at how much differently RDBMSes do things even for the simple stuff. I've learned a lot about the differences between Oracle and other RDBMSes. Just out of sheer curiosity, I'd like to learn more.

What are some common "gotchas" in terms of porting SQL from one platform to another?

Please, only one gotcha per answer.

回答1:

Oracle does not seem to have a problem with cursors, they are a huge performance problem in SQL server.

Actually pretty much all performance tuning is database specific (which is why ANSII standard code often performs very poorly compared to the better methods designed into the specifc flavor of SQL that is database specific).

Dates are another thing that seem to be handled very differntly from database to database.

Datatypes are not equivalent either. One thing that tends to get newcomers to SQL Server is that timestamp data type has absolutely nothing to do with dates and times and cannot be converted to a datatime value.

回答2:

Oracle does not allow a select statement without a FROM clause. Therefore, you cannot do queries like this:

SELECT 1

Instead, you have to say that the query is from the DUAL table:

SELECT 1 FROM DUAL

回答3:

Another example is generating unique (typically surrogate) primary keys.

Many databases, such as SQL Server and sqlite allow a column to be declared as an identity: typically, if the value for this column is missing on insert, the database will generate a unique value for the column.

Oracle, by contrast, has you create a sequence separate to the table and then use nextval on the sequence to generate the next value:

CREATE SEQUENCE test_seq;
SELECT test_seq.nextval FROM dual;

Or, more typically:

INSERT INTO foo(id, title) VALUES (test_seq.nextval, 'bar');

回答4:

Multi-value IN clause queries. I used to use these all the time on Oracle and was surprised to find that you can't do this in SQL Server. For example, this query:

SELECT * FROM mytable WHERE (col1, col2) IN ( SELECT col1, col2 FROM othertable )

回答5:

Oracle will not allow you to insert empty strings: they are silently converted to NULL.

回答6:

Oracle has a different approach to quotes opposed to MySQL.

MySQL: `object_name`, 'string', "string"
Oracle: "object_name", 'string'

Furthermore, escaping is different.

MySQL: 'It\'s easy'
Oracle: 'It''s slightly confusing'

(Note that to escape anything other than quotes in Oracle, you can use the ESCAPE directive in your query; SELECT * FROM testTable WHERE percent = '50\%' ESCAPE '\')

回答7:

What are some common "gotchas" in terms of porting SQL from one platform to another?

Same as attempts to translate from English to Russian by substituting the dictionary words right into the phrases.

Works for hello and goodbye, but fails for Mary had a little lamb to say nothing of Shakespeare.

Different RDBMS's have different cultures, despite having SQL in their name.

For instance, the row limiting.

In Oracle:

WHERE rownum = 1

In SQL Server:

SELECT TOP 1

In MySQL and PostgreSQL:

LIMIT 1

In DB2:

SELECT * ... FETCH FIRST 1 ROW ONLY

Four different clauses.

回答8:

A list of discrepancies between SQL Server and Oracle that I have encountered when porting:

Think ANSI Standard SQL Is Fully Portable Between Databases? Think Again.

Writing ANSI Standard SQL is not practical

回答9:

Performance issues is a big one. For instance, views in Oracle are, AFAIK, pretty much as fast as tables. This was not the case with SQLServer when I had to use them. The views effectively killed the performance slowing down the same selects by an order of magnitude or more (a query straight from the tables took, say 0.5s, while using a view could take a minute). There were also a lot of limitations on their use, for instance not all SQL functions could be used on views.

Note that this was true 5-6 six years ago, I don't know if Microsoft has improved this since.

回答10:

Obscure join syntax, like Oracle's (+) syntax for outer joins. At a company I used to work for this syntax was used all over the place rather than the standard LEFT OUTER JOIN / LEFT JOIN syntax, which made porting some stuff over to MySQL quite a pain.

回答11:

Oracle does not allow you to have multiple inserts in one query. MySQL allows this:

INSERT INTO test(id, name) VALUES (1, 'foo'),(2, 'bar');

回答12:

Temporary Tables - Oracle vs SQL Server/MySQL. Transitioning from Oracle to MS/MySQL, no problem. The other way around, a little different.

回答13:

For the big picture on product specific gotchas, you need to learn the difference between logical database design and physical database design.

Logical database design mostly has to do with features of tables. Features of tables include columns and constraints. Although tables themselves are physical, table design is generally very portable from one database system to another. There are differences in the way some datatypes work, and a few differences in syntax, like whether the underscore can be used in a table name or not. But a good logical design should port from one system to another with only minor changes or no changes.

Physical database design mostly has to do with features of the infrastructure on which the table structure rests. Almost all systems support indexes, and the default index type is B-tree although it might be called something else. But from there on out, each system has its own physical features, which can be completely different from one system to another. A typical physical feature of Oracle is tablespaces. Closely associated with tablespaces is the mapping between tables and tablespaces. Physical design has to be done on a system specific basis.

In addition to which RDM system you are using, you need to take data volume, load, response time requirements, and system resources like disks into account in your design. The good news is that a lot of changes to physical design can be made with no changes to application code. This is known as physical data independence. This means that you are somewhat free to tweak and tune the physical desing after you've got some application code written and some data loaded.

You might want to look at some books on database design to get a deeper idea of logical and physical design and the difference between them. Some popular authors are C.J. Date and Joe Celko.

回答14:

I recall a peculiar Oracle problem that got me completely off gueard. I'm not sure it this was a configuration of the instance or a default settings, but we could not have more that 1000 elements in a IN statement. So we had to trick it into doing what wanted:

SELECT Col1,Col2 
FROM Table
WHERE Code IN (1,2,3,...,1000)
OR Code IN (1001,1002,1003,...,2000)

etc.

Ugly, but it worked.

(Before anyone points out the obvious solution of a sub-query or inline view, the query was generated on a completely different system)

回答15:

Different databases handle binary data a bit differently. So for instance, this will work under MySQL:

mysql> CREATE TABLE t (c BINARY(3));
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO t SET c = 'z';
Query OK, 1 row affected (0.01 sec)

However, Oracle relies on these values being hex:

SQL> CREATE TABLE t (c RAW(3));

Table created.

SQL> INSERT INTO t VALUES ('z');
INSERT INTO t VALUES ('z')
                 *
ERROR at line 1:
ORA-01465: invalid hex number

Instead, we have to convert it to hex:

SQL> INSERT INTO t VALUES (rawtohex('z'));

1 row created.

回答16:

Set operator support.

Aside from UNION / UNION ALL, set operator support is pretty spotty across databases. Oracle and SQL server support most of them, but Oracle supports a MINUS operation as well as the equivalent standard EXCEPT DISTINCT operation. AFIK, MySQL only supports UNION (no INTERSECT or EXCEPT support).

回答17:

I'm not sure if this is true in the drivers that come with Python, but in the version of "Horrable" I used, if a column in a resultset returns NULL for all values in the column, the column itself is not returned in the structure of the resultset. This could (and often did) result in production bugs that were impossible to duplicate in development.

Since you're moving away from Oracle, this shouldn't be a concern, but there is a keyword developers use to mitigate this "feature" that I doubt is supported in MySQL. I forget what it is, though, and Google is not helping.