SQLAlchemy raw sql vs expression language statemen

2019-03-26 07:23发布

问题:

When inserting multiple rows in a MySQL-DB via a SQLA-Expression-Language statement, f.e.

Foo.__table__.insert().execute([{'bar': 1}, {'bar': 2}, {'bar': 3}])

it´s extremly slow, when compared to the execution of a "raw" sql statement for the same task, i.e.

engine.execute("insert into foo (bar) values (1),(2),(3)")

What is the reason for this? Can´t SQLA generate a single bulk insert statement and therefore executes multiple inserts? Due to the speed limits of the orm, i need a fast way to add several thousand rows at once, but the SQLA-Expression-Language-Version is too slow. So, do i need to write the raw sql by myself? The documentation isn't too clear about this.

I ran a speed test with the ORM insert, the ORM with preassigned PK and the SQLA bulk insert (see SQLA bulk insert speed) like this (https://gist.github.com/3341940):

  • SqlAlchemy ORM: Total time for 500 records 9.61418914795 secs
  • SqlAlchemy ORM pk given: Total time for 500 records 9.56391906738 secs
  • SqlAlchemy Core: Total time for 500 records 9.5362598896 secs
  • SQLAlchemy RAW String Execution: Total time for 500 records 1.233677 secs

As you can see, there is practically no difference between the three versions. Only the execution of a raw string insert, where all the records are included in the raw sql statement is significantly faster. Thus, for fast inserts, SQLA seems sub-optimal.

回答1:

It seems that the special INSERT with multiple values only became recently supported(0.8 unreleased), you can see the note at the bottom of this section regarding the difference between executemany(what execute with a list does) and a multiple-VALUES INSERT:

http://docs.sqlalchemy.org/ru/latest/core/expression_api.html#sqlalchemy.sql.expression.Insert.values

This should explain the performance difference you see. You could try installing the development version and repeating the tests with the altered calling syntax, mentioned in the link, to confirm.