Is mysqli::multi_query more efficient than several

Posted 2020-02-09 11:50

Someone brought up the MySQLi multi_query function in an answer, claiming that it would be better than looping through 3 separate queries. I tried Googling for an answer but didn't find anything that satisfied my curiosity, so I hope you may have better insight into the reasons for using it other than saving a few lines of code.

So here's what I am wondering:

  1. What is multi_query doing under the hood?
  2. Does multi_query simply hit the server x number of times and aggregate the results?
  3. Is there a case where single queries may be more efficient than multiple queries?

I know that hitting the database 3 times for a million items each and smashing it into one huge object isn't good for memory usage, but I know that there has to be a reason for its existence and I am also sure there are times when it should be avoided. I hope to better understand it so that I can put that into my bag of tricks when the need arises.

Thanks for your time!

Tags: php mysqli
2 answers
成全新的幸福
Reply #2 · 2020-02-09 12:35
  • What is multi_query doing under the hood? - Just sending all the queries to the server at once instead of one at a time, and retrieving all the results in one go. Nothing more complicated than that.

  • Does multi_query simply hit the server x number of times and aggregate the results? - It "hits" the server twice - once to send the queries and once to retrieve the results.

  • Is there a case where single queries may be more efficient than multiple queries? - Depends how you define "efficient". multi_query() is light on the network but heavy on memory; running query() in a loop is the reverse.

    For many SELECT statements that return large result sets, the memory cost is likely to vastly outweigh the network gain, and most of the time you'd be better off issuing the queries and processing the result sets one at a time - although this depends on exactly what you are doing with the data. But if you need to run many UPDATE statements, multi_query() is likely to be better, since the return values are just success/fail and memory consumption will be light.

    You would have to weigh up all the factors like what you are doing, how long you expect it to take, the network latency between the (database) server and client, the available resources (mostly memory) on the server and the client, etc, etc... and take it on a case by case basis.

I found this record of some performance testing done a while ago, where the conclusion is that there is an overall efficiency gain to be found from using multi_query(). However, the test case simply ran 4 queries, each one SELECTing a single result, and the definition of "more efficient" is simply "faster". There is no testing for larger numbers of queries or larger result sets and, while speed is important, it is not the be-all and end-all - I can make anything run incredibly fast if I give it an unlimited amount of memory, but an attempt to do anything concurrently will fail miserably. It is also not a real-world test, since the end result could be achieved with a single JOINed query. It does make for an interesting read, though.

Personally I feel this is somewhat academic because if you are running a large set of statements at once, 90% of the time they will vary only in the data that is being passed and the query structure will remain the same - which is an obvious candidate for prepared statements.
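As a rough sketch of that prepared-statement approach: the statement is prepared once and executed repeatedly with different data. The function name `updatePrices`, the `items` table and its `price` and `id` columns are assumptions made up for illustration; only the mysqli calls come from the extension itself.

```php
<?php
/**
 * Apply the same UPDATE to many rows using one prepared statement.
 * updatePrices() and the items(id, price) table are hypothetical.
 *
 * @param array $priceById map of row id => new price
 * @return int total number of rows changed
 */
function updatePrices(mysqli $db, array $priceById): int
{
    // The query structure is sent to the server and parsed once.
    $stmt = $db->prepare('UPDATE items SET price = ? WHERE id = ?');
    if ($stmt === false) {
        throw new RuntimeException($db->error);
    }

    $price = 0.0;
    $id = 0;
    // bind_param() binds by reference, so assigning new values to
    // $price and $id changes what the next execute() sends.
    $stmt->bind_param('di', $price, $id);

    $affected = 0;
    foreach ($priceById as $rowId => $newPrice) {
        $id = (int) $rowId;
        $price = (float) $newPrice;
        if (!$stmt->execute()) {
            throw new RuntimeException($stmt->error);
        }
        $affected += $stmt->affected_rows;
    }
    $stmt->close();

    return $affected;
}
```

Each execute() is still a separate round trip, so this trades the single-packet send of multi_query() for parameter binding and a statement that is only parsed once.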

家丑人穷心不美
Reply #3 · 2020-02-09 12:41

The answer above is wrong.

What is multi_query doing under the hood?

It just sends all the queries to the server at once, while retrieving the result of the first one only. So, in effect, multi_query is a sort of wrapper for asynchronous query execution.

Does multi_query simply hit the server x number of times and aggregate the results?

multi_query hits the server only once. All the other hits you have to do manually, by calling next_result, in order to get all the other queries' results.
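The flow described here could look roughly like the sketch below: one multi_query() call sends the batch, then the caller walks the remaining results itself with next_result(). The helper name `runBatch` and the shape of its return value are assumptions for illustration, not part of mysqli.

```php
<?php
/**
 * Send a semicolon-separated batch and collect one entry per statement:
 * an array of rows for a result set, or the affected-row count otherwise.
 * runBatch() is a hypothetical helper, not a mysqli function.
 */
function runBatch(mysqli $db, string $sql): array
{
    // A single call sends every statement in the batch.
    if (!$db->multi_query($sql)) {
        throw new RuntimeException($db->error);
    }

    $results = [];
    do {
        // store_result() buffers the current result set on the client.
        // It returns false for statements that produce no result set.
        $result = $db->store_result();
        if ($result instanceof mysqli_result) {
            $rows = [];
            while ($row = $result->fetch_assoc()) {
                $rows[] = $row;
            }
            $result->free();
            $results[] = $rows;
        } elseif ($db->errno !== 0) {
            throw new RuntimeException($db->error);
        } else {
            $results[] = $db->affected_rows;
        }
        // more_results() reports whether another result is pending;
        // next_result() advances to it.
    } while ($db->more_results() && $db->next_result());

    // A failure in a later statement ends the loop; surface it here.
    if ($db->errno !== 0) {
        throw new RuntimeException($db->error);
    }

    return $results;
}
```

Because store_result() buffers each result set on the client, memory use grows with the size of whatever the batch returns.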

Is there a case where single queries may be more efficient than multiple queries?

This question assumes that there is a case when multi_query is faster, which is a questionable statement. For the usual query, the network latency is a negligible part of the whole execution time. If you're really concerned about speed by such a margin, take a look at HandlerSocket - it will be blazing fast. However, for regular development, such a difference between single and multi query would be your least concern if you care about real-life matters, not imaginary ones.
