Poor whereHas performance in Laravel

2020-05-17 11:22发布

问题:

I want to apply a where condition to relation. Here's what I do:

Replay::whereHas('players', function ($query) {
    $query->where('battletag_name', 'test');
})->limit(100);

It generates the following query:

select * from `replays` 
where exists (
    select * from `players` 
    where `replays`.`id` = `players`.`replay_id` 
      and `battletag_name` = 'test') 
order by `id` asc 
limit 100;

Which executes in 70 seconds. If I manually rewrite query like this:

select * from `replays` 
where id in (
    select replay_id from `players` 
    where `battletag_name` = 'test') 
order by `id` asc 
limit 100;

It executes in 0.4 seconds. Why where exists is the default behavior if it's so slow? Is there a way to generate the correct where in query with query builder or do I need to inject raw SQL? Maybe I'm doing something wrong altogether?

replays table has 4M rows, players has 40M rows, all relevant columns are indexed, dataset doesn't fit into MySQL server memory.

Update: found that the correct query can be generated as:

Replay::whereIn('id', function ($query) {
    $query->select('replay_id')->from('players')->where('battletag_name', 'test');
})->limit(100);

Still have a question why exists performs so poorly and why it is the default behavior

回答1:

This is related to the mysql not to the laravel. You can perform the same thing you wanted from the above with the both options, joins and the subqueries. Subqueries are generally much slower than joins.

Subqueries are:

  • less complicated
  • elegant
  • easier to understand
  • easier to write
  • logic separation

and the above facts are why ORMs like eloquent are using suquries. but there are slower! Especially when you have many rows in the database.

Join version of your query is something like this :

select * from `replays`
join `players` on `replays`.`id` = `players`.`replay_id` 
and `battletag_name` = 'test'
order by `id` asc 
limit 100;

but now you must change select and add group by and be careful on many other things, but why is this so it is beyond that answer. New query would be :

select replays.* from `replays`
join `players` on `replays`.`id` = `players`.`replay_id` 
and `battletag_name` = 'test'
order by `id` asc 
group by replays.id
limit 100;

So that are the reasons why join in more complicated.

You can write raw query in laravel, but eloquent support for join queries are not well supported, also there are no much packages that can help you with that, this one is for example : https://github.com/fico7489/laravel-eloquent-join



回答2:

I think performance does not depend on whereHas only it depends on how many records you have selected

Plus try to optimize your mysql server

https://dev.mysql.com/doc/refman/5.7/en/optimize-overview.html

and also Optimize your php server

and if you have faster query why don't you use raw query object from larval

$replay = DB::select('select * from replays where id in (
select replay_id from players where battletag_name = ?) 
order by id asc limit 100', ['test']
); 


回答3:

You can use left join

$replies = Replay::orderBy('replays.id')
            ->leftJoin('players', function ($join) {
                $join->on('replays.id', '=', 'players.replay_id');
            })
            ->take(100)
            ->get();