Is it a slow query? Can it be improved?

2019-04-29 23:40发布

问题:

I was going through SQLZOO "SELECT within SELECT tutorial" and here's one of the queries that did the job (task 7)

world(name, continent, area, population, gdp)

SELECT w1.name, w1.continent, w1.population 
FROM world w1
WHERE 25000000 >= ALL(SELECT w2.population FROM world w2 WHERE w2.continent=w1.continent)

My questions are about effectiveness of such query. The sub-query will run for each row (country) of the main query and thus repeatedly re-populating the ALL list for a given continent.

  1. Should I be concerned or will Oracle optimization somehow take care of it?
  2. Can it be reprogrammed without a correlated sub-query?

回答1:

If you want to rewrite the query without a correalted subquery, here is one way:

SELECT w1.name, w1.continent, w1.population 
FROM world w1
  JOIN
    ( SELECT continent, MAX(population) AS max_population
      FROM world
      GROUP BY continent
    ) c
    ON c.continent = w1.continent
WHERE 25000000 >= c.max_population ;

I do not imply that this will be faster. Oracle's optimizer is pretty good and this is a simple overall query, however you write it. Here's another simplification:

SELECT w1.name, w1.continent, w1.population 
FROM world w1
  JOIN
    ( SELECT continent
      FROM world
      GROUP BY continent
      HAVING MAX(population) <= 25000000 
    ) c
    ON c.continent = w1.continent ;


回答2:

First of all you need to understand how oracle transform this query to evaluate .

SELECT w1.name
     , w1.continent
     , w1.population 
FROM world w1
WHERE 25000000 >= ALL(SELECT w2.population 
                       FROM world w2 
                      WHERE w2.continent=w1.continent
                     );

Now the optimizer transforms a condition that uses the ALL comparison operator followed by a subquery into an equivalent condition that uses the ANY comparison operator and a complementary comparison operator

  SELECT w1.name
        , w1.continent
       , w1.population 
  FROM world w1
   WHERE NOT(25000000 < ANY (SELECT w2.population 
                        FROM world w2 
                      WHERE w2.continent=w1.continent)
          );

The optimizer then further transforms the second query into the following query using the rule for transforming conditions with the ANY comparison operator, followed by a correlated subquery:

  SELECT w1.name
       , w1.continent
       , w1.population 
   FROM world w1
  WHERE
     NOT EXISTS (SELECT w2.population 
                  FROM world w2 
                 WHERE w2.continent=w1.continent
                   AND 25000000 < w2.population
                );

This I have taken from oracle source Link

For Your questions:

  1. Yes oracle will take care of this ,as the transformation suggest ,how oracle transform the above query.But better understand how this end result query work .
  2. yes ,this can be done without correlated sub query ,but anyways you have to join with the same table because you need to compare other records in the table which is having the same continent.[Please Correct me if i am wrong ]


回答3:

you can simplify this without the need to scan the table twice:

select a.name, a.continent, a.population, a.max_pop
  from (select w1.name, w1.continent, w1.population, 
               max(w1.population) over (partition by w1.continent) max_pop
          from world w1
       ) a 
where 25000000 >= a.max_pop;