I was going through SQLZOO "SELECT within SELECT tutorial" and here's one of the queries that did the job (task 7)
world(name, continent, area, population, gdp)
SELECT w1.name, w1.continent, w1.population
FROM world w1
WHERE 25000000 >= ALL(SELECT w2.population FROM world w2 WHERE w2.continent=w1.continent)
My questions are about effectiveness of such query. The sub-query will run for each row (country) of the main query and thus repeatedly re-populating the ALL list for a given continent.
- Should I be concerned or will Oracle optimization somehow take care of it?
- Can it be reprogrammed without a correlated sub-query?
If you want to rewrite the query without a correalted subquery, here is one way:
SELECT w1.name, w1.continent, w1.population
FROM world w1
JOIN
( SELECT continent, MAX(population) AS max_population
FROM world
GROUP BY continent
) c
ON c.continent = w1.continent
WHERE 25000000 >= c.max_population ;
I do not imply that this will be faster. Oracle's optimizer is pretty good and this is a simple overall query, however you write it. Here's another simplification:
SELECT w1.name, w1.continent, w1.population
FROM world w1
JOIN
( SELECT continent
FROM world
GROUP BY continent
HAVING MAX(population) <= 25000000
) c
ON c.continent = w1.continent ;
First of all you need to understand how oracle transform this query to evaluate .
SELECT w1.name
, w1.continent
, w1.population
FROM world w1
WHERE 25000000 >= ALL(SELECT w2.population
FROM world w2
WHERE w2.continent=w1.continent
);
Now the optimizer transforms a condition that uses the ALL comparison operator followed by a subquery into an equivalent condition that uses the ANY comparison operator and a complementary comparison operator
SELECT w1.name
, w1.continent
, w1.population
FROM world w1
WHERE NOT(25000000 < ANY (SELECT w2.population
FROM world w2
WHERE w2.continent=w1.continent)
);
The optimizer then further transforms the second query into the following query using the rule for transforming conditions with the ANY comparison operator, followed by a correlated subquery:
SELECT w1.name
, w1.continent
, w1.population
FROM world w1
WHERE
NOT EXISTS (SELECT w2.population
FROM world w2
WHERE w2.continent=w1.continent
AND 25000000 < w2.population
);
This I have taken from oracle source Link
For Your questions:
- Yes oracle will take care of this ,as the transformation suggest ,how oracle transform the above query.But better understand how this end result query work .
- yes ,this can be done without correlated sub query ,but anyways you have to join with the same table because you need to compare other records in the table which is having the same continent.[Please Correct me if i am wrong ]
you can simplify this without the need to scan the table twice:
select a.name, a.continent, a.population, a.max_pop
from (select w1.name, w1.continent, w1.population,
max(w1.population) over (partition by w1.continent) max_pop
from world w1
) a
where 25000000 >= a.max_pop;