This takes 0.001 seconds to execute and uses an index seek:
SELECT * FROM CUSTOMER WHERE ID IN (1008,1122)
Now I have a stored procedure U_VIP which returns the same IDs as example one (1008, 1122), and it also takes only 0.001 seconds to execute:
SELECT ID FROM U_VIP -- returns (1008, 1122)
Now when I combine them, it takes around half a second to execute and the index is not used:
SELECT * FROM CUSTOMER WHERE ID IN (SELECT ID FROM U_VIP)
I've simplified the example above; in the actual application the performance impact is of a much higher magnitude. How can I force Firebird to use the index in this case?
**Using Firebird 2.1**

**EDIT**
Based on Mark's answer, using a JOIN does improve the execution time, because it now does an index seek.
SELECT CUSTOMER.*
FROM CUSTOMER
INNER JOIN U_VIP ON U_VIP.ID = CUSTOMER.ID
This is great; however, it introduces another problem for me, which I'll try to explain in the following example.
SELECT CUSTOMER.*
FROM CUSTOMER
WHERE (:AREAID = 0 OR ID IN (SELECT ID FROM U_VIP(:AREAID)))
Using the WHERE clause, I can conditionally apply the filter based on whether :AREAID is supplied by the user. How do I achieve the same when I replace the WHERE clause with a JOIN?
Something like:
SELECT CUSTOMER.*
FROM CUSTOMER
{IF :AREAID > 0 THEN}
INNER JOIN (SELECT ID FROM U_VIP(:AREAID)) VIP ON VIP.ID = CUSTOMER.ID
{END IF}
Which, of course, Firebird dislikes because of the part with braces =/
Instead of `IN`, you need to use `EXISTS` or an `INNER JOIN`. I am not entirely sure about the details, but I believe that in your query the `CUSTOMER` table is fully read, evaluating the result of the subquery for every row (maybe even executing the subquery for every row). As the optimizer doesn't know the number of results of the subquery in advance, it can't create an optimization like it can with a fixed number of literal values, as in your first query.
Try changing your query to:
SELECT *
FROM CUSTOMER
WHERE EXISTS (SELECT 1 FROM U_VIP WHERE U_VIP.ID = CUSTOMER.ID)
Or:
SELECT CUSTOMER.*
FROM CUSTOMER
INNER JOIN U_VIP ON U_VIP.ID = CUSTOMER.ID
Or (changing the order sometimes can lead to better performance):
SELECT CUSTOMER.*
FROM U_VIP
INNER JOIN CUSTOMER ON CUSTOMER.ID = U_VIP.ID
In general I would expect those queries to perform better than the query with `IN`.
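To confirm that a rewrite actually switches to an index seek, you can display the plan in isql before running the query (a sketch; the exact plan text depends on your schema and index names):

```sql
-- In Firebird's isql, enable plan output; the plan line for the query
-- should then show CUSTOMER INDEX (...) instead of CUSTOMER NATURAL.
SET PLAN ON;

SELECT CUSTOMER.*
FROM CUSTOMER
INNER JOIN U_VIP ON U_VIP.ID = CUSTOMER.ID;
```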
Edit in response to update
Based on your updated question I can think of multiple solutions, I am not entirely sure on their performance though.
- Use separate queries for `:AREAID` is 0 and `:AREAID` is not 0
- Use a stored procedure or `EXECUTE BLOCK` with an `EXECUTE STATEMENT` on a dynamically built statement (a variant of the previous)
- Make the stored procedure U_VIP return all customers if `:AREAID` is 0
- Use an additional `JOIN` condition `OR :AREAID = 0`; this might not yield results if U_VIP returns nothing for 0 (and might not perform *)
- Use a `LEFT JOIN` and add `WHERE U_VIP.ID IS NOT NULL OR :AREAID = 0` (might not perform *)
- Use a `UNION` of the 'normal' query and a second query on `CUSTOMER` with `WHERE :AREAID = 0` (might not perform *)
For (*) see the 'Smart logic' anti-pattern
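As a sketch of the `UNION` option (assuming U_VIP can be selected from with the parameter inline, as in your own example; check the plan before relying on it):

```sql
-- Branch 1: customers matching the VIP list for the given area
SELECT CUSTOMER.*
FROM CUSTOMER
INNER JOIN U_VIP(:AREAID) VIP ON VIP.ID = CUSTOMER.ID
UNION
-- Branch 2: all customers; only contributes rows when :AREAID = 0
SELECT CUSTOMER.*
FROM CUSTOMER
WHERE :AREAID = 0
```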
For the dynamically built query you can think of something like:
EXECUTE BLOCK (INPUTCONDITION INTEGER = ?)
RETURNS (ID INTEGER)
AS
  DECLARE VARIABLE QUERY VARCHAR(6400);
BEGIN
  QUERY = 'SELECT a.ID FROM SORT_TEST a';
  IF (INPUTCONDITION <> 0) THEN
    QUERY = QUERY || ' WHERE a.ID = ' || INPUTCONDITION;
  FOR EXECUTE STATEMENT QUERY INTO :ID
  DO
    SUSPEND;
END
In this example the value 0 for `INPUTCONDITION` will generate a query without a `WHERE` clause, and any other input a query with a `WHERE` clause. Doing it like this is prone to SQL injection if the parameter is a `(VAR)CHAR` or `BLOB`, so be careful. You could also consider two branches, where one uses `EXECUTE STATEMENT` with parameters and the other without.
Instead of `EXECUTE BLOCK`, you can also use a selectable procedure like the one you already use for U_VIP; `EXECUTE BLOCK` is essentially a stored procedure that isn't stored in the database.
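A sketch of that stored, selectable form of the same dynamic block (the procedure name FILTERED_IDS is made up for illustration):

```sql
-- Same logic as the EXECUTE BLOCK above, kept as a selectable procedure.
CREATE OR ALTER PROCEDURE FILTERED_IDS (INPUTCONDITION INTEGER)
RETURNS (ID INTEGER)
AS
  DECLARE VARIABLE QUERY VARCHAR(6400);
BEGIN
  QUERY = 'SELECT a.ID FROM SORT_TEST a';
  IF (INPUTCONDITION <> 0) THEN
    QUERY = QUERY || ' WHERE a.ID = ' || INPUTCONDITION;
  FOR EXECUTE STATEMENT QUERY INTO :ID
  DO
    SUSPEND;
END
```

You would then query it like the existing procedure: `SELECT ID FROM FILTERED_IDS(0)`.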
See also 'Myth: dynamic SQL is slow'