Which SQL would be faster to validate if a particular column has a null value or not, why?
1) SELECT * FROM TABLE1 WHERE COL1 IS NULL
Execute this query and then check if you are able to read any records. If yes there are null values.
2) SELECT COUNT(COL1) FROM TABLE1 WHERE COL1 IS NULL
Read the count which is returned to determine if there are any null records
Working with Oracle10g and SQLServer2005.
Count(columnName) will NEVER count NULL values, count skips NULLS when you specify a column name and does count NULLS when you use *
run this
CREATE TABLE testnulls (ID INT)
INSERT INTO testnulls VALUES (1)
INSERT INTO testnulls VALUES (2)
INSERT INTO testnulls VALUES (null)
SELECT count(*) FROM testnulls WHERE ID IS NULL --1
SELECT count(ID) FROM testnulls WHERE ID IS NULL --0
I would use exists instead since it is a boolean operation and will stop at the first occurance of NULL
IF EXISTS (SELECT 1 FROM testnulls WHERE ID IS NULL)
PRINT 'YES'
ELSE
PRINT 'NO'
Building on kquinn's answer, in Oracle that would be
SELECT COL1 FROM TABLE1 WHERE COL1 IS NULL AND ROWNUM = 1;
That way the DBMS only has to read a single row before giving you your answer;
That statement is misleading, however. It has to read all rows until it finds one with the missing column value. Then it can stop and return that row.
If there is no such row, it will read the whole table.
so it might be possible to satisfy the query with an index on COL1, making the query faster still.
Specifying only COL1 will not have too much impact, at least on Oracle, where (regular B-Tree) indices cannot be used to find NULL values.
You may want to select more columns anyway (such as the primary key value) if you are interested in identifiying the row later.
I don't know about Oracle, but for SQL Server this option is probably going to be fastest of all:
SELECT TOP 1 COL1 FROM TABLE1 WHERE COL1 IS NULL;
That way the DBMS only has to read a single row before giving you your answer; the other options have to read all non-null rows. And I've specified COL1
instead of *
, so it might be possible to satisfy the query with an index on COL1
, making the query faster still.
Multiple Solutions (Column Contains Some NULLs | Column is All NULLs * Test Single Column | Test Multiple Columns with Tabular Results)
If you need to test multiple columns, you could use the following:
Column_1 Column_2 Column_3
-------- -------- --------
1 2 NULL
1 NULL NULL
5 6 NULL
First, test for NULLs and count them:
select
sum(case when Column_1 is null then 1 else 0 end) as Column_1,
sum(case when Column_2 is null then 1 else 0 end) as Column_2,
sum(case when Column_3 is null then 1 else 0 end) as Column_3,
from TestTable
Yields a count of NULLs:
Column_1 Column_2 Column_3
0 1 3
Where the result is 0, there are no NULLs.
Second, let's count the non-NULLs:
select
sum(case when Column_1 is null then 0 else 1 end) as Column_1,
sum(case when Column_2 is null then 0 else 1 end) as Column_2,
sum(case when Column_3 is null then 0 else 1 end) as Column_3,
from TestTable
...But because we're counting non-NULLs here, this can be simplified to:
select
count(Column_1) as Column_1,
count(Column_2) as Column_2,
count(Column_3) as Column_3,
from TestTable
Either one yields:
Column_1 Column_2 Column_3
3 2 0
Where the result is 0, the column is entirely made up of NULLs.
Lastly, if you only need to check a specific column, then TOP 1 is quicker because it should stop at the first hit. You can then optionally use count(*) to give a boolean-style result:
select count(*) from (select top 1 'There is at least one NULL' AS note from TestTable where Column_3 is NULL) a
0 = There are no NULLs, 1 = There is at least one NULL
select count(*) from (select top 1 'There is at least one non-NULL' AS note from TestTable where Column_3 is not NULL) a
0 = They are all NULL, 1 = There is at least one non-NULL
I hope this helps.