Validate if a column has a null value

2019-07-21 01:05发布

Which SQL would be faster to validate if a particular column has a null value or not, why?

1) SELECT * FROM TABLE1 WHERE COL1 IS NULL

Execute this query and then check if you are able to read any records. If yes there are null values.

2) SELECT COUNT(COL1) FROM TABLE1 WHERE COL1 IS NULL

Read the count which is returned to determine if there are any null records

Working with Oracle10g and SQLServer2005.

4条回答
我命由我不由天
2楼-- · 2019-07-21 01:41

Count(columnName) will NEVER count NULL values, count skips NULLS when you specify a column name and does count NULLS when you use *

run this

CREATE TABLE testnulls (ID INT)
INSERT INTO testnulls VALUES (1)
INSERT INTO testnulls VALUES (2)
INSERT INTO testnulls VALUES (null)

SELECT count(*) FROM testnulls WHERE ID IS NULL --1

SELECT count(ID) FROM testnulls WHERE ID IS NULL --0

I would use exists instead since it is a boolean operation and will stop at the first occurance of NULL

IF EXISTS (SELECT 1 FROM testnulls WHERE ID IS NULL)
PRINT 'YES'
ELSE
PRINT 'NO'
查看更多
我命由我不由天
3楼-- · 2019-07-21 01:41

I don't know about Oracle, but for SQL Server this option is probably going to be fastest of all:

SELECT TOP 1 COL1 FROM TABLE1 WHERE COL1 IS NULL;

That way the DBMS only has to read a single row before giving you your answer; the other options have to read all non-null rows. And I've specified COL1 instead of *, so it might be possible to satisfy the query with an index on COL1, making the query faster still.

查看更多
Anthone
4楼-- · 2019-07-21 01:46

Multiple Solutions (Column Contains Some NULLs | Column is All NULLs * Test Single Column | Test Multiple Columns with Tabular Results)

If you need to test multiple columns, you could use the following:

Column_1 Column_2 Column_3
-------- -------- --------
1        2        NULL
1        NULL     NULL
5        6        NULL

First, test for NULLs and count them:

select 
    sum(case when Column_1 is null then 1 else 0 end) as Column_1, 
    sum(case when Column_2 is null then 1 else 0 end) as Column_2, 
    sum(case when Column_3 is null then 1 else 0 end) as Column_3,
from TestTable 

Yields a count of NULLs:

Column_1  Column_2  Column_3
0         1         3

Where the result is 0, there are no NULLs.

Second, let's count the non-NULLs:

select 
    sum(case when Column_1 is null then 0 else 1 end) as Column_1, 
    sum(case when Column_2 is null then 0 else 1 end) as Column_2, 
    sum(case when Column_3 is null then 0 else 1 end) as Column_3,
from TestTable

...But because we're counting non-NULLs here, this can be simplified to:

select 
    count(Column_1) as Column_1, 
    count(Column_2) as Column_2, 
    count(Column_3) as Column_3,
from TestTable

Either one yields:

Column_1  Column_2  Column_3
3         2         0

Where the result is 0, the column is entirely made up of NULLs.

Lastly, if you only need to check a specific column, then TOP 1 is quicker because it should stop at the first hit. You can then optionally use count(*) to give a boolean-style result:

select count(*) from (select top 1 'There is at least one NULL' AS note from TestTable where Column_3 is NULL) a

0 = There are no NULLs, 1 = There is at least one NULL

select count(*) from (select top 1 'There is at least one non-NULL' AS note from TestTable where Column_3 is not NULL) a

0 = They are all NULL, 1 = There is at least one non-NULL

I hope this helps.

查看更多
霸刀☆藐视天下
5楼-- · 2019-07-21 01:53

Building on kquinn's answer, in Oracle that would be

SELECT COL1 FROM TABLE1 WHERE COL1 IS NULL AND ROWNUM = 1;

That way the DBMS only has to read a single row before giving you your answer;

That statement is misleading, however. It has to read all rows until it finds one with the missing column value. Then it can stop and return that row.

If there is no such row, it will read the whole table.

so it might be possible to satisfy the query with an index on COL1, making the query faster still.

Specifying only COL1 will not have too much impact, at least on Oracle, where (regular B-Tree) indices cannot be used to find NULL values.

You may want to select more columns anyway (such as the primary key value) if you are interested in identifiying the row later.

查看更多
登录 后发表回答