I came up against a strange problem in Postgres yesterday when trying to filter out user ids from a stats table. When we did, for example, user_id != 24, Postgres excluded the rows where user_id is NULL as well.
I created the following test code which shows the same results.
CREATE TEMPORARY TABLE test1 (
id int DEFAULT NULL
);
INSERT INTO test1 (id) VALUES (1), (2), (3), (4), (5), (2), (4), (6),
(4), (7), (5), (9), (5), (3), (6), (4), (3), (7),
(NULL), (NULL), (NULL), (NULL), (NULL), (NULL), (NULL);
SELECT COUNT(*) FROM test1;
SELECT id, COUNT(*) as count
FROM test1
GROUP BY id;
SELECT id, COUNT(*) as count
FROM test1
WHERE id != 1
GROUP BY id;
SELECT id, COUNT(*) as count
FROM test1
WHERE (id != 1 OR id IS NULL)
GROUP BY id;
The first query just counts all the rows. The second counts the occurrences of each value, including nulls. The third excludes the value 1 but also drops all the nulls. The fourth is a workaround that excludes the value 1 and still includes the nulls.
For what I'm trying to use this query for, null values should always be included.
Is the workaround the only way to do this? Is this expected Postgres behaviour?
The IS DISTINCT FROM predicate exists for this purpose: it tests for inequality while treating null as an ordinary comparable value.
So just doing id IS DISTINCT FROM 1 should work.
Reference: https://www.postgresql.org/docs/11/functions-comparison.html
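As a rough sketch, the third query from the question could be rewritten with this predicate as follows. The table and column names are the ones from the question's test code; the comment describes the behaviour documented for IS DISTINCT FROM rather than output captured from a real run.

```sql
-- Excludes the rows where id = 1 but keeps the NULL rows, because
-- NULL IS DISTINCT FROM 1 evaluates to true instead of NULL.
SELECT id, COUNT(*) AS count
FROM test1
WHERE id IS DISTINCT FROM 1
GROUP BY id;
```

This should return the same groups as the (id != 1 OR id IS NULL) version, just with a single predicate.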
Your "work around" is the usual way to do it. Everything is behaving as expected.
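The reason is simple: nulls are neither equal, nor not equal, to anything. This makes sense when you consider that null means "unknown", and the truth of a comparison to an unknown value is also unknown. A WHERE clause keeps only the rows for which its condition is true, so rows where the condition is unknown are dropped.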
The corollary is that:
null = null is not true
null = some_value is not true
null != some_value is not true
The two special comparisons IS NULL and IS NOT NULL exist to deal with testing if a column is, or is not, null. No ordinary comparison operator (=, !=, <, and so on) can yield true when either side is null.
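The rules above can be checked directly with a small query. This is only an illustrative sketch: the column aliases are made up for readability, the casts are there just to give each NULL a concrete type, and the comments assume default PostgreSQL settings.

```sql
SELECT
    NULL::int = NULL::int  AS eq_null,      -- NULL (unknown), not true
    NULL::int = 1          AS eq_value,     -- NULL
    NULL::int != 1         AS ne_value,     -- NULL
    NULL::int IS NULL      AS is_null,      -- true
    1 IS NOT NULL          AS is_not_null;  -- true
```

Only the last two columns come back as true, which is why the filter id != 1 on its own silently drops the null rows.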