
How much do CHECK constraints cost in Postgres 9.x?

Posted 2019-04-16 05:40

Question:

I'd like to know if there are any benchmarks comparing the cost of adding check constraints on a table of 60 columns, where I'd like to add a "not empty" check constraint on 20 columns and a NOT NULL constraint on 6 columns.

My case is that my table contains both empty values and NULL values (which, in my case, always mean "no data").

I'd like to unify those two representations into just one.

That's why I'm thinking of adding "not empty" check constraints on the columns, because from what I have read NULL values are not as heavy (in byte size) as empty values (and they respect the real meaning of "no data").

But on the other hand a NOT NULL constraint works at a deeper level than a check constraint, so maybe it performs better...
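For example, on a hypothetical table my_table with a text column col1 (names invented only for illustration), the two options I'm weighing would look roughly like this:

    -- Option A: keep NULL as the single "no data" value and forbid empty strings
    UPDATE my_table SET col1 = NULLIF(col1, '') WHERE col1 = '';
    ALTER TABLE my_table
        ADD CONSTRAINT col1_not_empty CHECK (col1 <> '');
    -- (NULL still passes: the expression evaluates to NULL, which CHECK does not treat as a violation)

    -- Option B: forbid NULL with a column-level NOT NULL constraint
    ALTER TABLE my_table
        ALTER COLUMN col1 SET NOT NULL;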

While making that change, this question came up: how much do check constraints cost in Postgres 9.x for INSERT and UPDATE operations?

Because if they cost too much, maybe it's better to keep both kinds of values and unify them at retrieval time with a coalesce function...
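The retrieval-time alternative I have in mind would be something like this (same invented names):

    -- Treat '' and NULL the same at query time instead of constraining writes
    SELECT coalesce(NULLIF(col1, ''), 'no data') AS col1_unified
    FROM   my_table;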

What are your experiences or benchmarks?

Answer 1:

Some people try to avoid NULL values, claiming the logic would be confusing.

I am not one of them. NULL values are just fine for columns with no data. They are certainly the cheapest way to store "empty" columns - for disk space as well as performance (the main effect being smaller tables and indices):
Does not using NULL in PostgreSQL still use a NULL bitmap in the header?
Does setting "NOT NULL" on a column in postgresql increase performance?
Do nullable columns occupy additional space in PostgreSQL?
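If you want to see the effect on your own installation, a quick (if crude) check compares the size of a row value holding NULL with one holding an empty string; exact numbers depend on column layout and the NULL bitmap, as explained in the linked answers:

    SELECT pg_column_size(ROW(NULL::text)) AS size_with_null
         , pg_column_size(ROW(''::text))   AS size_with_empty_string;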

Once you understand the nature of NULL values, there is no reason to avoid them. Postgres offers a variety of functions to deal with NULL values: coalesce(), nullif(), concat(), concat_ws(), ...
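A few of them side by side, just to illustrate:

    SELECT coalesce(NULL, 'fallback')        AS coalesce_demo    -- first non-NULL argument
         , nullif('', '')                    AS nullif_demo      -- NULL if both arguments are equal
         , concat('a', NULL, 'b')            AS concat_demo      -- ignores NULL arguments
         , concat_ws('-', 'a', NULL, 'b')    AS concat_ws_demo;  -- like concat, with a separator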

Generally, as far as performance is concerned, a NOT NULL constraint beats a CHECK constraint and both beat triggers by a long shot. But even simple triggers are cheap. The cost of a NOT NULL constraint is next to nothing. Also, all of these only affect write operations, but in most applications read operations dominate.

The most relevant impact on performance (sub-optimal indices and queries aside) therefore is the size of tables and indices or, more importantly, the number of tuples per data page. Bigger tuples lead to slower performance for most use cases. The number of data pages that have to be read to satisfy a query increases accordingly. Available cache memory is saturated earlier.
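To get a rough idea of tuple density for one of your tables, the planner's statistics are good enough (replace my_table with your table name):

    SELECT relpages
         , reltuples
         , round(reltuples / nullif(relpages, 0)) AS approx_tuples_per_page
    FROM   pg_class
    WHERE  relname = 'my_table';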

I don't have a benchmark ready, but it's best to test for your particular environment anyway. These are just simple rules of thumb. Reality is a lot more complex.
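If you want to roll your own test, something like the following in psql with \timing on gives a ballpark figure; just a sketch with throwaway tables, not a rigorous benchmark:

    -- same data, three constraint variants
    CREATE TEMP TABLE t_plain    (val text);
    CREATE TEMP TABLE t_check    (val text CHECK (val <> ''));
    CREATE TEMP TABLE t_not_null (val text NOT NULL);

    INSERT INTO t_plain    SELECT 'x' || g FROM generate_series(1, 1000000) g;
    INSERT INTO t_check    SELECT 'x' || g FROM generate_series(1, 1000000) g;
    INSERT INTO t_not_null SELECT 'x' || g FROM generate_series(1, 1000000) g;
    -- compare the \timing output of the three INSERT statements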