People who have read C. J. Date's An Introduction to Database Systems, or books of a similar level, should have no problem with the definitions of normalization and denormalization. However, memory is not what it used to be, and I often find myself looking at a design and saying it is not normalized, even though I cannot pin down which of the normal forms it breaks.
Here is the actual example that illustrates this:
We have relations r1 (A, B, C) and r2 (A, D), with FDs AB -> C and A -> D. r1 represents detailed data, while r2 is a summary of that data (in other words, each value of D is a function of the values in r1; in this example, let D be the subtotal of the C values in r1, grouped by A).
Example instance
r1 =
A B C
1 1 10
1 2 20
2 1 10
2 2 25
r2 =
A D
1 30
2 35
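Just to make the dependence explicit, here is a minimal sketch (my own illustration in Python, not from any of the sources quoted; the name derive_r2 and the tuple representation are only assumptions for the example) that recomputes r2 from the stored r1:

    from collections import defaultdict

    # r1 as a list of (A, B, C) tuples, matching the instance above
    r1 = [(1, 1, 10), (1, 2, 20), (2, 1, 10), (2, 2, 25)]

    def derive_r2(rows):
        """Recompute r2 as the subtotal of C grouped by A."""
        totals = defaultdict(int)
        for a, _b, c in rows:
            totals[a] += c
        return sorted(totals.items())   # list of (A, D) tuples

    print(derive_r2(r1))   # [(1, 30), (2, 35)] -- exactly the stored r2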
So even though I cannot say that it breaks, for example, 2NF or 3NF, I seem to be stuck on the idea that the design is still denormalized in the following sense (from Codd, E.F., "Further Normalization of the Data Base Relational Model", p. 34, commenting on the reasons to normalize beyond 1NF):
- To free the collection of relations from undesirable insertion, update and deletion dependencies;
- To reduce the need for restructuring the collection of relations as new types of data are introduced, and thus increase the life span of application programs;
- To make the relational model more informative to users;
- To make the collection of relations neutral to the query statistics, where these statistics are liable to change as time goes by.
If we define D as the sum of all Cs from r1 where A in r1 equals A in r2, then updating C in r1 without also updating D in r2 leaves the data in an inconsistent state. That undesirable update dependency is my reason to call r1 and r2 denormalized and to think of them that way. (In fact, the whole of r2 is a function of r1 and brings zero new facts into the model: r2 = f(r1).)
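To make the anomaly concrete, here is a second small, self-contained sketch (again my own illustration; the function name f and the tuple representation are assumptions). One C value in r1 is updated while r2 is left alone, and the stored summary no longer equals f(r1):

    from collections import defaultdict

    def f(rows):
        """The defining function: D = sum of C per A."""
        totals = defaultdict(int)
        for a, _b, c in rows:
            totals[a] += c
        return sorted(totals.items())

    r1 = [(1, 1, 10), (1, 2, 20), (2, 1, 10), (2, 2, 25)]
    r2 = [(1, 30), (2, 35)]        # stored summary, consistent so far

    r1[0] = (1, 1, 15)             # update C in r1, "forget" to update r2
    print(f(r1) == sorted(r2))     # False: r2 is now stale / inconsistent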
So the questions are:
- can we call r1 and r2 denormalized?
- if yes, why? if not, why not? (according to which rule, or which definition?)
NOTE:
To those who find the question(s) interesting enough to post an answer, I kindly ask that you either provide something quotable or put it in the form of specific assumptions and conclusions (in other words, if you are going to give your opinion, please back it up with some reasoning).
EDIT: I accepted dportas's answer. I'll try to add a bit to it here. C. J. Date makes a clear and strict distinction:
Much of design theory has to do with reducing redundancy; normalization reduces redundancy within relvars, orthogonality reduces it across relvars.
quoted from Database in Depth: Relational Theory for Practitioners,
and, on the next page:
just as a failure to normalize all the way implies redundancy and can lead to certain anomalies, so too can a failure to adhere to orthogonality.