Denormalising a fully normalised table

2019-08-31 00:36发布

问题:

Can anyone advise on how to go about denormalising a fully normalised table? I've not been able to find much on it despite googling.

Whenever a record is updated (though not inserted, but lets not worry about that now) in TableA a record is inserted into HistoryOfTableA with the values of the fields which have changed and an associated timestamp when it was carried out.

eg.

TableA fields:

TableA_Id, FieldA, FieldB, FieldC, FieldD.... etc

HistoryOfTableA records:

HistID, TableA_Id, FieldChanged, OldValue, NewValue, DateCreated
1, 1, 'FieldA', 1, 2, <2013-03-18 12:20:00>
2, 1, 'FieldB', A, B, <2013-03-18 12:20:00>
3, 1, 'FieldC', A, B, <2013-03-18 12:20:00>

The situation is I'm looking to create some SQL for reporting purposes. So I want to be able to specify a point in time, and be able pull together those hostory table entries and work out what the state of that record from TableA was at that time.

I'm thinking I can create tables based on the history table and joined on the date created and the Id from table A

eg.

select HistA.NewValue, 
       HistB.NewValue, 
       .... 
from FieldA_HistoryOfTableA HistA 
     inner join FieldB_HistoryOfTableA HistB on HistA.DateCreated = HistB.DateCreated
       and HistA.TableA_Id = HistB.TableA_Id 
     inner join ... etc
where HistA.FieldChanged = 'FieldA'
and HistB.FieldChanged = 'FieldB'
and .... etc...

But I don't think this is going to give me all that I want and I may not be able to do this purely in SQL. As an aside there are 20 fields in TableA so trying to join 20 tables may not be wise.

回答1:

Your problem is not that the table is normalised, but rather that it isn't.

The data identified by fields in one table is identified by data in another table, which makes it very complicated to query.

A fully normalised version would also store the fields from TableA in a separate table:

TableA
------------
TableA_Id

TableAFields
-------------
TableA_Id
FieldId
Value

Field
---------
FieldId
FieldName

HistoryOfTableA
----------------
TableA_Id
FieldId
OldValue
ChangedDate

Now you can join in the fields from the history table. That will be a bit tricky, but at least it won't be a query with 20 joins.



回答2:

If you want the most recent records from the history table, you can use the following SQL:

select ht.*
from (select ht.*,
             ROW_NUMBER() over (partition by FieldChanged, TableA_Id order by DateCreated desc) as seqnum
      from HistoryOfTableA ht
      where datecreated <= YOURDATEHERE
     ) ht
where seqnum = 1

If you want this as a single record, you should either pivot or do an aggregation. Here is the latter approach:

select ht.TableA_id,
       max(case when FieldName = 'FieldA' then NewValue end) as FieldA,
       . . .
from (select ht.*,
             ROW_NUMBER() over (partition by FieldChanged, TableA_Id order by DateCreated desc) as seqnum
      from HistoryOfTableA ht
      where datecreated <= YOURDATEHERE
     ) ht
where seqnum = 1
group by TableA_id


回答3:

You can try using a XML column in the TableA to record the history of every row in TableA. i.e. Every row has a column which is of type XML and records the history of that row. Then you can pull the rows into your application(in a batched manner if the number of rows are huge), and find out the history of each record in TableA