Data Warehouse - Slowly Changing Dimensions with M

Posted 2019-07-02 20:25

Question:

As an example, let's say I have a fact table with two dimensions and one measure.

FactMoney table

    ProjectKey  int
    PersonKey   int
    CashAmount  money

The two dimensions are defined like this:

DimProject (a type 0 dimension, i.e. static)

    ProjectKey   int
    ProjectName  varchar(50)

DimPerson (a type 2 slowly changing dimension)

    PersonKey           int
    PersonNaturalKey    int
    PersonName          varchar(50)
    EffectiveStartDate  datetime
    EffectiveEndDate    datetime
    IsCurrent           bit

Pretty straightforward so far. Now I'll introduce a Person Category concept.

DimCategory

    CategoryKey   int
    CategoryName  varchar(50)

And build an M2M relationship between DimPerson and DimCategory

BridgePersonCategory

    PersonKey    int
    CategoryKey  int

So - people can have 1..n categories.

My problem is this: because Person is a slowly changing dimension, when a person's name changes we add a new person row and update the effective dates and IsCurrent flags. No big deal.

But what do we do with the person's categories? Do we need to add more rows to the bridge table every time a new person version pops up?

And as a corollary, if a person's categories change, does that mean we need to create a new row in the person table?

Answer 1:

About your main question: I would say that you need to add the categories for the new person row to the category table (probably copying them from the old person row). That way you can continue to classify the person in the new (changed) state.

About a change of category: I would prefer not to add a person row, but instead to add a validity start date and an expiration date to the category table. That way each category can be changed independently. But you need to be careful with queries that are not point-in-time, as you could overcount the categories.
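The overcounting risk can be illustrated with a minimal sketch. It assumes the validity dates are stored on the person-category rows; the column names `ValidFrom` and `ValidTo`, the half-open date convention, and both helper functions are hypothetical and not part of the original schema. SQLite is used only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BridgePersonCategory (
    PersonKey   INTEGER,
    CategoryKey INTEGER,
    ValidFrom   TEXT,  -- hypothetical column: first day the assignment holds
    ValidTo     TEXT   -- hypothetical column: first day it no longer holds
);
INSERT INTO BridgePersonCategory VALUES
    (1, 10, '2019-01-01', '2019-06-01'),  -- category 10 until June
    (1, 20, '2019-06-01', '9999-12-31');  -- category 20 from June onwards
""")

def categories_all_time(person_key):
    """No date filter: returns every category the person ever had."""
    rows = conn.execute(
        "SELECT CategoryKey FROM BridgePersonCategory "
        "WHERE PersonKey = ? ORDER BY CategoryKey",
        (person_key,),
    ).fetchall()
    return [r[0] for r in rows]

def categories_as_of(person_key, as_of):
    """Point-in-time filter: only assignments valid on the given ISO date."""
    rows = conn.execute(
        "SELECT CategoryKey FROM BridgePersonCategory "
        "WHERE PersonKey = ? AND ValidFrom <= ? AND ? < ValidTo "
        "ORDER BY CategoryKey",
        (person_key, as_of, as_of),
    ).fetchall()
    return [r[0] for r in rows]

print(categories_all_time(1))             # both assignments, past and present
print(categories_as_of(1, "2019-03-15"))  # only the assignment valid in March
```

Joining facts through the unfiltered bridge would attribute each fact row to both categories, which is the overcount mentioned above; the dated filter avoids it when a reference date is available.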



Answer 2:

"But what do we do with the person's categories? Do we need to add more rows to the bridge table every time a new person version pops up?"

Only if you want to change the person's category, because only the person's name changes, not the person key in the DimPerson table. The relationship between DimPerson and PersonCategory is still valid: when the person's name changes, the person stays in the same category. This means that a person can be in only one category at any point in time. To overcome this, you have to make the junction table (PersonCategory) an SCD2 as well, so that the relationship is made with keys and effective dates.

This also answers your next question: "And as a corollary, if a person's categories change, does that mean we need to create a new row in the person table?"

Please let me know if you need any further clarification.
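One way to read "keys and effective dates" is sketched below. It assumes the junction table carries the person's natural key plus its own effective dates, and that a person version is matched to the assignments whose date range overlaps that version's range. The junction table layout, the half-open date ranges, and the function `categories_for_version` are illustrative assumptions, not something the answer specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimPerson (
    PersonKey INTEGER, PersonNaturalKey INTEGER, PersonName TEXT,
    EffectiveStartDate TEXT, EffectiveEndDate TEXT
);
-- Assumed layout: the junction table is keyed on the natural key and dated.
CREATE TABLE BridgePersonCategory (
    PersonNaturalKey INTEGER, CategoryKey INTEGER,
    EffectiveStartDate TEXT, EffectiveEndDate TEXT
);
INSERT INTO DimPerson VALUES
    (1, 100, 'Ann Smith', '2019-01-01', '2019-07-01'),
    (2, 100, 'Ann Jones', '2019-07-01', '9999-12-31');
INSERT INTO BridgePersonCategory VALUES
    (100, 10, '2019-01-01', '9999-12-31'),  -- held across the rename
    (100, 20, '2019-03-01', '2019-05-01'),  -- held only under the old name
    (100, 30, '2019-08-01', '9999-12-31');  -- added after the rename
""")

def categories_for_version(person_key):
    """Categories whose date range overlaps the given person version."""
    rows = conn.execute(
        """
        SELECT b.CategoryKey
        FROM DimPerson p
        JOIN BridgePersonCategory b
          ON b.PersonNaturalKey = p.PersonNaturalKey
         AND b.EffectiveStartDate < p.EffectiveEndDate
         AND p.EffectiveStartDate < b.EffectiveEndDate
        WHERE p.PersonKey = ?
        ORDER BY b.CategoryKey
        """,
        (person_key,),
    ).fetchall()
    return [r[0] for r in rows]

print(categories_for_version(1))  # categories held while named 'Ann Smith'
print(categories_for_version(2))  # categories held while named 'Ann Jones'
```

With this layout a name change adds no junction rows and a category change adds no person rows, at the cost of a range join in every query.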



Answer 3:

"Because, only the person names changes not the person key in DimPerson Table."

I would say this is not correct if you have SCD2 on DimPerson, as @ScottHerbert said. In that case the PK (I assume it's PersonKey) changes, and only the BK (I assume it's PersonNaturalKey) remains the same. But you would not build the relationship on the business key, because then you lose the history of which person version had which category.

In my opinion, you would have to add a new row to PersonCategory with the new PK from Person and the existing PK from Category.

The same goes for Category: if its name changes, the category gets a new PK, which you have to add to the bridge table.
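The approach in this answer can be sketched as follows: when a new person version is created, its bridge rows are copied from the previous version so that each surrogate key keeps its own set of categories. The function `add_person_version`, the sample data, and the use of '9999-12-31' as an open end date are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimPerson (
    PersonKey INTEGER PRIMARY KEY AUTOINCREMENT,
    PersonNaturalKey INTEGER, PersonName TEXT,
    EffectiveStartDate TEXT, EffectiveEndDate TEXT, IsCurrent INTEGER
);
CREATE TABLE BridgePersonCategory (PersonKey INTEGER, CategoryKey INTEGER);
INSERT INTO DimPerson (PersonNaturalKey, PersonName,
                       EffectiveStartDate, EffectiveEndDate, IsCurrent)
VALUES (100, 'Ann Smith', '2019-01-01', '9999-12-31', 1);
INSERT INTO BridgePersonCategory VALUES (1, 10), (1, 20);
""")

def add_person_version(natural_key, new_name, change_date):
    """Expire the current row, insert a new version, copy its bridge rows."""
    old_key = conn.execute(
        "SELECT PersonKey FROM DimPerson "
        "WHERE PersonNaturalKey = ? AND IsCurrent = 1",
        (natural_key,),
    ).fetchone()[0]
    conn.execute(
        "UPDATE DimPerson SET EffectiveEndDate = ?, IsCurrent = 0 "
        "WHERE PersonKey = ?",
        (change_date, old_key),
    )
    cur = conn.execute(
        "INSERT INTO DimPerson (PersonNaturalKey, PersonName, "
        "EffectiveStartDate, EffectiveEndDate, IsCurrent) "
        "VALUES (?, ?, ?, '9999-12-31', 1)",
        (natural_key, new_name, change_date),
    )
    new_key = cur.lastrowid
    # The new surrogate key inherits the categories of the old version.
    conn.execute(
        "INSERT INTO BridgePersonCategory (PersonKey, CategoryKey) "
        "SELECT ?, CategoryKey FROM BridgePersonCategory WHERE PersonKey = ?",
        (new_key, old_key),
    )
    return new_key

def bridge_categories(person_key):
    rows = conn.execute(
        "SELECT CategoryKey FROM BridgePersonCategory "
        "WHERE PersonKey = ? ORDER BY CategoryKey",
        (person_key,),
    ).fetchall()
    return [r[0] for r in rows]

new_key = add_person_version(100, "Ann Jones", "2019-07-01")
print(new_key, bridge_categories(new_key))
```

The trade-off is that the bridge table grows with every person version, but fact rows pointing at an old PersonKey still resolve to the categories that version had.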