Updating dimension tables using SQL Server (BIDs o

2019-09-02 08:15发布

问题:

I'm quite confused as to how I'm supposed to be adding dimension members to my data warehouse. Let's say that TOWN_NAME is a dimension table that links town_Id to a town_name. So, now, I have 1000 customer names, and they are from 9 towns. Suddenly, in my next ETL process, a customer ends up being added whose town is not amongst that 9 towns i have in my dimension. So I need to add a member to my dimension table. Which step/process in BIDS or DATA TOOLS (BIDS 2012) would have I to use? How should this be one? I'm quite lost as to what could be done.

回答1:

The usual pattern - regardless of what tools you're using to populate your data warehouse - is to populate your Dimension before you populate your Fact, precisely to avoid this problem.

The usual way to do things is to have a package which pulls out your Dimension data from your source system(s), and then load any new rows into your Dimension table. Then, when your Fact table load happens later in the process you look-up the ID column from the Dimension using the town name. Your Fact data should then be loaded into the Fact table with the ID for the relevant town as one of its column values.

Specifically, in SSIS, you can manage this by creating a package which does your Dimension table load, and another package in the same project which does your Fact table load. Then you can control the order these happen in a couple of different ways:

  • You can create a third package which uses two Execute Package tasks to call the Dimension package first, and then the Fact package.
  • You could create a SQL Server Agent Job which first calls the Dimension package, then calls the Fact package.

If you want to be able to run everything from within Visual Studio in order in one go, then go with the first option.