I have a PostgreSQL database containing various education data such school-level test scores and enrollment figures. I need to separate enrollment from test scores because the data is on different grains. Even though enrollment is on a different granularity from the test-score data, many of the dimensions are the same. For example, I have:
~ ---------------------------------------------------------------------------------~
| Test Scores Fact |
|-------------|-----------|----------|-----------|--------------|------------|-----|
| school_code | test_code | grade_id | gender_id | ethnicity_id | subject_id | ... |
|-------------|-----------|----------|-----------|--------------|------------|-----|
~ --------------------------------------------------------~
| Enrollment Fact |
|-------------|----------|-----------|--------------|-----|
| school_code | grade_id | gender_id | ethnicity_id | ... |
|-------------|----------|-----------|--------------|-----|
This structure is fine on the backend, but in Qlikview, this creates a synthetic key. The solution for synthetic keys seems to usually be replacing it with a link table via Qlikview scripting, which has been my approach as well. But this does not seem to scale, as when I add a third fact table (on yet another grain) that contains more of the same dimensions, if I create another link table, now my two link tables start to associate as they contain several commonly named fields, and Qlikview's response is to create more synthetic keys?
I'm relatively new to Qlikview and am working by myself. How are multiple facts of different grains with common dimensions usually handled?
EDIT:
I've provided my solution to this problem which has been working in a production environment for just under a year! See my answer below...
There are two main strategies to modelling data in QlikView to handle multiple fact tables:
Append your fact tables into one single fact table - usually referred to as a CONCATENATED FACT as QlikView's syntax for appending data to tables is by use of the CONCATENATE prefix (the equivalent of a SQL UNION operation)
Build a link table (what you have done so far) For a majority of implementations, option 1 is the appropriate method. Attributes of a CONCATENATED fact can be summarised as:
Positives:
Negatives:
How to construct Link Tables is a complex subject but relies upon traditional database linking table design techniques. It is easy to go wrong and produce linking tables that may seem to produce the correct results in the front-end but is excessively large, consuming memory and CPU resources.
In my experience, a poorly modelled QlikView data model is the most common culprit for causing poor performance.
I hope this quick, far from exhaustive, introduction to multi-fact modelling in QlikView proves of some help and sets you on the right course.
One of the inputs to your cartesian dimension would be 'N/A' against Subject and Test Code (since that is not in the enrollments table)
So when you measure by 'Gender' the Test Scores match against dimension records with valid Subjects and Test Codes, and Enrollment matches against records with 'N/A' Subjects and Test Codes
Then when you roll up by Gender, everyhing 'just works' nicely.
Seeing the popularity of this question, I'm going to add my actual solution to the mix so people have an example to work from, which for some reason is really hard to find for such a common problem...
I proceeded with creating a Link Table. This solution still to this day feels like a hack as it creates one huge table containing the Cartesian product of every one of your keys in all of your fact tables... but it does work.
The problem: You have multiple fact tables in your database; an occurrence in almost every database ever. Some (or all) of these fact tables share the same key fields; no problem, right? Wrong. Unfortunately, due to Qlik's associative nature, instead of each one of your fact tables linking nicely to their lookup tables, your fact tables are now associating with each other and wreaking havoc on your data model; creating circular references and untold amounts of synthetic keys.
The Solution: Create a Link Table. Sounds simple, right? Well, it is, but it's also very poorly documented and hard to understand without an initial explanation. You might be wondering... what's a Link Table? It is the Cartesian product of all keys from all of your fact tables. How does this correct the problem? It removes all of the unwanted associations between your fact tables as each will now only contain a single unique concatenated key. Those unique keys will associate with the Link Table only, which contains all your unique concatenated keys as well as all the individual keys. The Link Table will subsequently associate with your lookup tables and all will be well.
Implementation:
This implementation will use the two tables contained in my question above;
test_scores_fact
andenrollment_fact
.As you can see, the two fact tables have overlapping keys,
school_code
,grade_id
,gender_id
, andethnicity_id
. In a relational model, each key field has a corresponding table with additional information about the key. This model does not jive with Qlikview's associative nature as Qlikview associates tables based on the name of a field; even when you don't want it to. You do want like named fields to associate to their lookup tables, however you don't want like named fields in your fact tables to associate. Unfortunately you cannot stop this behavior. You must implement a Link Table...In your Qlikview script, create a temporary fact table, which loads in all fields from your database table:
Concatenate your keys and remove all individual keys:
Repeat Steps 1 & 2 for each fact table:
Create your Link Table by concatenating your individual keys into a single table:
Drop your temp tables so they do not appear in your data model:
This will remove all associations between your fact tables as there now exist no common field names between them. Each fact table will link to the link table via the created concatenated key. The link table will then associate with each individual lookup table. Your Qlikview data model will not contain any synthetic keys or circular references.
If you create another fact table in the future, just follow steps 1 & 2 again, and add any new individual keys to the Link Table and also add the new concatenated key to the Link Table as well. It scales with little effort.
Good Luck!
The two quickest ways I can think of:
A) You can just left join the fact table into the corresponding tables that they are used in. You will just need to rename the fields to avoid conflicts with the other tables.
B)You can rename the common fields, which can be done by
QUALIFY
(before you load the fact tables) andUNQUALIFY
(after you load the fact tables)Assuming that the fact tables have unique id field names that can be linked to the main tables , you shouldn't have to rename anything in the main tables
I'd go with B-1, since that seems a little less of a hassle.
EDIT: If you want to create a link table from these, you can concatenate the fact tables into one table where you put all the columns into it (there will be nulls for a lot of the columns, but QlikView is good with nulls).
What i usually do is load the fact tables in and create an id field (either RowNo() or autonumberhash128([list of unique id fieldnames]), then when i load them into a link table, I include that id field in the link table as well. Finally, i drop the all the common fields from the fact tables, so they only exist in the link table.