SCD Type 6, a Combination of SCD Type 1, 2 and 3
In couple of our previous articles, we discussed how to design and implement SCD Type1, Type 2 and Type 3. We always can not fulfill all the business requirements just by these basic SCD Types. So here lets see what is SCD Type 6 and what it offers beyond the basic SCD Types.
What is SCD Type 6
Its a hybrid approach, a combination of all three basic SCD techniques, hence its named 1+2+3 = 6. Type 6 is particularly applicable if you want to maintain complete history like Type 2 and would also like to have an easy way to effect on current version like Type 3.
Slowly Changing Dimension Series
Part I : SCD Type 1.
Part II : SCD Type 2.
Part III : SCD Type 3.
Part IV : SCD Type 4.
Part V : SCD Type 6.
Part I : SCD Type 1.
Part II : SCD Type 2.
Part III : SCD Type 3.
Part IV : SCD Type 4.
Part V : SCD Type 6.
- Type 2 : To track the historical changes as they occur.
- Type 3 : To include the "Current" attribute.
- Type 1 : Type 3 attribute is updated as Type 1 attribute.
Lets see consider Customer Dimension, which is modeled as Type 6. This dimension will have two INCOME GROUP attributes, one to capture current and other to capture the historical changes. Initially the customer is in LOW income group, so both income attributes have the same values.
Below image shows the initial data in Customer Dimension.
Below image shows the initial data in Customer Dimension.
When the customer's income group is changed, a new dimension record is added to keep the historical changes for the customer. In the new dimension record current income group will be identical to income group. For all previous instances of customer dimension rows, the current income attribute will be overwritten to reflect the latest value.
Subsequent changes are also treated the same.
When to Use SCD Type 6
Data warehousing teams are quite often asked to preserve historical attributes, while also supporting the ability to report historical performance data according to current attribute values. SCD Type 6 is the answer for such requirements.
If you want to see the historical facts based on the current attribute values, we will filter or summarize on the current attributes. If we summarize on the historical attribute, we will see facts as they sum up at a point of time.
Here is our Customer Dimension example, we can use INCOME_GRP column to see the facts values at a point of time and CURR_INCOME_GRP to get the historical fact values based on the current attribute value.
Here is our Customer Dimension example, we can use INCOME_GRP column to see the facts values at a point of time and CURR_INCOME_GRP to get the historical fact values based on the current attribute value.