We sometime hear the argument of using surrogate key or business key. There are number of reasons why we use surrogate key. I am not going through them one by one here. I would just emphasize my key reason here - to make the model capable to changes.
For example, if the business key of a dimension changed (by adding one more column) per the business change, it will be a disaster if we are using business key. The following are likely our task list
- [Easy] Add column to one dimension table
- [Easy] Existing data patch to this new column in the dimension table
- [Slightly harder] Add column to many fact tables to include this new foreign key column
- [Slightly harder] Data patch to those new columns in fact tables
- [Slightly harder] change several ETL jobs for the key lookup logic
- [Slightly harder] update the semantic data in BI tools for the modified table association
- [Disaster] update the SQL of hundreds of existing static reports.
However, if we are using surrogate key, our task list is just
- [Easy] Add column to one dimension table
- [Easy] Data patch to this new column in the dimension table
- [Slightly harder] change several ETL jobs for the key lookup logic
No comments:
Post a Comment