Sunday, August 30, 2020

BI/DW Best Practice #0 - What may happen if we don't follow the best practices

BI Best practices guide us how to develop BI projects by introducing some principles, recommended methodologies and suggestions. By following them, we can avoid being trapped by some common troubles. Before we start introducing some practices, I would like to tell you some common troubles first. I classified them into three major categories using my words. They are the Performance Trap, Change Trap and Information Trap. 


Performance Traps

They are those performance issues faced by BI projects due to poor design or the challenging data volume and business processes. They include the slow user queries, lengthy downtime and high latency. They are usually come from a poorly designed data model (schema), low performance ETL processes (processes) and poor maintenance of the data warehouse (index; purging and archiving).

A well designed data model and architecture should be able to avoid troubling user queries. The joining between two (or more) big tables is a typical troubling query which can't enjoy the high performance hash join. Proper dimension data modeling techniques should handle most of such cases. De-normalization, star schema, degenerated dimension, fast changing dimension, derived fact tables and clustered DW are some of the available techniques that may help avoiding such troubling user queries. 

The low performance ETL processes usually come from the overhead in the ETL processes. The unnecessary reprocessing of loaded data can be the major reason. We always recommend delta load instead. Other reasons may be the unnecessary db constraints/indexes/triggers, unnecessary sorting operations and improper CDC and DML methodologies.

Poor maintenance are usually the index fragmentation, outdated index statistics and lack of regular housekeeping.

Change Traps

They are those challenges when we feel difficulties in enhancing the BI/DW solution. The solution may work very well at some point of time. However, people find it difficult to change per their need.

The market, business and technologies keep change. They are not avoidable. We need a solution capable to change. It means that it can be changed without paying painful efforts.

The changes may come in different ways. The basic changes are the data content. They include data patching or reloading to fix wrong data. How your data model designed to be easily patched? Are your ETL designed rerunnable to handle different situations? The more advanced change can be the data structure. The reasons include the source system upgrade/revamp, introducing new business processes which mean new dimensions and facts, and the change of business rules (formulas, association key, etc.). 

Change is common. The BI solution may get trouble after one of its source system revamped. Normally, we can modify the interface and everything should work well. However, if it is a poorly designed BI solution, it can be too complicated to make such a modification accordingly. Instead, people may build some wrapper layer in the middle to handle the change. Other people may even use some manual processes to act as such a wrapper layer. They can be good interim solution. But they even make the problem more complicated if they last for long term.

Information Traps

It happens when users are not able to make valuable business decision from the information provided by the BI solution. They are typically due to data quality issues, fail to achieve desired outcome, out of alignment issues or user adaption issues. Either users don't want to use the system, or the outcome are useless (wrong), or both.

Data quality issues usually happens when lacking of data governance in the organization. Without quality data source, the resultant analysis becomes not reliable.

The reason of fail to achieve desired outcome can be come from different stage of project, from requirement analysis to testing. Typically a communication issues in the requirement analysis or lack of proper testing. Finally, the reports become useless due to wrong business rules applied.

Out of alignment issues happens when lacking a standard or alignment of the business terminologies used in the BI solution. Users finally interpret the report differently between different business units due to the different understanding of the business terminologies. (e.g. Does "Profit" before the deduction of tax and interest expense?)

User adaption issues happen when users feel challenges in moving to the new BI solution. Users become avoidance to use it and leave it idle. It is usually due to release too much reports in a too short period without proper user training. 

Conclusion

Of course, there are other challenges such as cost. But I am not planning to go through them here but just focus in these three categories. I think that they are the most most related to the best practices that I am going to talk about. 

No comments:

Post a Comment

Extract: Performance Tips

Below listed some common performance tips for extract queries. Extract required columns only and specify in the query, avoid select * Extrac...