There are several different source systems. So, we may have errors from different reasons. We may have three options to handle. Halt, Proceed but reject failed records and Proceed with notification.
Reject records is one of the useful error handling. We simply reject and notify the users. So that they can fix and resend those data. At the same time, other good records can be proceeded. In this approach, a nice reload mechanism should be available to reload the fixed records. Technically, it looks good. However, you still need to estimate how long and how often the data will be fixed in source system.
The last common method is to proceed the data anyway. We may mark them with quality issue in some way (e.g. error reports, audit dimension, etc.). "Augment" is one of the major techniques to proceed error records.
Sometime, wrong data may be found after they have been loaded for few weeks. A flexible ETL processes (runnable for any period, portions, etc.) will be very useful. So, we can use the same set of logic to reload historical data.
No comments:
Post a Comment