Data Warehouse Testing
Today’s dynamic business environment demands advanced business intelligence and analytical solutions for quick, effective decision making. Validating these high-volume, high-complexity Data Warehouse (DW) and Business Intelligence (BI) systems requires a continuous and comprehensive testing framework. Testing the BI/DW system is critical not only before the move to production but also on an ongoing basis: continuous DWH validation ensures that decisions are not being made on top of broken systems.
Marlabs provides end-to-end DW/BI testing services. Our testing focuses on validating data completeness and data correctness in addition to validating the ETL process itself. Our optimization and automation techniques help clients accelerate the overall testing cycle.
Marlabs testing approach includes the following strategies:
- Business and functional testing
- Audit report/data quality report
- Test optimization using statistical techniques
- Automation of ETL data validation
The above strategies, which are our key differentiators, help achieve:
- Exhaustive validation and test coverage
- Thorough validation of data quality in the DWH marts before movement to production
- Continuous validation of ETL through the lifetime of the DWH
- Test effort reduction in regression and continuous data validation
ETL Validation Framework
The ETL Validation framework is an automation framework for validating the correctness and completeness of data for measures across dimensions. The solution builds a metadata repository containing the source and target tables, columns, joins, and transformations, and derives attribute and measure validation test scenarios from it. It checks the correctness and completeness of data in the target by comparing it against source data after applying the required transformations, joins, and filters. The framework enables:
- Increased quality of data in production and accurate BI reports
- Periodic validation of ETL processes (weekend runs)
- Daily validation of business critical data
- Checkpoint validation of larger sample sets
- Regression testing, productivity improvement
- Validation of data migrations
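The core comparison can be sketched in a few lines of Python. This is a minimal illustration only: the table contents, the region-rate join, and the revenue transformation are hypothetical stand-ins for whatever the metadata repository describes.

```python
# Minimal sketch of metadata-driven ETL validation (illustrative only:
# the columns, the join, and the transformation below are hypothetical).
# Idea: rebuild the expected target rows from source data by applying the
# documented join/filter/transformation, then diff against the actual target.

def expected_target(source_rows, rate_by_region):
    """Apply the hypothetical transformation: filter inactive rows,
    join a rate per region, derive revenue = units * rate."""
    out = []
    for row in source_rows:
        if not row["active"]:                     # filter
            continue
        rate = rate_by_region[row["region"]]      # join on region
        out.append({"id": row["id"],
                    "revenue": row["units"] * rate})  # transformation
    return sorted(out, key=lambda r: r["id"])

def validate(source_rows, rate_by_region, target_rows):
    """Completeness: no missing/extra rows. Correctness: derived values match."""
    expected = expected_target(source_rows, rate_by_region)
    actual = sorted(target_rows, key=lambda r: r["id"])
    missing = [r for r in expected if r not in actual]
    extra = [r for r in actual if r not in expected]
    return {"missing": missing, "extra": extra}

source = [{"id": 1, "region": "US", "units": 10, "active": True},
          {"id": 2, "region": "EU", "units": 5, "active": False},
          {"id": 3, "region": "EU", "units": 4, "active": True}]
rates = {"US": 2.0, "EU": 3.0}
target = [{"id": 1, "revenue": 20.0}, {"id": 3, "revenue": 12.0}]

result = validate(source, rates, target)
print(result)  # empty lists -> target is complete and correct
```

A production framework would generate the expected-value query from the metadata repository rather than hand-code it, but the diff step is the same: any row in `missing` or `extra` is a completeness or correctness defect.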
Outlier Detection
- Identifies outliers at the data-load stage, before the ETL validation process
- Identifies the raw data causing the issue
- Based on the Exponentially Weighted Moving Average (EWMA) and Mahalanobis Distance methods
- Surfaces data quality issues at an earlier stage; there is no need to wait for BI reports to reveal anomalies
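The two screening methods named above can be sketched as follows. The thresholds, smoothing factor, and sample data are illustrative assumptions, not the framework's actual parameters: EWMA flags a daily load metric that deviates sharply from its smoothed history, and the Mahalanobis distance scores how far a multi-column record sits from the bulk of the data.

```python
import math

def ewma_outliers(values, alpha=0.3, k=3.0):
    """Flag points whose residual from the EWMA exceeds k times the
    RMS of earlier residuals (alpha and k are assumed parameters)."""
    flagged = []
    s = values[0]                # initial smoothed value
    residuals = []               # residuals of earlier points only
    for i, x in enumerate(values[1:], start=1):
        resid = x - s
        if residuals:
            sigma = (sum(r * r for r in residuals) / len(residuals)) ** 0.5
            if sigma > 0 and abs(resid) > k * sigma:
                flagged.append(i)
        residuals.append(resid)
        s = alpha * x + (1 - alpha) * s   # EWMA update
    return flagged

def mahalanobis_2d(point, data):
    """Mahalanobis distance of a 2-D point from a data cloud,
    inverting the 2x2 sample covariance matrix directly."""
    n = len(data)
    mx = sum(p[0] for p in data) / n
    my = sum(p[1] for p in data) / n
    sxx = sum((p[0] - mx) ** 2 for p in data) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in data) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in data) / (n - 1)
    det = sxx * syy - sxy * sxy           # covariance determinant
    dx, dy = point[0] - mx, point[1] - my
    # (dx, dy) * inverse(cov) * (dx, dy)^T, expanded for the 2x2 case
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

loads = [100, 102, 99, 101, 100, 250, 101, 100]   # hypothetical daily row counts
print(ewma_outliers(loads))   # -> [5], the spike is flagged
```

Because both checks run on raw load metrics, an anomalous file or feed is caught before the ETL validation process, and the offending records can be traced back by their distance scores.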
Quality Audit Reports
- Involves comparing predefined audit items between source and target
- Audit items include statistical aggregates such as the mean, median, and standard deviation of each table column
- Displays results through SSRS reports
- Identifies data quality issues
- Saves manual effort in regression testing and data migration
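The aggregate comparison at the heart of such an audit report can be sketched with the Python standard library. The column names, sample values, and tolerance here are assumptions for illustration:

```python
import statistics

# Illustrative audit-report check: compare per-column statistical
# aggregates between source and target tables.
AGGREGATES = {
    "mean": statistics.mean,
    "median": statistics.median,
    "std_dev": statistics.stdev,
}

def audit_column(source_col, target_col, tol=1e-9):
    """Return the audit items whose source and target values disagree."""
    mismatches = {}
    for name, fn in AGGREGATES.items():
        s, t = fn(source_col), fn(target_col)
        if abs(s - t) > tol:
            mismatches[name] = (s, t)
    return mismatches

source = {"amount": [10.0, 12.0, 11.0, 13.0], "qty": [1, 2, 3, 4]}
# Hypothetical drift: one "amount" value was corrupted during the load.
target = {"amount": [10.0, 12.0, 11.0, 19.0], "qty": [1, 2, 3, 4]}

report = {col: audit_column(source[col], target[col]) for col in source}
print(report["qty"])               # {} -> qty matches on every audit item
print("mean" in report["amount"])  # True -> amount is flagged
```

Note that a single corrupted value can shift the mean and standard deviation while leaving the median untouched, which is why the audit compares several aggregates per column rather than one. The resulting mismatch report is what would feed the SSRS output.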
For a client in the education domain, Marlabs validated the completeness and audit-level correctness of data points from the source systems through to the data mart.