930 likes | 1.11k Views
Data Warehouse Testing
E N D
Business value Assurance / Advanced DWH (Testing) 1. Challenges faced by the testing team in realtime scenario 2. Challenges faced by the team in differents phases of STLC 3. What tools are available & used for testing DWH at different stages 4. Any automation tool available for DWH 5. Any tool available and used to ensure data quality 6. How it is ensured that the data sample selected ensures completeness 7. How is data reconciliation done 8. How to test bulk data 9. Some information on performance tool and how the result is analyzed
Table Of Contents 1. Challenges faced by the testing team in real-time scenario. 2. Challenges faced by the team in different phases of STLC. 3. What tools are available & used for testing DWH at different stages. 4. Any automation tool available for DWH. 5. Any tool available and used to ensure data quality. 6. How it is ensured that the data sample selected ensures completeness. 7. How is data reconciliation done. 8. How to test bulk data. 9. Some information on performance tool and how the result is analyzed.
Challenges faced by the testing team in real-time scenario. Challenges Faced: Lack of Skilled testers Results: Resulted into incomplete, insufficient and inadequacy of testing that led to spending of lot of effort in finding and reporting the bugs.
Challenges Faced: Lack of availability of standard test data / datasets during testing Results: Lead to insufficient test coverage.
Challenges Faced: The team members had insufficient knowledge of the domain standards Results: Resulted in inadequate testing.
Challenges Faced: Poor understanding of requirements and Miscommunication or no communication with the end-users during testing/development cycles Results: No specifics of what an application should or shouldn't do (the application's requirements) and lead to poor quality of testing.
Challenges Faced: Not recording non-reproducible defects Results: Many times tester came across bugs during random / exploratory testing which appeared on specific configurations and are non-reproducible. This made testing task extremely tedious and time consuming, as many times there would be random hangs in product.
Challenges Faced: Tedious manual verification and testing the complete application Results: Even though this led developers on displaying specific interpretation of results, this has to be done on wide range of datasets and is a repetitive work. Also to test each and every combination was challenging.
Challenges Faced: Interdependencies of components in the Software Results: Since the software was complex with different components, the changes in one part of software often caused breaks in other parts of the software. Pressure to handle the current functionality changes, previous working functionality checks and bug tracking.
Challenges Faced: Testing always under time constraints Results: Often there was a slippage in other phases of the project and thus reduced time for testing as there was a committed end date to customer. It was also observed that the tester could simply focus on task completion and not on the test coverage and quality of work. This testing activity was taken up as last activity in project life cycle and there was always a pressure to squeeze testing in a short time.
Challenges Faced: Test Systems inadequacy & lack of dedicated resources for test team. Under estimating testing efforts in project efforts Results: Testing time was affected because of lack of dedicated test systems given to test team, the testers got assigned to test multiple modules and the developers were finally moved on the testing job. Test engineers were forced to work at odd hours/weekends as the limited resources were in control of the development team and test engineers were given a lower priority during allocation of resources. Testing team was not involved during scoping phase and the testing team’s efforts were typically underestimated. This led to lower quality of testing as sufficient efforts could not be put in for the same.
Challenges Faced: The involvement of test team in entire life cycle is lacking Results: Test engineers were involved late in the life cycle. This limited their contribution only to black box testing. The project team didn’t use the services of the test team for the unit as well as integration testing phases. Due to the involvement testers in the testing phase, the test engineers took time to understand all the requirements of the product, and were overloaded and finally were forced to work many late hours.
Challenges Faced: Problems faced to cope with attrition Results: Few Key employees left the company at very short career intervals. Management faced hard problems to cope with attrition rate. New testers taken into project required project training from the beginning and as this is a complex project it became difficult to understand thus causing delay in release date.
Challenges Faced: Hard or subtle bug remained unnoticed Results: Since there was a lack of skilled testers and domain expertise, some testers concentrated more on finding easy bugs that did not require deep understanding.
Challenges Faced: Lack of relationship with the developers & no documentation accompanying releases provided to test team Results: It is a big challenge. There is no proper documentation accompanying releases provided to the test team. The test engineer is not aware of the known issues, main Features to be tested, etc. Hence a lot of effort is wasted.
Challenges Faced: Problems faced to cope up with scope creep and changes to the functionality. Results: Delays in implementation date because of lot of rework. Since there were dependencies among parts of the project and the frequent changes to be incorporated, resulted many bugs in the software.
Though automated testing has a lot of benefit, but it also has some associated challenges. i. Selection of Test Tool ii. Customization of Tool iii. Selection of Automation Level iv. Development and Verification of Script v. Implementation of Test Management System
Challenges faced by the team in different phases of STLC. Testing the complete application: Is it possible? I think impossible. There are millions of test combinations. It’s not possible to test each and every combination both in manual as well as in automation testing. If you try all these combinations you will never release the product.
Misunderstanding of company processes: Some times you just don’t pay proper attention what the company-defined processes are and these are for what purposes. There are some myths in testers that they should only go with company processes even these processes are not applicable for their current testing scenario. This results in incomplete and inappropriate application testing.
Relationship with developers: Big challenge. Requires very skilled tester to handle this relation positively and even by completing the work in testers way. There are simply hundreds of excuses developers or testers can make when they are not agree with some points. For this tester also requires good communication, troubleshooting and analyzing skill.
Regression testing: When project goes on expanding the regression testing work simply becomes uncontrolled. Pressure to handle the current functionality changes, previous working functionality checks and bug tracking.
Testing always under time constraint: Hey tester, we want to ship this product by this weekend, are you ready for completion? When this order comes from boss, tester simply focuses on task completion and not on the test coverage and quality of work. There is huge list of tasks that you need to complete within specified time. This includes writing, executing, automating and reviewing the test cases.
Which tests to execute first? Then how will you take decision which test cases should be executed and with what priority? Which tests are important over others? This requires good experience to work under pressure.
Understanding the requirements: Some times testers are responsible for communicating with customers for understanding the requirements. What if tester fails to understand the requirements? Will tester be able to test the application properly? Definitely No! Testers require good listening and understanding capabilities.
Decision to stop the testing: When to stop testing? Very difficult decision. Requires core judgment of testing processes and importance of each process. Also requires ‘on the fly’ decision ability.
One test team under multiple projects: Challenging to keep track of each task. Communication challenges. Many times results in failure of one or both the projects.
Reuse of Test scripts: Application development methods are changing rapidly, making it difficult to manage the test tools and test scripts. Test script migration or reuse is very essential but difficult task.
Testers focusing on finding easy bugs: If organization is rewarding testers based on number of bugs (very bad approach to judge testers performance) then some testers only concentrate on finding easy bugs those don’t require deep understanding and testing. A hard or subtle bug remains unnoticed in such testing approach.
To cope with attrition: Increasing salaries and benefits making many employees leave the company at very short career intervals. Managements are facing hard problems to cope with attrition rate. Challenges – New testers require project training from the beginning, complex projects are difficult to understand, delay in shipping date!
Different types of testing are required throughout the life cycle of a DWH implementation. So we have different challenges to face during the different phases of STLC.
ETL (Business Functionality Data Quality Performance) During the ETL phase of DWH implementation, Data quality testing is of utmost importance. Any defect slippage in this phase will be very costly to rectify later. Functional testing need to be carried out to validate the Transformation Logic.
Data Load (Parameters Settings Validation) During the setup of Data Load functionality, specific testing on the load module is carried out. The Parameters and Settings for data load are tested here.
Initial Data Load (Perfomance Data Quality) Initial Data Load is when the underlying databases are loaded for the first time. Performance testing is of significance here. Data Quality, once tested and signed off during the ETL testing phase is re-tested here.
E2E Business Testing (UI & Interface Testing) Once the initial data load is done, the Data warehouse is ready for an end-to-end functional validation. UI testing and Interface testing are carried out during this phase.
Maintenance / Data Feeds (Regression) Data from the operational Database should be input into the Data warehouse periodically. During such periodic updates, regressing testing should be executed. This ensures the new data updates heve not broken any existing functionality. Periodic updates are required to ensure temporal consistency.
What tools are available and used for testing DWH at different stages? ETL software can help you in automating such process of data loading from Operational environment to Data Warehouse environment.
What tools are available and used for testing DWH at different stages?
What tools are available and used for testing DWH at different stages? Create pairs of SQL queries (QueryPairs) and reusable queries (Query Snippets) to embed in queries.
What tools are available and used for testing DWH at different stages? Execute Scenarios that compare Source databases and / or files to Target data warehouses.
What tools are available and used for testing DWH at different stages? Agents execute your queries and return the results to the QuerySurge server for reporting and analysis. Analyze and drill down into your results and identify bad data and data defects with our robust reporting.
Issue: Missing Data Description: Data that does not make it into the target database Possible Causes: By invalid or incorrect lookup table in the transformation logic Bad data from the source database (Needs cleansing) Invalid joins Example(s): Lookup table should contain a field value of “High” which maps to “Critical”. However, Source data field contains “Hig” - missing the h and fails the lookup, resulting in the target data field containing null. If this occurs on a key field, a possible join would be missed and the entire row could fall out.
Issue: Truncation of Data Description: Data being lost by truncation of the data field Possible Causes: Invalid field lengths on target database Transformation logic not taking into account field lengths from source Example(s): Source field value “New Mexico City” is being truncated to “New Mexico C” since the source data field did not have the correct length to capture the entire field.
Issue: Data Type Mismatch Description: Data types not setup correct on target database Possible Causes: Source data field not configured correctly Example(s): Source data field was required to be a date, however, when initially configured, was setup as a VarChar.
Issue: Null Translation Description: Null source values not being transformed to correct target values Possible Causes: Development team did not include the null translation in the transformation logic Example(s): A Source data field for null was supposed to be transformed to ‘None’ in the target data field. However, the logic was not implemented, resulting in the target data field containing null values.
Issue: Wrong Translation Description: Opposite of the Null Translation error. Field should be null but is populated with a non-null value or field should be populated but with wrong value Possible Causes: Development team incorrectly translated the source field for certain values Example(s): Ex. 1) Target field should only be populated when the source field contains certain values, otherwise should be set to null Ex. 2) Target field should be “Odd” if the source value is an odd number but target field is “Even” (This is a very basic example)
Issue: Misplaced Data Description: Source data fields not being transformed to the correct target data field Possible Causes: Development team inadvertently mapped the source data field to the wrong target data field Example(s): A source data field was supposed to be transformed to target data field ‘Last_Name’. However, the development team inadvertently mapped the source data field to ‘First_Name’
Issue: Extra Records Description: Records which should not be in the ETL are included in the ETL Possible Causes: Development team did not include filter in their code Example(s): If a case has the deleted field populated, the case and any data related to the case should not be in any ETL
Issue: Not Enough Records Description: Records which should be in the ETL are not included in the ETL Possible Causes: Development team had a filter in their code which should not have been there Example(s): If a case was in a certain state, it should be ETL’d over to the data warehouse but not the data mart
Issue: Transformation Logic Errors/Holes Description: Testing sometimes can lead to finding “holes” in the transformation logic or realizing the logic is unclear Possible Causes: Development team did not take into account special cases. For example international cities that contain special language specific characters might need to be dealt with in the ETL code Example(s): Ex. 1) Most cases may fall into a certain branch of logic for a transformation but a small subset of cases (sometimes with unusual data) may not fall into any branches. How the testers code and the developers code handle these cases could be different (and possibly both end up being wrong) and the logic is changed to accommodate the cases. Ex. 2) Tester and developer have different interpretation of transformation logic, which results in having different values. This will lead to the logic being re-written to become more clear
Issue: Simple/Small Errors Description: Capitalization, spacing and other small errors Possible Causes: Development team did not add an additional space after a comma for populating the target field. Example(s): Product names on a case should be separated by a comma and then a space but target field only has it separated by a comma