ETL Testing


Data ware house is relational database which is subject oriented, integrated, time-variant and nonvolatile collection of data used to support strategic decision making process

Subject Oriented

Data warehouses are designed to help you analyze data. For example, to learn more about your company's sales data, you can build a warehouse that concentrates on sales. Using this warehouse, you can answer questions like "Who was our best customer for this item last year?" This ability to define a data warehouse by subject matter, sales in this case, makes the data warehouse subject oriented.

Integrated

Integration is closely related to subject orientation. Data warehouses must put data from disparate sources into a consistent format. They must resolve such problems as naming conflicts and inconsistencies among units of measure. When they achieve this, they are said to be integrated.

Nonvolatile

Nonvolatile means that, once entered into the warehouse, data should not change. This is logical because the purpose of a warehouse is to enable you to analyze what has occurred.

Time Variant

In order to discover trends in business, analysts need large amounts of data. This is very much in contrast to online transaction processing (OLTP) systems, where performance requirements demand that historical data be moved to an archive. A data warehouse's focus on change over time is what is meant by the term time variant.

ETL Architecture:














The above snapshot shows the pictorial representation of how a typical Data warehouse Architecture looks like.

ETL basically stands for Extract Transform Load, which simply implies the process where you extract data from Source Tables, transform them in to the desired format based on certain rules and finally load them onto Target tables


Extraction:
Extraction is the process of extracting data from a heterogeneous data source system for further use in a data warehouse environment. This is the first step of the ETL process. After the extraction, this data can be transformed based on the business requirements and loaded into the data warehouse.

There are different mechanisms to extract the data from source systems like Full Extract and Incremental Extract.


Transformation:
The transform stage applies a series of rules or transformations, to the extracted data from the source to derive the data for loading into the target.


Loading:
Once the data got transformed to the business requirement, the transformed data will be loaded into the target systems.

Now as we understood on what we meant by ETL as a process, we should also know on, different tools available in the market to perform ETL process.


Scope of ETL Testing

There is an exponentially increasing cost associated with finding software defects later in the development lifecycle. In data warehousing, this is compounded because of the additional business costs of using incorrect data to make critical business decisions. Given the importance of early detection of software defects, let's first review some general goals of testing an ETL application:
  1. Data completeness
  2. Data transformation
  3. Data quality
  4. Performance and scalability
  5. Integration testing
  6. User-acceptance testing
  7. Regression testing

ETL TESTING LIFE CYCLE


















    The ETL Test life cycle is not much when compared to the other testing life cycles. 

    Pre-requisites for ETL Testing

    We can start ETL testing when we have the below mentioned are available:
    • Access to the required folders in the network.
    • Implementation checklist for move from development to test.
    • Production configuration and data available.
    • Test plan is made based on the inputs from test estimation and business requirement.
    • Test cases and test scenarios from all the available inputs are designed.
    • All unit testing completed and summarized.
    • Test data available in test environment.
    • Migration to the test environment from the development environment

    ETL TEST PLAN:

    We all know that a test life cycle should undergo different stages as showed in the picture above.

    There is no difference in developing the test plan for ETL Testing and conventional testing test plan. The additional items which we have to include in ETL testing are Mapping Names, Session Names, Workflow names, Workflow logs, Parameter Files, SQL files, Source/Target File names, etc. in the Test Plan document. Mention the risks related to data loss, bad data truncation in target db.

    ETL TESTING CHALLENGES

    ETL testing is quite different from conventional testing. There are many challenges we faced while performing data warehouse testing. Here is the list of few ETL testing challenges I experienced on my project:

    1. Incompatible and duplicate data
    2. Loss of data during ETL process
    3. Testers have no privileges to execute ETL jobs by their own
    4. Volume and complexity of data is very huge
    5. Fault in business process and procedures
    6. Trouble acquiring and building test data

    Data is important for businesses to make the critical business decisions. ETL testing plays a significant role validating and ensuring that the business information is exact, consistent and reliable. Also, it minimizes hazard of data loss in production.


    ETL TESTING TOOLS

    We can perform DWH testing using ETL testing tools available in the market. Few of the ETL testing tools are Informatica DVO, Query Surge etc.

    For Informatica DVO details please refer to Informatica DVO section in this blog.




    Comments

    1. I got a good answer from the above description,but it still requires some more update to be made. Please share more content on MSBI Online Training

      ReplyDelete
    2. Great information...Your post the very informative i have learned some information about your blog thank you for Sharing the great information...
      software testing course in pune

      ReplyDelete
    3. This comment has been removed by the author.

      ReplyDelete

    Post a Comment

    Popular posts from this blog

    Data Migration Testing Startegy

    Informatica DVO