Top ETL Testing Interview Questions And Answers

  1. What is ETL?

ETL is the process of moving data from a source database into a target data warehouse. It involves three steps: Extract, Transform, and Load. Data is extracted from the source database, transformed into the required format, and then loaded into the target data warehouse.
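
A minimal sketch of the three steps in Python, assuming a hypothetical orders.csv source file and a SQLite table standing in for the warehouse:

```python
import csv
import sqlite3

# Extract: read raw rows from a hypothetical source file (assumption: orders.csv exists).
with open("orders.csv", newline="") as f:
    raw_rows = list(csv.DictReader(f))

# Transform: clean and reshape the rows into the format the warehouse expects.
rows = [
    (r["order_id"], r["customer"].strip().upper(), float(r["amount"]))
    for r in raw_rows
    if r["amount"]  # drop rows with a missing amount
]

# Load: write the transformed rows into the target table (SQLite stands in for the warehouse).
conn = sqlite3.connect("warehouse.db")
conn.execute("CREATE TABLE IF NOT EXISTS fact_orders (order_id TEXT, customer TEXT, amount REAL)")
conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
conn.commit()
conn.close()
```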

  2. What are the types of data warehouse applications, and what is the difference between data mining and data warehousing?

The types of data warehouse applications are:

  • Information Processing
  • Analytical Processing
  • Data Mining

Data mining can be defined as the process of extracting hidden, predictive information from large databases and interpreting it, whereas data warehousing may make use of data mining for faster analytical processing. Data warehousing is the process of aggregating data from multiple sources into one common repository.


  3. What is a fact? What are the types of facts?

A fact is the central component of a multi-dimensional model and contains the measures to be analyzed. Facts are related to dimensions.

The types of facts are (a small example follows the list):

  • Additive Facts
  • Non-additive Facts
  • Semi-additive Facts
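
As a rough illustration (the table and column names are made up), an additive measure such as sales amount can be summed across every dimension, while a semi-additive measure such as an account balance can be summed across accounts but not across time:

```python
import pandas as pd

# Hypothetical fact table: one row per account per day.
fact = pd.DataFrame({
    "day":     ["Mon", "Mon", "Tue", "Tue"],
    "account": ["A",   "B",   "A",   "B"],
    "sales":   [100,   200,   150,   250],   # additive: summing is meaningful over any dimension
    "balance": [1000,  500,   1100,  450],   # semi-additive: sum over accounts is fine, sum over days is not
})

print(fact.groupby("day")["sales"].sum())    # valid: total sales per day
print(fact.groupby("day")["balance"].sum())  # valid: total balance across accounts per day
print(fact["balance"].sum())                 # misleading: balances should not be summed over time
```
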
  4. What are Cubes and OLAP Cubes?

Cubes are data processing units composed of fact tables and dimensions from the data warehouse. They provide multi-dimensional analysis.

OLAP stands for Online Analytical Processing. An OLAP cube stores large amounts of data in multi-dimensional form for reporting purposes. It consists of facts, called measures, categorized by dimensions.
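
A cube-like aggregation can be approximated with a pivot over dimensions; this sketch uses pandas and made-up dimension names:

```python
import pandas as pd

sales = pd.DataFrame({
    "region":  ["East", "East", "West", "West"],
    "product": ["Pens", "Paper", "Pens", "Paper"],
    "year":    [2023, 2023, 2024, 2024],
    "amount":  [120, 80, 200, 150],
})

# Measures (amount) categorized by dimensions (region x product), similar to slicing a cube.
cube = sales.pivot_table(values="amount", index="region", columns="product", aggfunc="sum")
print(cube)
```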

  5. What is a tracing level and what are its types?

Tracing level determines the amount of data written to the log files. It can be classified into two types: Normal and Verbose. The Normal level logs information in a summarized manner, while Verbose logs details for each and every row.
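
The idea is comparable to verbosity levels in general-purpose logging; the Python snippet below is only an analogy, not Informatica's actual tracing configuration:

```python
import logging

logging.basicConfig(level=logging.INFO)   # "normal": summary-level messages only
log = logging.getLogger("etl")

for row_number in range(3):
    # At a verbose setting (level=logging.DEBUG), these per-row messages would also appear in the log.
    log.debug("processed row %d", row_number)

log.info("finished loading 3 rows")       # summary message, visible at the normal level
```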

  6. What is a factless fact schema, and what are Measures?

A fact table without measures is known as a factless fact table. It can be used to count the number of occurring events; for example, it can record an event such as employee count in a company. The numeric data based on columns in a fact table is called Measures.
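
A small sketch of a factless fact table in SQL run through Python's sqlite3 module (the table and column names are assumptions); the table holds only dimension keys, and the "measure" is obtained by counting rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Factless fact table: only foreign keys to dimensions, no numeric measure columns.
conn.execute("CREATE TABLE fact_attendance (employee_id INTEGER, date_id INTEGER)")
conn.executemany(
    "INSERT INTO fact_attendance VALUES (?, ?)",
    [(1, 20240101), (2, 20240101), (1, 20240102)],
)

# The event count is derived by counting rows rather than summing a stored measure.
for row in conn.execute("SELECT date_id, COUNT(*) FROM fact_attendance GROUP BY date_id"):
    print(row)
```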

  7. What is a transformation?

A transformation is a repository object that generates, modifies, or passes data. Transformations are of two types (the difference is sketched after the list):

  • Active
  • Passive
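
An active transformation can change the number of rows passing through it, while a passive one cannot; a rough illustration with made-up data:

```python
rows = [{"id": 1, "amount": 50}, {"id": 2, "amount": 0}, {"id": 3, "amount": 75}]

# Active: a filter may change the number of rows that pass through.
active_output = [r for r in rows if r["amount"] > 0]

# Passive: an expression derives a new column but keeps the row count unchanged.
passive_output = [{**r, "amount_with_tax": r["amount"] * 1.1} for r in rows]

print(len(rows), len(active_output), len(passive_output))  # 3 2 3
```
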
  8. Explain the use of the Lookup Transformation.

The Lookup Transformation is used for the following (a minimal sketch follows the list):

  • Getting a related value from a table using a column value
  • Updating a slowly changing dimension table
  • Verifying whether records already exist in the table
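
Conceptually, a lookup behaves like a left join against a reference table; a minimal pandas sketch with assumed column names:

```python
import pandas as pd

source = pd.DataFrame({"customer_id": [1, 2, 4], "amount": [100, 250, 80]})
lookup = pd.DataFrame({"customer_id": [1, 2, 3], "customer_name": ["Ann", "Bob", "Cid"]})

# Fetch the related value (customer_name) using the column value (customer_id).
enriched = source.merge(lookup, on="customer_id", how="left")

# Rows where the lookup found no match indicate records that do not yet exist in the reference table.
print(enriched)
print(enriched["customer_name"].isna())
```
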
  9. What is the advantage of using the Data Reader Destination Adapter?

The advantage of the Data Reader Destination Adapter is that it populates an ADO recordset in memory and exposes the data from the Data Flow task by implementing the DataReader interface, so that other applications can consume the data.

  10. What are the possible ways to update a table using SSIS (SQL Server Integration Services)?

The possible ways to update a table using SSIS are listed below (a sketch of the staging-table approach follows the list):

  • Use a SQL command
  • Use a cache
  • Use a staging table
  • Use the Script Task
  • Use the full database name for the update if MSSQL is used
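
As one illustration of combining the SQL-command and staging-table approaches (the connection string, table names, and columns are placeholders, and this sketch uses pyodbc from Python rather than SSIS itself, where the same statement would typically sit in an Execute SQL Task):

```python
import pyodbc

# Placeholder connection string for a SQL Server warehouse database.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=dw;Trusted_Connection=yes"
)
cur = conn.cursor()

# Assumes new values were first loaded into a staging table; the target is then updated in one SQL command.
cur.execute("""
    UPDATE t
    SET    t.amount = s.amount
    FROM   dbo.fact_orders AS t
    JOIN   dbo.stg_orders  AS s ON s.order_id = t.order_id
""")
conn.commit()
conn.close()
```
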
  11. What would you do if you had a non-OLEDB (Object Linking and Embedding Database) source for the lookup?

If you have a non-OLEDB source for the lookup, you have to use a cache to load the data and use it as the source.

  12. In what cases do you use dynamic cache and static cache in connected and unconnected transformations?

A dynamic cache is used when you have to update a master table or slowly changing dimensions (SCD). A static cache is used for flat files.
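
As an analogy only (not Informatica's actual cache implementation): a static cache is built once and only read, while a dynamic cache is updated as new rows are inserted into the target, which is what an SCD load needs:

```python
# Static cache: built once from the lookup source and never modified during the run.
static_cache = {1: "Ann", 2: "Bob"}

# Dynamic cache: updated as new keys are inserted into the target table.
dynamic_cache = dict(static_cache)

incoming = [(2, "Bob"), (3, "Cid")]
for key, name in incoming:
    if key not in dynamic_cache:
        # Insert into the target and add the row to the cache so later rows see it.
        dynamic_cache[key] = name

print(static_cache)   # unchanged: {1: 'Ann', 2: 'Bob'}
print(dynamic_cache)  # updated:   {1: 'Ann', 2: 'Bob', 3: 'Cid'}
```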

  13. What is a Bus Schema?

A bus schema is used to identify the common dimensions across the various business processes. It consists of conformed dimensions along with standardized definitions of information.

  14. What is a data source view?

A data source view defines the relational schema that will be used in Analysis Services databases. Dimensions and cubes are created from data source views rather than directly from data source objects.

  15. How can you extract SAP data using Informatica?

You can extract SAP data using Informatica with the PowerConnect option:

  • Install and configure the PowerConnect tool.
  • Import the source into the Source Analyzer. PowerConnect acts as a gateway between Informatica and SAP. The next step is to generate the ABAP code for the mapping; only then can Informatica pull data from SAP.
  • PowerConnect is used to connect to and import sources from external systems.
  16. What is data purging?

Data purging is the process of removing data from a data warehouse. It deletes junk data such as rows with null values or extra spaces.
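
A minimal purge sketch with pandas, assuming a hypothetical frame where junk rows have null values or whitespace-only fields:

```python
import pandas as pd

data = pd.DataFrame({
    "customer": ["Ann", "  ", None, "Bob"],
    "amount":   [100,   50,  None,  75],
})

# Purge rows with null values, then rows whose customer field contains only whitespace.
cleaned = data.dropna()
cleaned = cleaned[cleaned["customer"].str.strip() != ""]
print(cleaned)
```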

  17. What are Schema Objects?

Schema objects are the logical structures that directly refer to the database's data. Schema objects include tables, views, sequences, synonyms, indexes, clusters, functions, packages, and database links.
