Implementing SCD Type 2 in Informatica Cloud (IICS) Using Inbuilt Template Mapping

Implementing SCD Type 2 in Informatica Cloud (IICS) Using Inbuilt Template Mapping

Introduction

In today’s data-driven world, maintaining the historical accuracy of data is crucial for effective business analytics. Slowly Changing Dimension (SCD) Type 2 is one of the most commonly used methods for tracking changes in dimension tables over time. In this blog post, I’ll walk you through the process of implementing SCD Type 2 in Informatica Intelligent Cloud Services (IICS) using the inbuilt template mapping. We’ll also explore how to test the implementation with sample data.

Understanding SCD Type 2

SCD Type 2 allows us to preserve the history of data by creating multiple records for each entity, with separate rows for each change. This enables us to track changes to key attributes over time, providing a comprehensive view of data evolution.

Tables Used in the Implementation

To demonstrate the SCD Type 2 implementation, we use two tables:

  1. src_customer: This is the source table where customer data is stored.
CREATE TABLE raw.src_customer (
    customer_id INT,
    customer_name STRING,
    customer_address STRING,
    order_id INT,
    create_date DATE,
    modift_date DATE
);
  1. dim_customer_scd2: This is the dimension table that follows the SCD Type 2 design. It stores the historical data along with the current data.
CREATE TABLE raw.dim_customer_scd2 (
    id INT,
    customer_id INT,
    customer_name STRING,
    customer_address STRING,
    order_id INT,
    crc_num INT,
    current_flag STRING,
    valid_from DATE,
    valid_to DATE
);
  • crc_num: Cyclic Redundancy Check (CRC) number used to detect changes in the data.
  • current_flag: A flag to indicate the current record (Y for current, N for historical).
  • valid_from and valid_to: Date fields to store the period during which a record was valid.

Steps to Implement SCD Type 2 in IICS

  1. Create the Source and Target Tables: As defined above, create the src_customer and dim_customer_scd2 tables in your database.
  2. Populate the Source Table: Insert sample data into the src_customer table. This data will be used to test the SCD Type 2 implementation.
INSERT INTO raw.src_customer (customer_id, customer_name, customer_address, order_id, create_date, modift_date) VALUES
(101, 'John Doe', '123 Elm St',1001,'2024-08-30','2024-08-30'),
(102, 'Jane Smith', '456 Oak St',1002,'2024-08-30','2024-08-30'),
(103, 'Alice Brown', '789 Pine St',1003,'2024-08-30','2024-08-30');

3.Configure the IICS Mapping:

  • Use the inbuilt template mapping in IICS for SCD Type 2. The mapping should include the following:
    • Source Transformation: Reading data from the src_customer table.
    • Expression Transformation: To calculate the CRC and other necessary fields.
    • Lookup Transformation: To check if the record exists in the dim_customer_scd2 table.
    • Router Transformation: To route new and updated records accordingly.
    • Update Strategy Transformation: To insert new records and update existing records with historical data.
    • Target Transformation: To load data into the dim_customer_scd2 table.

4. Test the Implementation: Once the mapping is configured, run the workflow and verify the results by querying the dim_customer_scd2 table.

Testing the SCD Type 2 Implementation

Let’s test the implementation by inserting new customers and updating existing ones in the src_customer table.

  1. Insert New Customers:
INSERT INTO raw.src_customer (customer_id, customer_name, customer_address, order_id, create_date, modift_date) VALUES
(104, 'Alice Green', '321 Cedar St',1004,'2024-08-31','2024-08-31'),
(105, 'Alex', '234 London',1005,'2024-08-31','2024-08-31');

2. Update an Existing Customer:

UPDATE raw.src_customer SET customer_address = '999 Maple St', modift_date = '2024-08-31' WHERE customer_id = 102;

After running the updated data through the SCD Type 2 mapping in IICS, the dim_customer_scd2 table should now reflect the new customer data and the updated address for customer ID 102, with the previous address stored as a historical record.

Watch Video Tutorial:

Conclusion

Implementing SCD Type 2 in IICS using the inbuilt template mapping is a powerful way to track changes in your dimension tables. By following the steps outlined in this blog, you can easily manage historical data and ensure accurate, up-to-date information in your data warehouse. This approach is essential for businesses looking to maintain a comprehensive view of their data over time.

2 Comments

  1. Raj

    Hi @ Regu, Awesome content on SCD type 2. Very Crisp and clear. I tired this and when ever I rerun or update a record I am getting the same set of records added to the table again. What could be the error ? How can I contact u for queries ?

    • Hi Raj, Thanks for trying this. Please feel free to contact me on WhatsApp I have shared my number to your Yahoo email.

Leave a Reply

Your email address will not be published. Required fields are marked *