Click here to Skip to main content
Click here to Skip to main content
Go to top

Implementation of Slow Changing Dimension (SCD) using SSIS

, 26 Aug 2014
Rate this:
Please Sign up or sign in to vote.
SCD plays an important role where the user wants to maintain the historical data and to capture the changes over the period time.

Introduction

In Data warehouse, slow changing dimension (SCD) plays an important role where the user wants to maintain the historical data and to capture the changes over the period time. SCD is useful when we need to keep track of changing data. Let’s say- if User has updated its details, then using SCD, it’s easy to find out the historical data of the user.

In this article, we are going to discuss about different types of SCD and implementation using SSIS.

Implementation

Types of SCD:

  • type1 (Fixed): The data should be fixed. This type does not accept changes.
  • type2 (Changing): It keeps the most recent values in the target. It does not maintain the history.
  • type3 (Historical): It keeps the full history in the target database. For every update in the source a new record is inserted in the target.

So depending upon the business, we need to decide which type of SCD is suitable for our Data warehouse and hence can be implemented through SSIS

For implementation of slow changing dimension, source and target DB. Source DB will be something from where the data keeps changing and target would be DB where we need to keep the changing/historical data like Data warehouse.

Let’s go through the below example to understand the concept of SCD.

  1. Create Table in source Table – “EmployeeDetails”
    CREATE TABLE [test].[EmployeeDetails](
     
           [EmployeeId] INT PRIMARY KEY,
              [EmployeeName] [VARCHAR](50) NULL,
              [EMPAddress] [VARCHAR](50) NULL,
              [EMPPhone] [VARCHAR](50) NULL,
              [EMPMailId] [VARCHAR](50) NULL,
     
    ) ON [PRIMARY]
  2. Insert values in the table in the EmployeeDetails.
    INSERT INTO [test].[EmployeeDetails] ([EmployeeId], [EmployeeName], [EMPAddress], [EMPPhone], [EMPMailId])
    
    VALUES
    
    ('12345', 'Mr. XYZ', 'Street 654, London', '7873432323', 'xyz_abc@yahoo.com'),
    
    ('56789', 'Mr. ABC', 'Street 134, NY', '9873432323', 'hij_pqr@yahoo.com')
    
    GO

    Create table on the target DB - DimEmployeeDetails i.e. in our case, it’s a data warehouse. We will be creating extra column as compared to source table. Below is the explanation for the same-

    • EffectiveDate- To know the last updated date.
    • ExpiredDate – To know the last changed date when the value got changed to some other value.
    • CurrentFlag- To know the current active value.
    CREATE TABLE [test].[DimEmployeeDetails](
     
           [Id] [INT] IDENTITY(1,1) NOT NULL,
              [EmployeeId] INT,
              [EmployeeName] [VARCHAR](50) NULL,
              [EMPAddress] [VARCHAR](50) NULL,
              [EMPPhone] [VARCHAR](50) NULL,
              [EMPMailId] [VARCHAR](50) NULL,
              [EffectiveDate] [date] NULL,
              [ExpirationDate] [date] NULL,
              [CurrentFlag] [char](1) NULL,
              CONSTRAINT [PK_DimEmployeeDetails] PRIMARY KEY CLUSTERED ([Id] ASC)
     
    ) ON [PRIMARY]
  3. Open Visual studio, Create new Integration Services project under Business Intelligence template. Give the name of project to some useful name.

  4. Create OLEDB connection Manager and point it to your Source and target DB. In this example, I have kept both the DB on same data source.

    Once the connection Manager created, drag Data Flow task to the control flow. Rename it to “Employee_SCD”

  5. Double click data flow “Employee_SCD” and you will be redirected to data flow page. Drag OLE DB source from the SSIS toolbox. Configure it with the source table i.e. EmployeeDetails.

  6. Now, drag Slowly Changing Dimension from the SSIS toolbox to data flow and connect the OLE DB source to the Slowly changing dimension transformation.

    1. To configure the slowly changing Dimension transformation, double click and wizard will open. Click Next.

    2. One next page of the wizard, select the target table i.e. DimEmployeeDetails. It will display all the columns. For configuration of SCD, we need to declare surrogate key or business key. This will act as unique key. In our example, we select “EMPMailId” as Business Key. Click next.

    3. On the next page of Wizard, we need to configure all the columns where we need to maintain the historical data. There are two types of Change type – Historical attribute, Fixed attribute and changing attribute. Historical attribute is for capturing the historical data i.e. this is of type 3, Changing Attribute is for Type 2 which means if the data gets changed, and it should overwrite the old data with new one. Fixed Attribute is for column which is fixed.

    4. On next page of SCD wizard, we need to select the columns for start and end date. Please select as shown below.

    5. Click next.

    6. Click finish.

    7. One the SCD wizard is closed, the data flow will look like below image. SCD will automatically create other transformation based on the details given in the SCD wizard.

  7. Execute the package. Once completed, there shouldn’t be any error.

  8. Open, SQL Server Management Studio, and select both the tables. We can see that both is having the same data because using SCD, the same is being copied to the target table i.e. DimEmployeeDetails.

  9. Now update the source table.
    UPDATE [test].[EmployeeDetails]
    
    SET [EMPAddress] = 'New Hills, Londona'
    
    WHERE [EmployeeId] = '12345'

    Select the both the table. Here, source table entry is different from target table for Employee 12345.

  10. Now, Execute the SCD once again.

  11. Check the Database table. As we have configured the SCD type3, historical data is being maintained. For EmployeeId 12345, there are two entries, one of old and other for new entry. The first entry will have ExpirationDate value. Hence by this way, we can configure the SCD Type 2 for maintaining the historical data.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

Arvind Kumar Chaubey
Technical Lead
India India
I am a Senior Consultant in one of the MNC. 7 years of experience on Microsoft Technologies and expertise into MSBI, DW and SQL.
Follow on   LinkedIn

Comments and Discussions

 
-- There are no messages in this forum --
| Advertise | Privacy | Mobile
Web04 | 2.8.140916.1 | Last Updated 26 Aug 2014
Article Copyright 2014 by Arvind Kumar Chaubey
Everything else Copyright © CodeProject, 1999-2014
Terms of Service
Layout: fixed | fluid