Click here to Skip to main content
11,930,830 members (56,646 online)
Click here to Skip to main content
Add your own
alternative version

Tagged as


4 bookmarked

SSIS: Transform data using Regex

, 21 Jan 2013 BSD
Rate this:
Please Sign up or sign in to vote.
Transform data by using regex and Script Component.


In this tutorial I will show how to transform data using a Script Component within a Data Flow Task and regex. My source data contains customer name, and I need to separate it to first, middle, and last names.


Regular Expressions are a powerful way for you to search for patterns in strings of text. In SSIS, we can use Regular Expressions to assist us in cleansing or transforming data.

This is my partial "Customer" reference table.

What you need

Download script and execute in SSMS.

-- Create database
create database Test
use Test
USE [Test]

-- Create customer table
if exists (select * from sys.tables where name = 'customer_destination')
drop table customer_destination

CREATE TABLE [dbo].[customer_destination](
	[customer_number] [varchar](10) NULL,
	[customer_type] [varchar](1) NULL,
	[name] [varchar](100) NULL,
	[gender] [varchar](1) NULL,
	[email_address] [varchar](200) NULL,
	[date_of_birth] [datetime] NULL,
	[occupation] [varchar](50) NULL,
	[status] [varchar](3) NULL,
	[fname] [varchar](50) NULL,
	[mname] [varchar](50) NULL,
	[lname] [varchar](50) NULL

This script was tested in SQL Server 2008.

Create Project

Open SQL Server Business Intelligence Development Studio.

Then go to File->New->Project and select Integration Service Project.

Select "Data Flow Task" from "Control Flow Items" and drag it on "Control Flow" tab. Then double click it.

Select "Excel Source" from "Data Flow Source" and drag it on "Data Flow" tab. Double click on “Excel Source” task to configure it.

Click New button for new OLE DB connection or select from existing connection.

Click Browse button to select the "Customer.xls" file from your local path.

Select the Excel sheet name.

Check the available column and click OK.

Select "Data Conversion" from "Data Flow Transformation" and drag it on the "Data Flow" tab and connect the extended green arrow from “Excel Source” to your "Data Conversion".

Convert data. The length should be the same, otherwise you will get a warning message. Add "convert" prefix in each value.

Select "Script Component" from "Data Flow Transformation" and drag it on the "Data Flow" tab. Connect the extended green arrow from “Data Conversion” to your "Script Component".

Click Input Column tab and select the "convert_name" column.

Click the Output Column tab and create three outputs: "fname", "mname", and "lname".

Click the "Script" tab and click "Edit Script". I have chosen my ScriptLanguage as "Microsoft Visual C# 2008".

In the Main.cs file we need to add a reference. We need this for Regular Expressions.

using System.Text.RegularExpressions;

In the Main.cs file it's time to write some code in the Input0_ProcessInputRow procedure.

public override void Input0_ProcessInputRow(Input0Buffer Row)
      Add your code here
    string keyVal = Row.convertname.ToString();
    Match match = Regex.Match(keyVal, @"^(?<first>\w+) (?<last>\w+)(?: (?<middle>\w+))?$");
    if (match.Success)
        string f = match.Groups["first"].Value;
        string l = match.Groups["last"].Value;
        string m = null;
        if (match.Groups["middle"].Success)
            m = match.Groups["middle"].Value;

        Row.fname = f;
        Row.lname = l;
        Row.mname = m;
        Row.fname = Row.convertname.ToString();

Select "OLE DB Destination" from "Data Flow Destination" and drag it on the "Data Flow" tab. Connect the extended green arrow from “Script Component” to your "OLE DB Destination".

Double click on the "OLE DB Destination" task to configure it.

Click New to create a new connection.

Select "Server Name", "Authentication", and "Database" which will be "Test" for this example. Click Test Connection for checking, then click OK and then again click OK.

Select the "customer_destination" table.

Click the mapping tab and map each field and click OK.

If you execute the package with debugging (press F5), the package should succeed and appear as shown here:

To check what is happening here:

SELECT [customer_number]
      ,[customer_type] as [type]
  FROM [Test].[dbo].[customer_destination]


You can see that the fname, mname, and lname fields have been populated. The .NET Framework includes full support for regular expressions (Regex), in the System.Text.RegularExpressions namespace. Regex provides an incredibly powerful way of defining and finding string patterns. You can use them for string pattern match, data cleaning, and data transformation. Email, postal code, and phone number verification are a few of the uses. I have included some Regex here, though all are not perfect, they are collected from Google.

Mail: [a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?

Phone: ^[0-9+\(\)#\.\s\/ext-]+$

First character capital:- ^[A-Z][a-zA-Z0-9]+$

I hope this might be helpful to you!



None so far.


This article, along with any associated source code and files, is licensed under The BSD License


About the Author

Software Developer (Senior) icddr,b
Bangladesh Bangladesh
More than 8 years experience on Programming and Project implementation, I was primarily involved with projects for private organization,Govt.(Bangladesh Army,DG Health,RJSC), NGO (SEDF,WFP). Presently I am working at ICDDR,B and enhancing Hospital Management System developed by Microsoft Dynamic NAV and Windows Mobile Application 5.0

An active supporter of Open Source technology, my interested areas are ERP, IT Audit, Data warehouse, BI etc.

Playing Guitar for 15 years, my interested music style is Blues Rock,Neo Classical.


70-540:Microsoft® Windows Mobile® 5.0 - Application Development
MB7-514:Microsoft Dynamics™ NAV 5.0 C/SIDE Introduction
MB7-516:Microsoft Dynamics™ NAV 5.0 Solution Development
MB7-517:Microsoft Dynamics™ NAV 5.0 Installation and Configuration
MB7-515:Microsoft Dynamics™ NAV 5.0 Financials
70-432:Microsoft SQL Server 2008 - Implementation and Maintenance
70-450:PRO: Designing, Optimizing and Maintaining a Database Administrative Solution Using Microsoft SQL Server 2008
70-448:Microsoft SQL Server 2008, Business Intelligence Development and Maintenance
312-50:Certified Ethical Hacker

Web :
Blog :
linkedin :

You may also be interested in...

Comments and Discussions

-- There are no messages in this forum --
| Advertise | Privacy | Terms of Use | Mobile
Web02 | 2.8.151126.1 | Last Updated 21 Jan 2013
Article Copyright 2013 by mparvez
Everything else Copyright © CodeProject, 1999-2015
Layout: fixed | fluid