Click here to Skip to main content
Click here to Skip to main content

Learn Microsoft Business intelligence step by step – Day 1

, 25 Jun 2014 CPOL
Rate this:
Please Sign up or sign in to vote.
In this article we will learn basics about DW, BI, SSIS and ETL Process

Introduction

After a huge success of our first step by step series article on MVC we are back again with our new initiative “Learn MSBI Step by Step”.

What all we will do here?

We will start with very basic stuffs like understanding what is mean by Data warehouse and business intelligence and end with creating some complex SSRS Reports.

Complete Series

  1. Day 1 
  2. Day 2
  3. Day 3

Agenda for Day 1

What is Data Warehouse?

In a very simple word it means, “It's a place where we store all of our data”.

How it is different from Database?

Usually data warehouse is also a database. The records from multiple data sources (may be some other databases)are collected and stored in Denormalized manner. It normally stores months or years of data to support historical analysis

What is Denormalization?

Normalization is a database designing technique which makes sure that there will not be any redundant data.

  • It makes our data more reliable (because there will not be any repeating data. Exiting data will be referred wherever required)
  • It makes database management easy.
  • It reduces the size of database.

Following tables are the part of normalized database.

CityId CityName
1 Mumbai
2 Kolkata
3 Delhi
CustomerId CustomerName CityId
1 Sukesh 1
2 Rajesh 1
3 Mahesh 2
4 Ganesh 1

At the end of the day Normalized database leads to multiple join conditions in select query and thus affects the performance.
For above example query will be “Select CustomerId, CustomerName, CityName from TblCustomer inner join TblCity on TblCustomer.CityId=tblCity.CityId”

Note: In real time projects there will be more number of tables involved while joining and every table contains many records.

So the solution is Denormalization. We will design the database with repeating data. We will use rare references (or No references) among data, in short, we will repeat the same data wherever required. Denormalized version of above example looks like,

CustomerId CustomerName CityName
1 Sukesh Mumbai
2 Rajesh Mumbai
3 Mahesh Kolkata
4 Ganesh Mumbai
Data warehouse summarized

What is the purpose of Data Warehouse?

  • As you can see, in the above real life example I collect different data from different places like from my bank account, from the Personal notebook (one maintained by mom containing household expenditure), from investment detail data (stored in Access database) etc.
  • Once I have collected all data, I store them in a separate excel sheet in Denormalized fashion.
  • This new excel sheet (we may refer it as Data warehouse) can be used to make my decisions about my expenses and incomes.

In simple words,Data ware house make us retrieve calculated data quickly and efficiently (without having too many joins in our select query).

What is Business Intelligence?

There is a saying that Images are better than long description. Human mind understands graphical explanation more than theoretical explanation. In order to make decision, our information need to be displayed with proper presentation in terms of charts, reports, score cards etc.Initially the concept of data warehouse was all about keeping historical data.

Data warehouse is a foundation for the BI. BI is all about leveraging our existing data and converting them into information or we can say Knowledge. We use this knowledge for making decisions in company.

BI or Business Intelligence is simply a solution for

  • Collect information from multiple data sources
  • Transform that data into meaningful information
  • And finally show data to users with elegant presentation.

Introduction to Business Intelligence development studio / Sql Server Data tools

  • In the Microsoft world we will use “Business Intelligence development studio” commonly known as BIDS for this purpose.
  • It’s an IDE which will let us develop Data Analysis and Business Intelligence solutions.
  • It has special project types and tools for developing “Sql server Integration Services”, “Sql server Analysis services” and “Sql server reporting services”. We will learn each of these in a series of step by step article.
  • With the release of Sql Server 2012, BIDS was renamed to Sql Server Data tools.

Note: In this article we will not cover how to install sql server data tools. If you are having any problem in installing please click here.

Basic idea on SSIS, SSAS and SSRS

Step 1. Open Sql Server Data Tools

Step 2. Click on File >> New >> Project. A dialog box will popup similar to like this

As you can see, it lets us create three kind of project. In order to perform a complete business intelligence task we need to go up with all these three projects.

1. Integration Services – SSIS – Sql server Integration services

It will let us perform wide range of data migration tasks. It let us collect data from various data sources and store them into central location.

2. Analysis Services – SSAS –Sql Server Analysis services

It will let us analyze the data

3. Reporting Services – SSRS – Sql Server Reporting services

It will let us create reports from analyzed data and present it to end user.

How we are going to learn?

We will be doing labs. With every lab we will learn something new, we will reach to a new level in MSBI. We will start with SSIS and then move towards SSAS and finally end up with SSRS. In every article we will include one or more labs.

Start with SSIS

SSIS basically performs three basic things,

  1. Collect data from various sources. – We call it Extraction(E)
  2. Data obtained from different sources may or may not be same format. So first convert all of them according to business needs - We call it Transformation(T)
  3. Load them into one big data source(mostly Data Warehouse) – We call it Load(L)

Together termed as ETL process.


Lab 1 – Simple ETL process using SSIS

Problem statement

You have Two excel files as follows,

  • You will collect data from first file (Datafile.xlsx).
  • Convert that data so that it match to second excel file format (merge Title, FirstName and LastName and call them as Name).
  • Dump final result to second file (Result.xlsx).

Step by Step Demo

Step 1. Create New Project

Click File >> New >> Project. Select Integration Services from the group. Specify some nice name. Say Ok.

It will open up SSIS designer which you will use for creating and maintaining Integration service packages. It looks like follows,

In the solution explorer under “SSIS packages” folder you will see one default package created with name “Package,dtsx”. If you want you can simply “rename it”or “remove it and add new one” (right click the folder and say “New SSIS Package”).

Note: Package is simply a collection of connections, control flow elements, data flow elements, event handlers, parameters etc. We will talk about each one of this as move further.

Step 2. Create Connection Manager for Excel File

2.1 Right click Connection Manager and Say New Connection.

2.2 Select Excel from the popup and click on Add.

2.3 Click the browse button and select the excel file and click on OK.

Step 3. Rename Connection Managers

3.1 Right click the connection manager just added and rename it to SourceExcelManager.

Step 4. Create Destination connection manager

4.1 Follow the Step no 3 and create one more connection manager pointing to Result.xlsx file.

4.2 Rename connection manager to ExcelDestinationManager.

Step 5. Create control flow – Pass data from Source Excel to Destination Excel.

5.1 Make sure control flow tab is selected in SSIS designer.

5.2 Select data flow task from the toolbox and drag it into designer.

5.3 Rename Data Flow Task to “Source excel to Destination excel transfer task”

Control Flow

Control flow will be used to define the workflow. As the name implies it control the flow of execution.

  • What all tasks need to be executed?
  • What will be the sequence?
  • Whether tasks need to be executed in loops or not?
    Etc.
Step 6. CreateData Flow.

Data Flow

Data Flow defines the flow of data between source and destination.

6.1 Double click the control flow created in last step.

6.2 It will take you to second tab that is Data Flow tab.

Step 7. Create Excel Source

7.1 Now you will see a different SSIS toolbox all together. In toolbox you will see couple of groups defined like Sources, Destinations and Transformations.

Take excel source from Source group and place it in designer.

Step 8. Configure Excel Source

8.1 Red Cross mark on excel source indicates that, it’s not configured yet. Double click the excel source. It will show up a dialog box something like this.

8.2 Select Data Source as "SourceExcelManager", Data Access Mode as "Table or View" and Name of the sheet as "DataSheet1" (Name of the sheet in the excel file).

Note: This Excel Source will perform the Extraction Task (E) in the ETL process

Step 9. Create Derived Column

9.1 From the SSIS toolbox from Transformation group drag Derived column to SSIS designer.

Step 10. Connect Source to Derived Column

10.1 Click the Excel source added in prior step.

You can see a small blue arrow attached to the source. We call it “Data Flow Path”.

Data Flow Path: It lets you define how data will flow.

Click on the blue arrow and connect it to Derived Column.

Note: we will speak about the red arrow in one of the future article in the series.

Step 11. Configure the derived column

11.1 Double click the derived column. Popup looks like follows.

11.2 Put down Derived Column Name as Name, Select Derived Column as and expression as Title + “ “ + FirstName + “ “ + LastName
11.3 Click Ok.

Note: This DerivedColumn will perform the Transformation Task (T) in the ETL process

Step 12. Create Excel Destination

12.1 Add Excel Destination from the Destination group in SSIS toolbox.

Note: This Excel Destination will perform the Load Task (L) in the ETL process

Step 13. Connect Derived Column to Excel Destination

13.1 Just like step no 10, connect derived column to excel destination.

Step 14. Configure Excel Destination

14.1 Double click the Excel destination, popup looks like follow.

14.2 Set connection Manager to “ExcelConnectionManager”, Data Access mode to “Table or View” and Name of the excel sheet to “Datasheet1”.

14.3 Click on mapping and make sure its proper, if not make sure to do it before proceeding.

Note: In our case, mapping will be already done by the IDE itself (because names of columns are matching).
14.4 Click ok.

Step 15. Execute package

15.1 Press F5.

On successful execution you will get a screen something like this.

16.2 Open the Result.xlsx file and confirm the output.

Time to celebrate

We have successfully completed our very first lab on SSIS.

What next?

In further coming articles we will speak more about SSIS.

  • We will cover some more Control Flow and Data Flow items.
  • Will talk about Error handling.
  • Will talk about Variables and expressions.
  • Will talk about Deployment and debugging.
  • Many more things.

Hope you enjoyed reading this. Your comments, votes and suggestions motivates us for writing more stuffs like this.

For technical trainings on various topics like WCF, MVC, Business Intelligence, Design Patterns, WPF and UML and many more feel free to contact SukeshMarla@Gmail.com or visit www.sukesh-marla.com

For more stuff like this click here. Subscribe to article updates or follow at twitter @SukeshMarla

For 600+ videos on MSBI, .NET, SharePoint, Architecture, SQL, WPF, WCF, MVC, ASP.NET etc click @ www.questpond.com

Sample Video on SSIS

<object height="320px" type="video/x-ms-wmv" width="240px"></object>

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

Marla Sukesh
Technical Lead ShawMan Softwares
India India
Learning is fun but teaching is awesome.
 
Code re-usability is my passion ,Teaching and learning is my hobby, Becoming an successful entrepreneur is my goal.
 
By professional I am a Corporate Trainer.
I do trainings on WCF, MVC, Business Intelligence, Design Patterns, HTML 5, jQuery, JSON and many more Microsoft and non-Micrsoft technologiees.

Find my profile here

 

My sites

 

@Twitter
@Facebook
Follow on   Twitter   Google+   LinkedIn

Comments and Discussions

 
QuestionThank You For this tutorial. PinmemberEr.Manish Pandey27-Nov-14 19:32 
QuestionEXCEL problem with 64 bit OS PinmemberPaolo Turrini24-Nov-14 22:11 
QuestionDoes VS for BI generates any code PinmemberRyszard505-Sep-14 2:01 
Suggestionsome statements are too general PinprofessionalChris87526-Jun-14 23:24 
GeneralRe: some statements are too general PinprofessionalMarla Sukesh27-Jun-14 0:23 
GeneralMy vote of 3 PinprofessionalMB Seifollahi8-Jun-14 21:22 
GeneralRe: My vote of 3 PinprofessionalChris87526-Jun-14 23:00 
GeneralRe: My vote of 3 PinprofessionalMB Seifollahi26-Jun-14 23:10 
GeneralRe: My vote of 3 PinprofessionalChris87526-Jun-14 23:42 
GeneralRe: My vote of 3 PinprofessionalMB Seifollahi26-Jun-14 23:52 
GeneralRe: My vote of 3 PinprofessionalMarla Sukesh27-Jun-14 1:09 
GeneralGood article PinmemberSourav Dey30-May-14 2:55 
GeneralRe: Good article PinprofessionalMarla Sukesh30-May-14 3:07 
BugError in execution of package PinmemberMember 1084702527-May-14 8:46 
GeneralRe: Error in execution of package PinprofessionalMarla Sukesh27-May-14 8:49 
Questiongood article Pinmemberpradeep shet15-May-14 22:52 
AnswerRe: good article PinprofessionalMarla Sukesh15-May-14 23:12 
GeneralMy vote of 5 PinmemberMember 41450883-Apr-14 20:48 
GeneralRe: My vote of 5 PinprofessionalMarla Sukesh4-Apr-14 11:07 
QuestionGood Article PinmemberDeepika Sohani3-Apr-14 20:32 
AnswerRe: Good Article PinprofessionalMarla Sukesh4-Apr-14 11:06 
GeneralGr8 work...its really easy to understand Pinmemberpradeep.shet3-Apr-14 10:09 
GeneralRe: Gr8 work...its really easy to understand PinprofessionalMarla Sukesh3-Apr-14 18:44 
GeneralMy vote of 5 Pinmembermanoj.jsm3-Apr-14 0:20 
GeneralRe: My vote of 5 PinprofessionalMarla Sukesh3-Apr-14 8:30 
Questionvery helpful PinmemberKadev2-Apr-14 0:24 
AnswerRe: very helpful PinprofessionalMarla Sukesh2-Apr-14 0:33 
GeneralMy vote of 5 PinmemberMahesh Sanika1-Apr-14 21:58 
Excellent
GeneralRe: My vote of 5 PinprofessionalMarla Sukesh2-Apr-14 0:33 
GeneralMy vote of 5 PinmemberRajesh Pillai29-Mar-14 6:58 
GeneralRe: My vote of 5 PinprofessionalMarla Sukesh29-Mar-14 7:12 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Terms of Use | Mobile
Web01 | 2.8.141220.1 | Last Updated 25 Jun 2014
Article Copyright 2014 by Marla Sukesh
Everything else Copyright © CodeProject, 1999-2014
Layout: fixed | fluid