Click here to Skip to main content
15,868,141 members
Articles / SSIS

Using REST API in SSIS to Extract Top 100 User Tweets

Rate me:
Please Sign up or sign in to vote.
4.00/5 (1 vote)
13 Feb 2013CPOL3 min read 55.3K   1   7
This post will show you how to use REST API in SSIS to extract the top 100 user tweets

Abstract

The growing popularity of the use of social networks by businesses indicate a possible change in ETL requirements. Part of this change is that developers had to revise existing ETL model such that it caters to the sourcing of data off the internet in addition to the traditional flat files or OLE/ODBC Connections. In this article, I will demonstrate extracting user tweets using Twitter’s REST API inside an HTTP Connection Manager.

Requirements

  • Microsoft Visual Studio 2005 or later
  • SQL Server 2005 or later
  • Microsoft .NET Framework 3.5 or later
  • Tweeter User Account that has posted at least 100 tweets
  • An Active Internet Connection

Article

Let’s begin by launching Microsoft Visual Studio and create a new Integration Services Project which is located under Business Intelligence Projects category. (If you are using SQL Server 2012 or later, you will have to launch SQL Server Data Tools – SSDT).

After you have assigned a project name, proceed to click and drag the Data Flow Task (DFT) into Control Flow pane from toolbox (SSIS Toolbox in SSDT). I decided to name my DFT as DFT – Retrieve Tweets.

Add the following two connections:

  • HTTP Connection Manager
  • OLE DB Connection

I have configured my HTTP Connection as follows:

  • Server URL = http://api.twitter.com/1/statuses/user_timeline.rss?screen_name=mafiswana&count=100

Note that you can replace the value of parameter screen_name with your twitter user account instead of mafiswana.

I have configured my OLE DB Connection as follows:

  • Server = Localhost
  • Database Name = selectSifiso

So far, the package should look as below:

Image 1

Let’s go back and edit the Data Flow Task.

Add a Script Component (SC) which is located under Data Flow Transformations.

Select Source as a script component type.

Edit the script SC and click on the Connection Managers tab on the left of the Script Transformation Editor. Under the Connection Manager column, click to select the HTTP Connection Manager (it should be the only connection available).

Let’s move on to the Inputs and Outputs tab in the Script Transformation Editor. In here, we will configure the source output columns.

Collapse Output 0 node and click on Output Columns. Click Add Column and the following columns with properties configured as below:

  • Name = sS_Tweets; DataType = string [DT_STR]; Length = 200
  • Name = sS_TwitterDate; DataType = string [DT_STR]; Length = 200

Now let’s move on to the Script tab in the Script Transformation Editor.

In Object Explorer, right click on References and Click Add. Under .NET tab, add the following references:

  • System.ServiceModel
  • System.ServiceModel.Web

If one or none of above references are not available for selection, please ensure that you have .NET 3.5 Framework selected. You can do that by clicking on Properties – Application – Target Framework – .NET Framework 3.5

In addition to the already referenced namespaces, add the following namespaces:

  • using System.Xml;
  • using System.ServiceModel.Syndication;

Declare the following variables in the public class ScriptMain : UserComponent 

  • private SyndicationFeed sS_Tweets = null; 
  • private XmlReader sS_XmlReader = null;

Under the PreExecute() method, add the following:

  • sS_XmlReader = XmlReader.Create(Connections.Connection.ConnectionString);
  • sS_Tweets = SyndicationFeed.Load(sS_XmlReader);

Under the public override void CreateNewOutputRows() method, add the following:

C#
if (sS_Tweets != null)
{
foreach (var item in sS_Tweets.Items)
{
Output0Buffer.AddRow();
Output0Buffer.sS_Tweets= item.Title.Text;
Output0Buffer.sS_TwitterDate= item.PublishDate.ToString();
}
Output0Buffer.SetEndOfRowset();
}

The complete script code can be found here.

Now that we have configured the Script Component as source, let us add an OLE DB Destination Component (ODD) and connect Script Component to the Destination Component.

I have configured the OLE DB connection manager of the ODD to use the selectSifiso connection. The data access mode is a Table or view – fast load. Under Name of the table or the view, click New and create and output table. Click on Mappings tab and ensure that the source-to-destination mappings are correct.

Your complete package data flow task should look as follows:

Image 2

Conclusion

All that is left to do is run the package and your tweets will be extracted and stored in SQL Server.

It’s that simple.

Till next time folks, cheers!

Sifiso

This article was originally posted at http://www.selectsifiso.net

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Technical Lead select SIFISO
South Africa South Africa
Sifiso W. Ndlovu is a certified Microsoft professional who holds a Master’s degree in IT Management from the University of Johannesburg. He specializes on a range of enterprise and consumer technologies using open source and proprietary software. He is the member of the Johannesburg SQL User Group wherein he has made several presentations on User Group Meetings and SQL Saturday sessions. He has written for a number of publications including SQLShack.com and SQLServerCentral.com.

Comments and Discussions

 
QuestionWhat Object Explorer? Pin
Member 1233717521-May-18 8:44
Member 1233717521-May-18 8:44 
QuestionWhat do you mean by Collapse the node? Pin
Member 1233717521-May-18 8:37
Member 1233717521-May-18 8:37 
QuestionCan we handle JSON with this? Pin
Member 137677487-Apr-18 11:56
Member 137677487-Apr-18 11:56 
Questionlink broken Pin
Member 136111557-Jan-18 12:48
Member 136111557-Jan-18 12:48 
GeneralGreat! Pin
BI Road Warrior28-Dec-17 11:09
BI Road Warrior28-Dec-17 11:09 
QuestionUpdate Twitter API? Pin
scafrithuric23-Jan-15 3:39
scafrithuric23-Jan-15 3:39 
AnswerRe: Update Twitter API? Pin
SQLDaddy122-Oct-15 18:44
SQLDaddy122-Oct-15 18:44 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.