
WebResourceProvider goes .NET

17 Feb 2008 | CPOL | 2 min read
A simple yet powerful framework for retrieving useful information from public sources on the web.

Introduction

This article describes WebResourceProvider, a simple yet powerful framework for retrieving useful information from public sources on the web. It is an improved .NET version of my earlier C++ implementation.

Apart from being rewritten in C# from the ground up, this .NET version offers a smaller footprint than its C++ predecessor and makes it easy (almost trivial) to encapsulate functionality offered by online services into an object that can be manipulated by your application. Here are screenshots of some WebResourceProvider applications (listed at the end of this article) in action.

  • Domain Walker: a domain walker that discovers the topology of the world wide web.
  • Google Translator: an object that encapsulates Google's online natural language translation tools.
  • Simple RSS Reader: RSSChannel and RSSItem, a pair of objects that allow you to build an RSS reader.

A Word of Caution

Before you use WebResourceProvider to write the next killer app, be aware that there are legal and ethical issues regarding the use of information obtained from other sources. In particular, the terms of service (TOS) of content providers such as Yahoo, CNN, etc. clearly state what you can and cannot do with information retrieved from their sites. Even if you write a web resource provider for personal use only, you should take into consideration any undue stress that your object may put on a web server. The object's Pause property allows you to inject a delay between successive HTTP requests to help prevent overloading a server.
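
To illustrate the idea behind a minimum pause between requests, here is a minimal sketch. RequestPacer and its members are hypothetical names invented for this example; they are not part of WebResourceProvider, whose Pause property may be implemented differently.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not part of WebResourceProvider): enforces a minimum
// interval between the start of successive requests.
class RequestPacer
{
    private readonly TimeSpan minPause;
    private DateTime lastRequestUtc = DateTime.MinValue;

    public RequestPacer(int pauseMs)
    {
        minPause = TimeSpan.FromMilliseconds(pauseMs);
    }

    // Returns how long the caller still has to wait at the given time.
    public TimeSpan DelayNeeded(DateTime nowUtc)
    {
        if (lastRequestUtc == DateTime.MinValue) return TimeSpan.Zero; // first request
        TimeSpan elapsed = nowUtc - lastRequestUtc;
        return elapsed >= minPause ? TimeSpan.Zero : minPause - elapsed;
    }

    // Blocks until the minimum pause has elapsed, then records the request time.
    public void WaitTurn()
    {
        TimeSpan delay = DelayNeeded(DateTime.UtcNow);
        if (delay > TimeSpan.Zero) Thread.Sleep(delay);
        lastRequestUtc = DateTime.UtcNow;
    }
}

static class Program
{
    static void Main()
    {
        var pacer = new RequestPacer(500); // at least 500 ms between requests
        for (int i = 0; i < 3; i++)
        {
            pacer.WaitTurn();
            Console.WriteLine("request " + i + " may start now");
        }
    }
}
```

Calling WaitTurn() before each download keeps a tight fetch loop from hammering the server, which is the effect the Pause property is meant to achieve.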

How it Works

WebResourceProvider works by initializing itself, constructing a URL to be retrieved, downloading the resource, and extracting useful information from the downloaded content. This process repeats until no more data needs to be downloaded.

You use WebResourceProvider by deriving from it, overriding getFetchUrl() and optionally overriding any of these virtual methods (shown in red in the flowchart on the right):

  • init()
  • getPostData()
  • parseContent()
  • continueFetching()
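
The control flow described above can be sketched as a template method. This is only an illustration: SketchProvider, PagedProvider, download() and the example URL are hypothetical, and the hook signatures are assumptions based on the method names listed here, so the real WebResourceProvider may declare them differently. The download step is stubbed out so the sketch runs without network access.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for WebResourceProvider; real signatures may differ.
abstract class SketchProvider
{
    public string Content { get; private set; } = "";

    // Assumed hooks, named after the methods listed in the article.
    protected virtual void init() { }
    protected abstract string getFetchUrl();
    protected virtual string getPostData() { return string.Empty; }
    protected virtual void parseContent() { }
    protected virtual bool continueFetching() { return false; }

    // Stand-in for the HTTP download so that the sketch runs offline.
    protected abstract string download(string url, string postData);

    // Initialize, then build URL, download, parse, and repeat while needed.
    public void fetchResource()
    {
        init();
        do
        {
            string url = getFetchUrl();
            Content = download(url, getPostData());
            parseContent();
        } while (continueFetching());
    }
}

// Example provider that walks three pages of a made-up listing.
class PagedProvider : SketchProvider
{
    private int page;
    public List<string> Titles { get; } = new List<string>();

    protected override void init() { page = 0; Titles.Clear(); }
    protected override string getFetchUrl()
    {
        return "http://example.invalid/list?page=" + (page + 1);
    }
    protected override void parseContent() { Titles.Add(Content.Trim()); page++; }
    protected override bool continueFetching() { return page < 3; }
    protected override string download(string url, string postData)
    {
        return " content of " + url + " "; // canned response instead of HTTP
    }
}

static class Program
{
    static void Main()
    {
        var provider = new PagedProvider();
        provider.fetchResource();
        foreach (string title in provider.Titles) Console.WriteLine(title);
    }
}
```

The derived class only states what to fetch, what to keep, and when to stop; the base class owns the loop.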

In the spirit of true object orientation, WebResourceProvider (unlike its C++ predecessor) doesn't provide a facility for parsing downloaded content. Readers are instead urged to use my StringParser class to help perform this task.
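
As a rough idea of what a parseContent() override might do with the downloaded text, the sketch below pulls the page title out of an HTML string. It deliberately uses the standard Regex class rather than StringParser, whose API is not described in this article; TitleExtractor and ExtractTitle are hypothetical names.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper: uses Regex as a stand-in for StringParser.
static class TitleExtractor
{
    // Returns the decoded text of the first <title> element, or "" if none.
    public static string ExtractTitle(string html)
    {
        Match m = Regex.Match(html, @"<title[^>]*>(.*?)</title>",
                              RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return m.Success ? WebUtility.HtmlDecode(m.Groups[1].Value).Trim() : "";
    }
}

static class Program
{
    static void Main()
    {
        string html = "<html><head><TITLE> Tom &amp; Jerry </TITLE></head></html>";
        Console.WriteLine(TitleExtractor.ExtractTitle(html));
    }
}
```

A regular expression is adequate for a single well-known marker like this; for anything more structured, a dedicated parser is usually the safer choice.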

WebResourceProvider exposes the following properties:

  • Agent: Gets and sets the user agent string.
  • Content: Gets the retrieved content.
  • ErrorMsg: Gets the last error message, if any.
  • FetchTime: Gets the fetch timestamp.
  • Pause: Gets and sets the minimum pause time interval (in milliseconds).
  • Referer: Gets and sets the referer string.
  • Timeout: Gets and sets the timeout (in milliseconds).
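
For readers wondering what settings such as Agent, Referer and Timeout correspond to at the HTTP level, here is a small sketch using the standard HttpClient types. It is an assumption about how such values are typically applied, not a description of how WebResourceProvider does it internally; RequestSketch, Build and the sample values are hypothetical.

```csharp
using System;
using System.Net.Http;

// Hypothetical helper showing where agent, referer and timeout settings
// usually end up when a request is built with HttpClient.
static class RequestSketch
{
    public static HttpRequestMessage Build(string url, string agent, string referer)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);
        request.Headers.UserAgent.ParseAdd(agent);      // User-Agent header
        request.Headers.Referrer = new Uri(referer);    // Referer header
        return request;
    }
}

static class Program
{
    static void Main()
    {
        using (var client = new HttpClient())
        {
            // The timeout is set on the client, here 10,000 ms.
            client.Timeout = TimeSpan.FromMilliseconds(10000);

            HttpRequestMessage request = RequestSketch.Build(
                "http://example.invalid/feed", "SketchAgent/1.0", "http://example.invalid/");

            // The request is only built and inspected, never sent.
            Console.WriteLine(request.Headers.UserAgent);
            Console.WriteLine(request.Headers.Referrer);
            Console.WriteLine(client.Timeout.TotalMilliseconds);
        }
    }
}
```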

WebResourceProvider control flow

Demo applications

C# applications (with full source code) that use WebResourceProvider can be found here:

Revision History

  • 17 Feb 2008
    Updated links to sample applications.
  • 15 Jan 2006
    Initial version.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Technical Lead
Canada
Ravi Bhavnani is an ardent fan of Microsoft technologies who loves building Windows apps, especially PIMs, system utilities, and things that go bump on the Internet. During his career, Ravi has developed expert systems, desktop imaging apps, marketing automation software, EDA tools, a platform to help people find, analyze and understand information, trading software for institutional investors and advanced data visualization solutions. He currently works for a company that provides enterprise workforce management solutions to large clients.

His interests include the .NET framework, reasoning systems, financial analysis and algorithmic trading, NLP, HCI and UI design. Ravi holds a BS in Physics and Math and an MS in Computer Science and was a Microsoft MVP (C++ and C# in 2006 and 2007). He is also the co-inventor of 3 patents on software security and generating data visualization dashboards. His claim to fame is that he crafted CodeProject's "joke" forum post icon.

Ravi's biggest fear is that one day he might actually get a life, although the chances of that happening seem extremely remote.
