A class for getting the RSS feed list of a website






A very simple class for listing the RSS feeds of a website.
Introduction
This is a very simple class for getting the RSS feed list of a website. Just pass the website URL to the constructor and you are off.
Example:
FeedListOfWebsite MyFeedList = new FeedListOfWebsite(
    new Uri("http://www.codeproject.com"));
if (MyFeedList.Success)
{
    foreach (FeedDetail fd in MyFeedList.FeedDetails)
    {
        //Handle each feed here, e.g. using:
        //fd.Name
        //fd.Url
    }
}
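The FeedDetail type used in the example is not listed in this article. As a rough sketch only, it could look like the class below, assuming it is a plain read-only holder with the constructor signature FeedDetail(Uri, string) and the Name and Url members seen in the listings. The type of Url (Uri here) and the private field names are assumptions.

```csharp
using System;

// Hypothetical sketch of FeedDetail; the real class is not shown in the article.
public class FeedDetail
{
    private readonly Uri _url;      // assumed: the feed address is kept as a Uri
    private readonly string _name;  // the title taken from the link tag

    public FeedDetail(Uri url, string name)
    {
        _url = url;
        _name = name;
    }

    // Title of the feed as found in the link tag.
    public string Name { get { return _name; } }

    // Address of the feed.
    public Uri Url { get { return _url; } }
}
```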
The code
Here is the class constructor:
public FeedListOfWebsite(Uri WebsiteUrl)
//Constructor: pass the URL of the website to search for feeds.
{
    //Assumed pattern: matches <link> tags whose type is application/rss+xml.
    Regex RegX = new Regex("<link.*type=\"application/rss\\+xml\".*>");
    MatchCollection FeedsOnWebsite = RegX.Matches(GetHtml(WebsiteUrl));
    if (FeedsOnWebsite.Count > 0)
    //If we find some link tags then...
    {
        //Set FeedDetail Array
        FeedDetails = new FeedDetail[FeedsOnWebsite.Count];
        int fdi = 0; // Array index count up value
        foreach (Match Feed in FeedsOnWebsite)
        //Loop through the link tags
        {
            //Extract data from the html line
            FeedDetails[fdi] = ExtractFeed(Feed.Value.ToString());
            fdi++; //Count Array index 1 up
        }
        _success = true;
    }
}
Here are the private functions of the class:
private string GetHtml(Uri UriPath)
//A function for getting the HTML code of a website.
{
    try
    {
        //Request the page; the using blocks dispose the response,
        //stream and reader even if reading fails.
        using (HttpWebResponse Hwr = (HttpWebResponse)WebRequest.Create(UriPath).GetResponse())
        using (Stream Hwrstrm = Hwr.GetResponseStream()) //Get the response stream
        using (StreamReader HwrSr = new StreamReader(Hwrstrm)) //Create a StreamReader
        {
            return HwrSr.ReadToEnd(); //Return HTML code of website
        }
    }
    catch
    {
        return ""; //Return empty string upon error.
    }
}
private FeedDetail ExtractFeed(string HtmlLine)
//A function for extracting feed data from a line of HTML code.
{
    string name = "";
    string url = "";
    #region Find The Title
    try
    {
        //Take everything after title= and cut it at the closing quote
        Match Title = Regex.Match(HtmlLine, "(?<=title=).*");
        if (Title.Success)
        {
            int EndOfTitle = Title.Value.ToString().IndexOf("\"", 1);
            if (EndOfTitle == -1)
            { EndOfTitle = Title.Value.ToString().IndexOf("'", 1); }
            name = Title.Value.ToString().Substring(0, EndOfTitle);
            name = name.Replace("\"", "").Replace("'", "");
        }
    }
    catch { name = "[Error finding name...]"; }
    #endregion
    #region Find the url
    try
    {
        //Take everything after href= and cut it at the first space or quote
        Match Url = Regex.Match(HtmlLine, "(?<=href=).*");
        if (Url.Success)
        {
            int EndOfHref = Url.Value.ToString().IndexOf(" ", 1);
            if (EndOfHref == -1) { EndOfHref = Url.Value.ToString().IndexOf("\"", 1); }
            if (EndOfHref == -1) { EndOfHref = Url.Value.ToString().IndexOf("'", 1); }
            url = Url.Value.ToString().Substring(0, EndOfHref);
            url = url.Replace("\"", "").Replace("'", "");
        }
    }
    catch { url = ""; }
    #endregion
    //Note: new Uri(url) throws UriFormatException if url is empty
    //or is not an absolute URI.
    return new FeedDetail(new Uri(url), name);
}
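ExtractFeed cuts the attribute values out by searching for the next quote or space, and it expects href to hold an absolute address. As an alternative sketch only, not part of the posted class, the helper below reads a quoted attribute value with a single regular expression and resolves a relative href against the page address with Uri.TryCreate. The names LinkTagSketch, GetAttribute and ResolveHref are made up for this illustration, and the example.com addresses in the usage note are placeholders.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of FeedListOfWebsite.
public static class LinkTagSketch
{
    // Returns the value of name="value" or name='value' inside a tag,
    // or null when the attribute is missing or unquoted.
    public static string GetAttribute(string tag, string attributeName)
    {
        string pattern = "\\b" + Regex.Escape(attributeName) +
                         "\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)')";
        Match m = Regex.Match(tag, pattern, RegexOptions.IgnoreCase);
        if (!m.Success) { return null; }
        // Group 1 holds a double-quoted value, group 2 a single-quoted one.
        return m.Groups[1].Success ? m.Groups[1].Value : m.Groups[2].Value;
    }

    // Combines the page address with an href that may be relative.
    // Returns null instead of throwing when no valid URI can be built.
    public static Uri ResolveHref(Uri pageUrl, string href)
    {
        Uri result;
        if (!string.IsNullOrEmpty(href) && Uri.TryCreate(pageUrl, href, out result))
        {
            return result;
        }
        return null;
    }
}
```

For a tag such as <link type="application/rss+xml" title='My Feed' href="feeds/rss.xml" />, GetAttribute(tag, "title") gives the title without its quotes, and ResolveHref(new Uri("http://www.example.com/blog/"), "feeds/rss.xml") gives an absolute address that can be handed to the FeedDetail constructor.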
History
- Updated the code after Marc Jacobi pointed out some basic stuff.
- I have added one more class, just to illustrate working with RSS feeds.
The classes added are the following:
- RssFeed
- RssFeedEntries (sub class of RssFeed, holds all entries from the feed)
- RssFeedEntry (sub class of RssFeedEntries, holds data for each entry in the feed)
I didn't have the time to write the description for these classes. I am sorry... so it is posted as is.