
RSS Reader

By yetanotherchris, 17 Feb 2004
 

Sample Image - RssReader.gif


Introduction

There's already quite a comprehensive RSS tool on The Code Project: RSS 2.0 Framework, written by "Jerry Maguire". The purpose of this RssReader class is to provide a simple tool for retrieving RSS feeds from remote and local sources, without needing to parse XML in each application that requires the feed. The class, as its name suggests, only reads RSS feeds - it has no capabilities for writing feeds.

The RSS Format

The RSS (Really Simple Syndication) specification is found at http://blogs.law.harvard.edu/tech/rss. It's an XML format typically used for retrieving headlines or the latest article details from other sites. For example, the New York Times provides an RSS feed of its main headlines, which you can access and put inside your own site or application.

The RSS format is true to its name - simple. As the image below shows, it contains a root rss node, which has a channel node beneath it. Inside this channel node there are a number of elements that describe the feed. After these comes a list of articles, headlines, stories or whatever the feed contains, in the form of item nodes. Each item node contains elements that describe it: title, description and link are the 3 main elements (the RSS 2.0 specification only requires that at least one of title or description is present), and there are other optional ones which you can read about in the specification and the RssReader class docs.
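As a rough illustration of that layout, the sketch below parses a small made-up feed with the framework's XmlDocument class and lists the item titles. The sample feed, the RssFormatSketch class and the ItemTitles method are assumptions for this example only; they are not part of the RssReader library.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class RssFormatSketch
{
    // A made-up feed, used only to illustrate the rss > channel > item layout.
    public const string SampleRss =
        "<rss version='2.0'>" +
        "<channel>" +
        "<title>Example feed</title>" +
        "<link>http://example.com/</link>" +
        "<description>Made-up headlines</description>" +
        "<item><title>First story</title><link>http://example.com/1</link>" +
        "<description>Summary one</description></item>" +
        "<item><title>Second story</title><link>http://example.com/2</link>" +
        "<description>Summary two</description></item>" +
        "</channel>" +
        "</rss>";

    // Returns the title of every item node found under rss/channel.
    public static List<string> ItemTitles(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        List<string> titles = new List<string>();
        foreach (XmlNode item in doc.SelectNodes("/rss/channel/item"))
        {
            // title is optional in RSS 2.0, so guard against a missing node.
            XmlNode title = item.SelectSingleNode("title");
            titles.Add(title == null ? "" : title.InnerText);
        }
        return titles;
    }

    public static void Main()
    {
        foreach (string title in ItemTitles(SampleRss))
            Console.WriteLine(title);
    }
}
```

The channel-level elements (title, link, description) can be read the same way with SelectSingleNode("/rss/channel/title") and so on.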

Typical XML format of an RSS document

RssFeed and RssItem objects

Given the simplicity of the RSS format, it was straightforward to map its structure to a value type (struct). The image below shows the RssFeed object.

RssFeed object
 

In this are most of the fields that RSS offers (some haven't been implemented in this version). There is an Items property, which contains a collection of RssItem objects. The RssItem type maps to an RSS item, containing most of the fields available to an item.

RssItem object

The RssReader class has one main method, RetrieveFeed, which returns an RssFeed object, given a URL. This URL can use the file:// scheme as well as the standard http:// if you want to open a local file (it hasn't been tried with ftp://).
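To show why a file:// URL can work just like an http:// one, here is a minimal sketch that relies on XmlDocument.Load accepting a URL string. It does not use the RssReader class itself; FeedSourceSketch, ChannelTitle, ToFileUrl and the temporary file name are hypothetical names chosen for the example.

```csharp
using System;
using System.IO;
using System.Xml;

public static class FeedSourceSketch
{
    // Loads a feed from a URL (http:// or file://) and returns the channel title.
    public static string ChannelTitle(string url)
    {
        XmlDocument doc = new XmlDocument();
        doc.Load(url);
        XmlNode title = doc.SelectSingleNode("/rss/channel/title");
        return title == null ? "" : title.InnerText;
    }

    // Converts a local path into a file:// URL.
    public static string ToFileUrl(string path)
    {
        return new Uri(Path.GetFullPath(path)).AbsoluteUri;
    }

    public static void Main()
    {
        // Write a tiny made-up feed to a temporary file, then read it back by URL.
        string path = Path.Combine(Path.GetTempPath(), "sample-feed.xml");
        File.WriteAllText(path,
            "<rss version='2.0'><channel><title>Local feed</title></channel></rss>");

        Console.WriteLine(ChannelTitle(ToFileUrl(path)));
    }
}
```

Passing an http:// address to ChannelTitle goes through the same code path, which is what makes a local copy of a feed handy for testing.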

Static methods

I added several static methods to make the process even simpler; they're all self-explanatory. Also included in the class library is a class called RssHtmlMaker. This is a simple tool I wrote to turn an RssFeed object into an HTML (or any other format) document, given a template containing tokens. These tokens map to the RSS fields available. Details of the tokens are in the documentation.
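The token idea can be sketched in a few lines. This is not the RssHtmlMaker implementation: the TemplateSketch class, the Fill method and the %Title% and %Link% token names are assumptions made for illustration, and the real token list is in the library documentation.

```csharp
using System;
using System.Collections.Generic;

public static class TemplateSketch
{
    // Replaces every %Name% token in the template with the matching value.
    public static string Fill(string template, IDictionary<string, string> values)
    {
        string result = template;
        foreach (KeyValuePair<string, string> pair in values)
            result = result.Replace("%" + pair.Key + "%", pair.Value);
        return result;
    }

    public static void Main()
    {
        // Hypothetical token names and made-up values.
        Dictionary<string, string> values = new Dictionary<string, string>();
        values["Title"] = "Example feed";
        values["Link"] = "http://example.com/";

        Console.WriteLine(Fill("<h1><a href=\"%Link%\">%Title%</a></h1>", values));
    }
}
```

Values are inserted as-is, so a real template filler would probably want to HTML-encode them when the target format is HTML.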

RDF

Included in the RssReader is the ability to read simple RDF format feeds. RDF (Resource Description Framework) is a W3C XML format for describing web resources. The format caters for describing the content of the web resource, including items such as title, description and URL. There were a couple of RDF feeds which I wanted to use, specifically the slashdot.org feed and the register.com feed. The main way these RDF documents differ from their RSS counterparts is that the <item> nodes are children of the main root rss (or in this case rdf) node, rather than the <channel> node. This is catered for in the RssReader class via the member variable RdfMode, which is set to false by default. The RDF specification is found at http://www.w3.org/TR/2004/REC-rdf-primer-20040210/ .
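The structural difference can be seen with a small sketch that finds item nodes wherever they sit and checks the root element to guess the format. This is an alternative to setting RdfMode by hand, not how the RssReader class works; RdfModeSketch, LooksLikeRdf, FindItems and both sample documents are made up for the example.

```csharp
using System;
using System.Xml;

public static class RdfModeSketch
{
    // Made-up RDF feed: item nodes are siblings of channel under the root.
    public const string SampleRdf =
        "<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#' " +
        "xmlns='http://purl.org/rss/1.0/'>" +
        "<channel rdf:about='http://example.com/'><title>Example</title></channel>" +
        "<item rdf:about='http://example.com/1'><title>One</title></item>" +
        "<item rdf:about='http://example.com/2'><title>Two</title></item>" +
        "</rdf:RDF>";

    // Made-up RSS 2.0 feed: item nodes are children of channel.
    public const string SampleRss =
        "<rss version='2.0'><channel><title>Example</title>" +
        "<item><title>One</title></item></channel></rss>";

    public static XmlDocument Parse(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc;
    }

    // An RDF feed has rdf:RDF as its root element, an RSS feed has rss.
    public static bool LooksLikeRdf(XmlDocument doc)
    {
        return doc.DocumentElement.LocalName == "RDF";
    }

    // local-name() ignores namespaces, so this matches items in both formats.
    public static XmlNodeList FindItems(XmlDocument doc)
    {
        return doc.SelectNodes("//*[local-name()='item']");
    }

    public static void Main()
    {
        XmlDocument rdf = Parse(SampleRdf);
        Console.WriteLine("RDF? " + LooksLikeRdf(rdf) + ", items: " + FindItems(rdf).Count);

        XmlDocument rss = Parse(SampleRss);
        Console.WriteLine("RDF? " + LooksLikeRdf(rss) + ", items: " + FindItems(rss).Count);
    }
}
```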

Final note

One final thing to note: to save yourself at worst being IP-banned by the hosts of the feeds, and at best just upsetting the feed providers, I'd recommend caching the feeds once you get them, rather than retrieving the feed each time it's required. This can be done by serializing the RssFeed object, or by creating an HTML version of the feed and saving it to disk.
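One simple way to cache is to keep the downloaded text on disk and only fetch again once the copy is older than a chosen age. The sketch below assumes exactly that; FeedCacheSketch, GetOrFetch, the cache file name and the 30 minute limit are hypothetical, and the fetch delegate stands in for whatever call actually retrieves the feed.

```csharp
using System;
using System.IO;

public static class FeedCacheSketch
{
    // Returns the cached copy if it is younger than maxAge;
    // otherwise calls fetch, stores the result on disk and returns it.
    public static string GetOrFetch(string cachePath, TimeSpan maxAge, Func<string> fetch)
    {
        if (File.Exists(cachePath) &&
            DateTime.UtcNow - File.GetLastWriteTimeUtc(cachePath) < maxAge)
        {
            return File.ReadAllText(cachePath);
        }

        string fresh = fetch();
        File.WriteAllText(cachePath, fresh);
        return fresh;
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "feed-cache.xml");
        File.Delete(path); // start without a cached copy

        // Stand-in for the real download; counts how often it is called.
        int calls = 0;
        Func<string> fetch = delegate
        {
            calls++;
            return "<rss version='2.0'><channel/></rss>";
        };

        GetOrFetch(path, TimeSpan.FromMinutes(30), fetch);
        GetOrFetch(path, TimeSpan.FromMinutes(30), fetch);
        Console.WriteLine("fetch calls: " + calls);
    }
}
```

In this run the second call should be served from the file rather than from fetch. Caching the raw XML keeps the sketch independent of how RssFeed is serialized.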

Hopefully the class is useful to people. I've not managed to find any C# RSS feeds about; if there are any, leave a URL below.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

yetanotherchris
Web Developer
United Kingdom
Member
London based C# programmer.
 
I maintain my own pet C# site http://www.sloppycode.net in my spare time.


Comments and Discussions

General: Class doesn't allow retrieving CDATA content for the <description> tag (member craigg75, 29 Sep '09 - 15:00)
Anybody have a code snippet on how to make that happen? Seems like that's a big part of any rss reader no matter how simple.
Question: License for commercial use? (member michal.ziomek, 4 May '09 - 0:45)
Hi.
What are the terms of use (license) of this particular code for commercial products?
 
Best Regards
Mike
Question: How to read old feeds (member satalaj, 2 Mar '09 - 18:43)
Hi.
How can I read old feed items using C#? For example, Google Reader loads older items as you scroll up and down.
 
Satalaj
Question: License? (member Member 4248908, 31 Jan '09 - 22:34)
Is this library available for free use in a non-commercial project?
 
Thanks
General: pubDate-Bug (member Member 2904090, 8 Dec '08 - 23:38)
In the switch/case of the getRssItem method you compare against the "pubdate" string, but according to the standard it should be "pubDate". Great work though.
 
Graziano
General: Re: pubDate-Bug (member alcobnitech, 5 Nov '09 - 19:40)
You are right. To fix this, go to line number 552, and change "pubdate" to "pubDate".
 
I also changed the reading of pubDate so that RssItem.PubDate is a DateTime variable, and the date gets parsed: rssItem.PubDate = DateTime.Parse(xmlNode.ChildNodes[i].InnerText.Remove(xmlNode.ChildNodes[i].InnerText.IndexOf(" +")));
 
That way you can get the date out in any format you wish, for instance:
string date = String.Format("{0:dd.MM.yyyy}", feed.Items[i].PubDate);
General: Re: pubDate-Bug [modified] (member CptHook, 9 Nov '10 - 22:37)
Hi all,
I found another problem with this date (both for rssFeed.PubDate and rssItem.PubDate): when the date is in a format like "Wed, 10 Nov 2010 03:20:28 EST" the conversion fails, so I used these lines of code:
 
rssFeed.PubDate = DateTime.Parse(channelXmlNode.ChildNodes[i].InnerText.Substring(0, 25));
 
and
 
rssItem.PubDate = DateTime.Parse(xmlNode.ChildNodes[i].InnerText.Substring(0, 25));
 

I also changed rssFeed.LastBuildDate to a DateTime:
public DateTime LastBuildDate;
 
and so:
 
rssFeed.LastBuildDate = DateTime.Parse(channelXmlNode.ChildNodes[i].InnerText.Substring(0, 25));
 

I hope this is useful!
 

CptHook
modified on Wednesday, November 10, 2010 4:46 AM

General: nice (member ali_reza_zareian, 10 Jul '08 - 22:09)
It's a great article.
General: HTML in the Description (member fodaley, 4 Jan '08 - 22:19)
Hello, thanks for explaining your code. Do you have any suggestions on how to process HTML code that's in the description node correctly? i.e. I have text between anchors, and also youtube videos in my blog, but it's not processing them in the feed reader and everything is just text. As a matter of fact, I don't even see the code for youtube videos. Thanks!
General: Re: HTML in the Description (member fodaley, 4 Jan '08 - 23:09)
btw for example, sometimes i get \n\n\n in the description when there should be a lot of text, and youtube video...also...all the descriptions are truncated with [...]...i'm trying to reproduce what you would actually see when you go to the blog. I'm testing with a wordpress feed, which uses cdata in the description, which from what I read, should protect the Html, correct?
General: Small bug (member ehsoiueylkfjsegfoieuygfrlkeajrhg, 15 May '07 - 7:35)
Hi, first of all, thanks for writing this article, it was a great help.
I'm extensively using your RSS reader class.
I would just like to mention a small bug here (as your website is down) :
You just forgot to match against the "link" markup, which is why the RssFeed.Link property is always an empty string.
 
Hope this helps.
Thanks again.
 
Mike
KameHouse Prod.
Question: What would I need to do if I wanted to display feeds from more than one source at the same time (member bijhere, 16 Feb '07 - 5:09)
Wonderful piece of code. Small/Easy to understand/Easy to implement/ Easy to modify.
 
I want to do something like the Google Reader where you can choose more than 1 source feeds and display them all together.
 
Could I somehow extend this code to do that?
 
Thanks,
Bij
General: reading only new feeds (member nitstheone, 20 Nov '06 - 8:32)
How can I read only new feed items? Or do I have to read everything, compare it with what I already have, and then either skip it or consume it?
 
How do other RSS readers work?

General: Issues with Special characters (á,é,í,ó,ú,ñ,etc) (member sesmac, 10 Nov '06 - 12:46)
Hi,
I've been looking for an RSS reader and I just found this one, which is exactly what I was looking for, but I have a problem with some characters.
 
For instance, if I get a Mexican RSS channel (http://mx.news.yahoo.com/rss/entretenimiento)
 
I get the following news:
AP - NUEVA YORK (AP) - Freddy Rodríguez, el embalsamador de la exitosa serie televisiva "Six Feet Under", no tuvo escarbar demasiado hondo al prepararse para su papel de Mike Alonso en la película "Harsh Times", que protagoniza junto a Christian Bale y Eva Longoria.
 
And it should be like this:
AP - NUEVA YORK (AP) - Freddy Rodríguez, el embalsamador de la exitosa serie televisiva "Six Feet Under", no tuvo escarbar demasiado hondo al prepararse para su papel de Mike Alonso en la película "Harsh Times", que protagoniza junto a Christian Bale y Eva Longoria.
 
but the special characters are not supported for the Spanish language. Do you have any recommendation to fix it?
 
The source code is very good; in fact it is exactly what I was looking for.
 
I will appreciate your feedback.
 
Thanks!!!!
 
Squalo
General: Re: Issues with Special characters (á,é,í,ó,ú,ñ,etc) (member smallguy78, 15 Nov '06 - 6:11)
Can you change the template so it has
 
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
 
at the top (you'll need to edit the text in the richtextbox on the app). I've updated this in a new version of the app I'll be releasing in the next week or so, which also contains changes that people have suggested here.
General: Thanks! (member jamesjin, 26 Aug '06 - 7:42)
Just to say thanks - your RSS code works perfectly for my needs, with no effort. Superb little class! :-D
General: Adding Enclosure support (here's how) (member hejndorf, 2 Feb '06 - 22:33)
I needed PodCast Enclosure support - simple with this great code:
 
In function getRssItem add
 

case "enclosure":
{
    rssItem.Enclosure.Url = xmlNode.ChildNodes[i].Attributes["url"].InnerText;
    rssItem.Enclosure.Length = long.Parse(xmlNode.ChildNodes[i].Attributes["length"].InnerText);
    rssItem.Enclosure.Type = xmlNode.ChildNodes[i].Attributes["type"].InnerText;
    break;
}

 
Then add a new struct:

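[Serializable()]
public struct RssEnclosure
{
    /// <summary>
    /// The URL of the item.
    /// </summary>
    public string Url;

    /// <summary>
    /// The size of the enclosure in bytes.
    /// </summary>
    public long Length;

    /// <summary>
    /// The MIME type of the enclosure.
    /// </summary>
    public string Type;
}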

 
And finally a reference to it in the RssItem struct:
 

public RssEnclosure Enclosure;


General: Problem making work with .NET 2.0 (member begin_busa, 27 Sep '05 - 0:16)
I am trying to build and run the code for .NET 2.0.
 
The main challenge I am facing is accessing controls from a different thread. I had to add delegate methods wherever that was needed.
 
I'm also having a problem with `Proxy Authentication Required`. It seems to be an issue with the DTD/Schema document. I need to have a copy of the document on the local drive, but it's still not working correctly.
 
I have just spent a couple of hours on this so my comments may be a little shallow.
 
Cheers
 
BEGIN
General: Re: Problem making work with .NET 2.0 (member smarkley, 23 Nov '05 - 9:49)
This seems to be something different in .NET 2.0... :confused:
 
from help:
CheckForIllegalCrossThreadCalls
Property Value
true if calls on the wrong thread are caught; otherwise, false.
Remarks
When a thread other than the creating thread of a control tries to access one of that control's methods or properties, it often leads to unpredictable results. A common invalid thread activity is a call on the wrong thread that accesses the control's Handle property. Set CheckForIllegalCrossThreadCalls to true to find and diagnose this thread activity more easily.

 
So to get it to work, I put a CheckForIllegalCrossThreadCalls = false; in the readRss() method... probably a no no, but it worked... I guess I will be using delegates more often in the future.
 
smarkley

General: Re: Problem making work with .NET 2.0 (member Vasche, 28 Feb '06 - 5:28)
More info on how to solve the issue using delegates, please?
General: Re: Problem making work with .NET 2.0 (member Vasche, 28 Feb '06 - 6:12)
Ok, got it:
 
...
public delegate string GetComboBoxText();

GetComboBoxText _getComboBoxText;

public string MyGetComboBox()
{
    return this.comboBox1.Text;
}

...
private void readRss()
{
    ...

    RssFeed feed;
    if (_getComboBoxText == null)
        _getComboBoxText = new GetComboBoxText(MyGetComboBox);

    string url;
    if (this.InvokeRequired)
    {
        url = (string)this.Invoke(_getComboBoxText);
    }
    else
    {
        url = _getComboBoxText();
    }

    feed = rssReader.Retrieve(url);
    ...
 
Is that the right way?
General: Re: Problem making work with .NET 2.0 (member k^s, 6 Oct '06 - 2:15)
As far as I know, this is the "simple" way, but Microsoft recommends using its BackgroundWorker component.
 
You can see more about this from MSDN.
General: Re: Problem making work with .NET 2.0 (member smallguy78, 15 Nov '06 - 3:55)
if ( this.InvokeRequired )
{
    this.Invoke( new MethodInvoker( readRss ) );
    return;
}

 
put that at the top of private void readRss()
 
I'll update the project files at some point, along with the proxy server additions.
General: Why didn't you add RDF auto detect (suss wiseleyb, 2 Aug '05 - 0:53)
Great code, very useful! I was just wondering why you didn't add a simple:
 
XmlNodeList xnl = xmlDoc.SelectNodes("//rss");
System.Diagnostics.Debug.WriteLine("XNL: " + xnl.Count);
this.RdfMode = (xnl.Count == 0);
 
to the Retrieve(URL) function?
 
-ben
wiseleyb@gmail.com
General: Copy/Paste bug (member mav.northwind, 17 Jan '05 - 5:08)
Hi!
 
I just stumbled across a little copy/paste error in your code in RssHtmlMaker:
In GetHtmlContents() you're replacing %Skipdays% where you should replace %ManagingEditor%.
 
Apart from that: Nice work!
 
Regards,
mav
General: Great RSS article (member PaKettle, 2 Oct '04 - 4:04)
Thank you for the great article, very helpful, very well written code. The only thing I had a problem with is the "link" element at the Channel level was missing. I just added it to the switch statement and it runs fine now.
Question: How do I get more than one item from the rss feed (member bergetun, 29 Jun '04 - 4:00)
I want to get ALL the descriptions into a string array, for example.
 
The code I have now is simple and it only gets the first description from the xml/rdf document.
 
(I just started programming C#, as you can probably see) :)
 
This is how I have it now:
string link;
RssReader rssReader = new RssReader();
rssReader.RdfMode = false;
RssFeed feed = rssReader.Retrieve("xmldocument");
label2.Text = link = feed.Description;
 
As you can see, this only prints the first Description in the xml document, but I want all of them :)
Thank you very much for your help :)
 

Answer: Re: How do I get more than one item from the rss feed (member Rodrigo Dias, 10 May '05 - 6:18)
Use the Items property. It's a collection and you can do a loop to get all the items.
General: Re: How do I get more than one item from the rss feed (member elie aintabi, 10 Jul '08 - 5:52)
how?
Answer: Re: How do I get more than one item from the rss feed (member Antonio Dias, 10 Jul '08 - 6:21)
In .NET 2.0 (because of the generic List) you can do it like this:
 

List<string> descriptions = new List<string>();
foreach (RssItem item in feed.Items)
{
    descriptions.Add(item.Description);
}
 
This will get the Description of All the items in the feed. If you need an Array just do:
descriptions.ToArray();
 
That's it.

General: (407) Proxy Authentication Required. (member hex1848, 29 Mar '04 - 10:25)
An unhandled exception of type 'System.Net.WebException' occurred in system.xml.dll
 
Additional information: The remote server returned an error: (407) Proxy Authentication Required.
 
I get the above error when trying to read the RSS feed from my internal network. Is there an easy way to configure a proxy connection in the code that gets the XML?
General: Re: (407) Proxy Authentication Required. (member smallguy78, 7 Apr '04 - 22:46)
That looks like a problem with the Xml classes, maybe check msdn for using proxy servers.
General: Re: (407) Proxy Authentication Required. (member begin_busa, 27 Sep '05 - 0:57)
The problem is that the XmlReader must be using the 'WebRequest' class internally. As in many companies, all requests have to go through the proxy, so this needs to be configured in the application. I added a static constructor to the RssReader class to initialise the proxy.
 
static RssReader()
{
    WebProxy proxyObject = new WebProxy("http://PROXYADDRESS:80/");
    proxyObject.UseDefaultCredentials = true;
    WebRequest.DefaultWebProxy = proxyObject;
}
 
Hope this helps.
 
BEGIN
Answer: Re: (407) Proxy Authentication Required. (member snort, 11 Sep '07 - 2:14)
You can also use this code in the constructor (works with framework 1.1):
 
WebProxy proxyObject = WebProxy.GetDefaultProxy();
proxyObject.Credentials = CredentialCache.DefaultCredentials;
GlobalProxySelection.Select = proxyObject;

Last Updated 18 Feb 2004
Article Copyright 2004 by yetanotherchris
Everything else Copyright © CodeProject, 1999-2013