Hi,

I am trying to remove HTML tags from an RSS Feed that I am downloading and adding the items from the RSS feed into a listBox.

I know that this can be used to remove HTML tags, but I am not sure how to work it into my code.
C#
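using System.Text.RegularExpressions; // needed for Regex

// Removes anything that looks like a tag: '<', then any characters (including newlines), up to the next '>'.
public string Strip(string text)
{
     return Regex.Replace(text, @"<(.|\n)*?>", string.Empty);
}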

Here is my code. I am trying to remove the HTML tags from the description element.
C#
XElement xmlScan = XElement.Parse(e.Result);

listBox1.ItemsSource = from channel in xmlScan.Descendants("item")
                       select new ScanItem
                       {
                         title = channel.Element("title").Value,
                         description = "Position: " + channel.Element("description").Value
                       };
Updated 13-Feb-11 6:04am

Well, if you are confident of the Strip method, just replace this:

C#
description = "Position: " + channel.Element("description").Value


with this:

C#
description = "Position: " + Strip(channel.Element("description").Value)
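For reference, here is a minimal sketch of how the pieces could fit together end to end. The shape of ScanItem is an assumption (the question does not show the class), and FeedDemo and Load are hypothetical names used only for illustration. The explicit (string) cast replaces .Value so that a missing element yields null instead of throwing; that is a small defensive change, not something the original code does.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Assumed shape of the OP's data object; the real class is not shown in the question.
public class ScanItem
{
    public string title { get; set; }
    public string description { get; set; }
}

// Hypothetical container class for this sketch.
public static class FeedDemo
{
    // Same pattern as in the question: lazily match everything between '<' and '>'.
    public static string Strip(string text)
    {
        return Regex.Replace(text, @"<(.|\n)*?>", string.Empty);
    }

    // Parses the downloaded feed text and builds the items to bind to the ListBox.
    public static ScanItem[] Load(string rssXml)
    {
        XElement xmlScan = XElement.Parse(rssXml);

        return (from channel in xmlScan.Descendants("item")
                select new ScanItem
                {
                    // The cast returns null when the element is missing.
                    title = (string)channel.Element("title"),
                    // Strip is applied only to the description, as in the answer above.
                    description = "Position: " +
                        Strip((string)channel.Element("description") ?? string.Empty)
                }).ToArray();
    }
}
```

In the OP's handler this would be used as listBox1.ItemsSource = FeedDemo.Load(e.Result). Note that the regex only removes tags; character entities such as &amp;nbsp; that survive XML parsing are left in the text.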
Comments
Sergey Alexandrovich Kryukov 13-Feb-11 12:20pm
   
Nishant, frankly, this is too ad hoc. XML is XML and data binding is data binding, so why go into peculiarities, especially ones based on hard-coded immediate string constants?
--SA
Nish Nishant 13-Feb-11 12:23pm
   
I believe the OP is trying to populate the data object that he binds to his ListBox, and he does not want to use the raw data directly (because of the HTML tags). I believe he only wants to filter the description, though, which suggests that for the actual RSS body he may be using an HTML-capable control.
Sergey Alexandrovich Kryukov 13-Feb-11 12:26pm
   
I answered how to do this in a regular way.
--SA
Nish Nishant 13-Feb-11 12:28pm
   
SA, based on the OP's code, the ListBox is not for any hierarchical data. The ListBox is merely there to show the RSS entry descriptions. For example, just the post titles in a blog.
Sergey Alexandrovich Kryukov 13-Feb-11 14:38pm
   
Agreed. In my answer I did not deny this possibility.
--SA
Manfred Rudolf Bihy 13-Feb-11 12:23pm
   
Nishant states a prerequisite, so the answer is OK. It's for the OP to decide whether he trusts his Strip method.
Nish Nishant 13-Feb-11 12:26pm
   
Yes, and honestly I was surprised that the OP did not know how to call the Strip method. It suggests that perhaps, just perhaps, the code he's using is not his own. Otherwise I find it extremely odd that he could not have figured this out on his own, when he seems to have had no trouble coming up with a fairly good LINQ query.
Manfred Rudolf Bihy 13-Feb-11 12:50pm
   
Most likely :)
First, RSS is based on XML, not HTML. It's best to parse the whole feed instead of removing the tags.

Your attempt to use RSS as the data source for a list box clashes with the fact that a list box has no hierarchical structure. So you first need to decide how you want to map the tree-like hierarchical structure onto the linear list box structure. (I would suggest doing something else, such as making RSS the source for a more suitable TreeView.)

In all cases, you should parse the whole RSS feed, starting from the top element.

If you still want to use ListBox and want to map just the elements of one level or of one kind, one option is System.Xml.XmlReader. It is a forward-only reader that does not build an in-memory tree, which typically makes it the fastest way to parse, and it is especially good if you want to ignore a lot of the data.
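A rough sketch of the XmlReader route, assuming a plain RSS 2.0 feed with no namespace on item, title and description. FeedEntry and RssReaderSketch are hypothetical names for illustration only. The one thing to watch is that ReadElementContentAsString already moves the reader past the closing tag, so the loop must not call Read again after it.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical holder for the two values we keep from each <item>.
public class FeedEntry
{
    public string Title { get; set; }
    public string Description { get; set; }
}

public static class RssReaderSketch
{
    public static List<FeedEntry> ReadItems(string rssXml)
    {
        var entries = new List<FeedEntry>();
        FeedEntry current = null; // non-null only while inside an <item>

        using (XmlReader reader = XmlReader.Create(new StringReader(rssXml)))
        {
            while (!reader.EOF)
            {
                bool isElement = reader.NodeType == XmlNodeType.Element;

                if (isElement && reader.Name == "item")
                {
                    // An empty <item/> has no end tag, so do not open an entry for it.
                    current = reader.IsEmptyElement ? null : new FeedEntry();
                    reader.Read();
                }
                else if (current != null && isElement && reader.Name == "title")
                {
                    // Leaves the reader on the node after </title>; no extra Read().
                    current.Title = reader.ReadElementContentAsString();
                }
                else if (current != null && isElement && reader.Name == "description")
                {
                    current.Description = reader.ReadElementContentAsString();
                }
                else if (reader.NodeType == XmlNodeType.EndElement && reader.Name == "item")
                {
                    if (current != null) entries.Add(current);
                    current = null;
                    reader.Read();
                }
                else
                {
                    // Everything else (channel metadata, whitespace, other tags) is skipped.
                    reader.Read();
                }
            }
        }
        return entries;
    }
}
```

The channel-level title is ignored because no item is open when it is read. The description comes back with entities already resolved, so any embedded HTML markup is still present in the string at this point.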

Alternatively, you can stay with the System.Xml.Linq way of handling XML; again, you should parse the whole RSS feed, starting from System.Xml.Linq.XDocument or XElement. For the record, you can always use DOM-based parsing with System.Xml.XmlDocument; I would say that is probably the least recommended option for your purposes.
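As an illustration of starting from the top element with System.Xml.Linq, here is a small sketch that walks rss, then channel, then item, instead of jumping straight to the items. It assumes an RSS 2.0 layout without namespaces; RssDocumentSketch and ItemLines are hypothetical names, and the "(untitled)" placeholder is my own choice.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class RssDocumentSketch
{
    // Returns one flat line per item, prefixed with the channel title,
    // which is one possible way to map the hierarchy onto a linear list.
    public static List<string> ItemLines(string rssXml)
    {
        XDocument doc = XDocument.Parse(rssXml);

        // Top element first: anything other than <rss> is not handled here.
        XElement root = doc.Root;
        if (root.Name.LocalName != "rss")
            return new List<string>();

        var lines = new List<string>();
        foreach (XElement channel in root.Elements("channel"))
        {
            string channelTitle = (string)channel.Element("title") ?? "(untitled)";

            // Elements() looks only at direct children, unlike Descendants().
            lines.AddRange(channel.Elements("item").Select(item =>
                channelTitle + ": " + ((string)item.Element("title") ?? "(untitled)")));
        }
        return lines;
    }
}
```

The result can be assigned to a ListBox ItemsSource as it is; for a TreeView you would keep the channel and item levels separate rather than flattening them into strings.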

--SA
Comments
Manfred Rudolf Bihy 13-Feb-11 12:25pm
   
Better yet! 5+

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)