See more: VB.NET
Hello all,
 
I have a string that contains the page source of a website.
I need a regular expression to extract some news items from that page source.
The website doesn't have an RSS feed, so I have to do it this way.
 
I think it'll be something like this:
"(?<=(<div id=""newsItem"">)).*?(?=(</div>))"
But I'm very new to regular expressions; I've always steered away from them until now.
 
Can anyone help with this issue please?
 
Any replies are greatly appreciated,
Tom.
Posted 10-Jan-13 21:48pm

Solution 1

Get a copy of Expresso and start writing and testing expressions. There's no better time to learn than when you need it!
Comments
Sergey Alexandrovich Kryukov at 11-Jan-13 22:03pm
   
Good advice, and the last sentence is simply wise. My 5.
—SA

Solution 2

Hi,
 
Your expression is correct. Use Regex.Match to extract the news text from between the HTML tags. First, add this at the top of your code file:
Imports System.Text.RegularExpressions
Then, use this code to get the news from the HTML tags:
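' Sample input: one news item wrapped in the target div.
Dim newsAndHtmlTags As String = "<p><div id=""newsItem"">This is news!</div></p>"
' The lookbehind and lookahead keep the surrounding tags out of the match.
Dim pattern As String = "(?<=(<div id=""newsItem"">)).*?(?=(</div>))"
Dim match As Match = Regex.Match(newsAndHtmlTags, pattern)
' match.Value is an empty string when nothing matched; check match.Success to tell the cases apart.
Dim news As String = match.Value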
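Since the question asks for several news items, here is a minimal sketch of how the same pattern might be applied to every item on the page with Regex.Matches. The module name NewsExtractor, the helper ExtractNewsItems, and the sample page source are made up for illustration. It also assumes that an item may span more than one line, which is why RegexOptions.Singleline is passed: by default "." does not match a line break.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module NewsExtractor
    ' Hypothetical helper: returns the inner text of every <div id="newsItem"> found in the page source.
    Function ExtractNewsItems(pageSource As String) As List(Of String)
        Dim pattern As String = "(?<=<div id=""newsItem"">).*?(?=</div>)"
        Dim items As New List(Of String)
        ' Singleline lets "." match line breaks, so items that span several lines are found too.
        For Each m As Match In Regex.Matches(pageSource, pattern, RegexOptions.Singleline)
            items.Add(m.Value.Trim())
        Next
        Return items
    End Function

    Sub Main()
        ' Made-up sample input: two items, the second one split across two lines.
        Dim pageSource As String =
            "<div id=""newsItem"">First item</div>" & vbCrLf &
            "<div id=""newsItem"">Second" & vbCrLf & "item</div>"
        For Each item As String In ExtractNewsItems(pageSource)
            Console.WriteLine(item)
        Next
    End Sub
End Module
```

Note that the lazy .*? stops at the first </div> it meets, so this approach only works if a news item does not contain nested div elements, and only while the site keeps the opening tag written exactly as in the pattern.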
Hope this helps.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



Last Updated 11 Jan 2013
Copyright © CodeProject, 1999-2014