See more: VB VB.NET
Hello all,

I have a string containing the page source of a website.
I need a regular expression to extract some news items from that page source.
The website doesn't have an RSS feed, so I have to do it this way.

I think it'll be something like this:
"(?<=(<div id=""newsItem"">)).*?(?=(</div>))"
But I'm very new to regular expressions; I've always steered away from them until now.

Can anyone help with this issue please?

Any replies are greatly appreciated,
Tom.
Posted 10-Jan-13 22:48pm

Solution 1

Get a copy of Expresso and start writing and testing expressions. There's no better time to learn than when you need it!
Comments
Sergey Alexandrovich Kryukov 11-Jan-13 22:03pm
   
Good advice. And the last sentence is just wise. My 5.
—SA

Solution 2

Hi,

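Your expression works for content that sits on a single line (by default, . does not match a newline). Use Regex.Match to pull the news text out from between the HTML tags. First, add this at the top of your code file:
Imports System.Text.RegularExpressions
Then, use this code to get the news from the HTML tags:
' Sample input: the news text sits inside <div id="newsItem">...</div>.
Dim newsAndHtmlTags As String = "<p><div id=""newsItem"">This is news!</div></p>"
' The lookbehind and lookahead keep the tags themselves out of the match.
Dim pattern As String = "(?<=(<div id=""newsItem"">)).*?(?=(</div>))"
Dim match As Match = Regex.Match(newsAndHtmlTags, pattern)
' match.Value is "This is news!" here; it is an empty string when nothing matches.
Dim news As String = match.Value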
Hope this helps.
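
Since the question asks for several news items, the single-match code above can be extended with Regex.Matches. The following is only a minimal sketch under some assumptions: the module name NewsItemDemo, the helper ExtractNewsItems, and the sample page source are made up for illustration, and it assumes the news divs are not nested inside each other. RegexOptions.Singleline is added so that . also matches line breaks inside an item.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module NewsItemDemo
    ' Hypothetical helper: returns the inner text of every <div id="newsItem"> block.
    Function ExtractNewsItems(pageSource As String) As List(Of String)
        ' Same idea as the original pattern, without the extra capture groups.
        Dim pattern As String = "(?<=<div id=""newsItem"">).*?(?=</div>)"
        Dim items As New List(Of String)
        ' Singleline lets "." match newlines, so multi-line items are found too.
        For Each m As Match In Regex.Matches(pageSource, pattern, RegexOptions.Singleline)
            items.Add(m.Value.Trim())
        Next
        Return items
    End Function

    Sub Main()
        ' Made-up sample input with two items, the second spanning two lines.
        Dim pageSource As String =
            "<div id=""newsItem"">First item</div>" & Environment.NewLine &
            "<div id=""newsItem"">Second" & Environment.NewLine & "item</div>"

        For Each item As String In ExtractNewsItems(pageSource)
            Console.WriteLine(item)
        Next
    End Sub
End Module
```

The non-greedy .*? stops at the first closing </div>, so an item that itself contains a nested div would be cut short; for that kind of markup an HTML parser is likely a better fit than a regular expression.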

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 11 Jan 2013
Copyright © CodeProject, 1999-2016