See more: VB.NET
Hello all,

I have a string containing the page source of a website.
I need a regular expression to extract some news items from that page source.
The website doesn't have an RSS feed, so I have to do it this way.

I think it'll be something like this:
"(?<=(<div id=""newsItem"">)).*?(?=(</div>))"
But I'm very new to regular expressions; I've always steered clear of them until now.

Can anyone help with this issue please?

Any replies are greatly appreciated.
Posted 10-Jan-13 21:48pm

Solution 1

Get a copy of Expresso and start writing and testing expressions. There's no better time to learn than when you need it!
Sergey Alexandrovich Kryukov at 11-Jan-13 22:03pm
Right advice. And the last sentence is just wise. My 5.

Solution 2


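Your expression works as long as each news item sits on a single line and contains no nested <div>; by default "." does not match line breaks, and the lazy ".*?" stops at the first </div>. Use Regex.Match to get the news from between the HTML tags. First, add this at the top of your code file: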
Imports System.Text.RegularExpressions
Then, use this code to get the news from the HTML tags:
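' Sample input; in practice this would be the downloaded page source.
Dim newsAndHtmlTags As String = "<p><div id=""newsItem"">This is news!</div></p>"
' The lookbehind and lookahead keep the surrounding tags out of the match.
Dim pattern As String = "(?<=(<div id=""newsItem"">)).*?(?=(</div>))"
Dim match As Match = Regex.Match(newsAndHtmlTags, pattern)
' match.Value is an empty string if nothing matched; check match.Success to tell the difference.
Dim news As String = match.Value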
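Regex.Match only returns the first hit, and the question asks for several news items. Below is a minimal sketch of one way to collect them all with Regex.Matches. The module name NewsExtractor, the function ExtractNewsItems, and the sample page source are made up for illustration, and it assumes the items never contain a nested <div>. RegexOptions.Singleline is added so that "." also matches line breaks inside an item.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module NewsExtractor

    ' Same lookaround idea as above, without the extra capture groups.
    Private Const NewsPattern As String = "(?<=<div id=""newsItem"">).*?(?=</div>)"

    ' Hypothetical helper: returns the inner text of every newsItem div in pageSource.
    Function ExtractNewsItems(pageSource As String) As List(Of String)
        Dim items As New List(Of String)()
        ' Singleline lets "." match line breaks, so multi-line items are found too.
        For Each m As Match In Regex.Matches(pageSource, NewsPattern, RegexOptions.Singleline)
            items.Add(m.Value.Trim())
        Next
        Return items
    End Function

    Sub Main()
        ' Made-up sample input standing in for the real page source.
        Dim pageSource As String =
            "<div id=""newsItem"">First item</div>" & Environment.NewLine &
            "<div id=""newsItem"">Second" & Environment.NewLine & "item</div>"

        For Each item As String In ExtractNewsItems(pageSource)
            Console.WriteLine(item)
        Next
    End Sub

End Module
```

If the real page nests other div elements inside a news item, the lazy match will stop at the first closing tag and cut the item short; in that case an HTML parser is likely a safer choice than a regular expression.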
Hope this helps.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 11 Jan 2013
Copyright © CodeProject, 1999-2015