Hello all,

I have a string that contains the page source of a website.
I need a regular expression to extract some news items from that page source.
The website doesn't have an RSS feed, so I'm having to do it this way.

I think it'll be something like this:
"(?<=(<div id=""newsItem"">)).*?(?=(</div>))"

But I'm very new to regular expressions; I've always steered away from them until now.

Can anyone help with this issue please?

Any replies are greatly appreciated.
Posted 10-Jan-13 22:48pm

Solution 1

Get a copy of Expresso and start writing and testing expressions. There's no better time to learn than when you need it!
Sergey Alexandrovich Kryukov 11-Jan-13 22:03pm
Good advice. And the last sentence is just wise. My 5.

Solution 2


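Your expression works as long as each news item sits on a single line, because "." does not match a newline by default. Use Regex.Match to extract the news from the HTML tags. First, add this at the top of your code file: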
Imports System.Text.RegularExpressions

Then, use this code to get the news from the HTML tags:
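' Sample input: one news item wrapped in a div.
Dim newsAndHtmlTags As String = "<p><div id=""newsItem"">This is news!</div></p>"
' The lookbehind and lookahead keep the surrounding tags out of the match.
Dim pattern As String = "(?<=(<div id=""newsItem"">)).*?(?=(</div>))"
Dim match As System.Text.RegularExpressions.Match = Regex.Match(newsAndHtmlTags, pattern)
' Value is an empty string when nothing matched.
Dim news As String = match.Value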

Hope this helps.
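
A real page will probably hold more than one news item, and an item may span several lines. Here is a minimal sketch of how that could be handled, assuming every item is wrapped in a div exactly like the sample and contains no nested divs. The names NewsExtractor and ExtractNews are made up for this illustration, and the capture groups from the pattern above are dropped because they aren't needed.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module NewsExtractor
    ' Hypothetical helper: returns the inner text of every <div id="newsItem"> block.
    Function ExtractNews(pageSource As String) As String()
        Dim pattern As String = "(?<=<div id=""newsItem"">).*?(?=</div>)"
        ' Singleline lets "." match newlines, so items spanning several lines are found.
        Dim matches As MatchCollection = Regex.Matches(pageSource, pattern, RegexOptions.Singleline)
        ' An upper bound of -1 yields an empty array when nothing matched.
        Dim items(matches.Count - 1) As String
        For i As Integer = 0 To matches.Count - 1
            items(i) = matches(i).Value.Trim()
        Next
        Return items
    End Function

    Sub Main()
        ' Assumed sample input: two items, the first one broken across two lines.
        Dim page As String = "<div id=""newsItem"">First" & Environment.NewLine & "item</div>" &
                             "<div id=""newsItem"">Second item</div>"
        For Each item As String In ExtractNews(page)
            Console.WriteLine(item)
        Next
    End Sub
End Module
```

Regex.Matches returns every non-overlapping match, so there is no need to loop with Match.NextMatch. Because the lazy ".*?" stops at the first closing div tag, an item that contains nested divs would be cut short; for markup like that, an HTML parser is likely to be more reliable than a regular expression.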

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 11 Jan 2013
Copyright © CodeProject, 1999-2018