Not only is this very good advice, it is also an excellent read: various reasons not to use RegEx on HTML.
Get the page HTML from a URL using WebClient, strip the HTML using Regex, and export a list of anchors to Excel or XML.
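As a rough illustration of the same pipeline without regular expressions, here is a minimal sketch in Python rather than .NET. It swaps WebClient for urllib and the Regex step for the standard library's HTML parser, and it writes XML only. The names AnchorCollector, fetch_html, extract_anchors and anchors_to_xml are made up for this example and do not come from the original article.

```python
from html.parser import HTMLParser
from urllib.request import urlopen
import xml.etree.ElementTree as ET


class AnchorCollector(HTMLParser):
    """Collects (href, text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None   # href of the anchor currently being read, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an anchor with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors.append((self._href, "".join(self._text).strip()))
            self._href = None


def fetch_html(url):
    """Download a page and decode it (hypothetical stand-in for WebClient)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_anchors(html):
    """Return a list of (href, text) tuples found in the HTML string."""
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    return parser.anchors


def anchors_to_xml(anchors):
    """Serialize the anchors as a small XML document (assumed layout)."""
    root = ET.Element("anchors")
    for href, text in anchors:
        node = ET.SubElement(root, "anchor", href=href)
        node.text = text
    return ET.tostring(root, encoding="unicode")
```

Typical use would be anchors_to_xml(extract_anchors(fetch_html(url))). Because the parser tracks tags as it goes, markup nested inside a link (for example bold text) is handled without any special cases, which is where pattern-based stripping tends to break down.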