I have a small problem: how do I find all the links on a webpage loaded in a WebBrowser control that start with: "".

E.g., in a form I have a control which navigates to a page. That page has some links starting with the above string, but randomly distributed over the page.

WebBrowser1.Document.Links lists all the links of a webpage, but how do I proceed in my case?

This is where you can start using regular expressions.
You can try something like this:
// requires: using System.Text.RegularExpressions;
string mySearchPattern = @"^http://(\w+\.)?thedomain\.com";
if (Regex.IsMatch(inputString, mySearchPattern))
{
    // do something
}

What this regex (I hope) tells the regex engine is:
Does the input string start with http://, optionally followed by one or more word characters and a dot, and then thedomain.com? Note that ^ anchors the match to the start of the string, and the ? has to follow the whole group to make the subdomain optional.
I'm saying I hope this is what it does because I don't have a lot of experience with regular expressions.
But even if this is not an exact solution, if you manage to tame even the basic capabilities of regular expressions, you'll never look back at string searches and replacements.
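
To illustrate the idea, here is a minimal sketch that applies such a pattern to a list of URL strings and keeps only the matching ones. The class name LinkFilter, the method FilterLinks, and the sample URLs are made up for illustration, and thedomain.com is only a placeholder for the real site.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkFilter
{
    // Placeholder pattern: "http://", an optional subdomain such as "my.",
    // then thedomain.com. Replace thedomain.com with the real site.
    private static readonly Regex Pattern =
        new Regex(@"^http://(\w+\.)?thedomain\.com", RegexOptions.IgnoreCase);

    // Returns only the URLs that start with the pattern above.
    public static List<string> FilterLinks(IEnumerable<string> urls)
    {
        var matches = new List<string>();
        foreach (string url in urls)
        {
            if (url != null && Pattern.IsMatch(url))
            {
                matches.Add(url);
            }
        }
        return matches;
    }

    public static void Main()
    {
        var urls = new[]
        {
            "http://my.thedomain.com/page1",
            "http://thedomain.com/",
            "http://example.org/thedomain.com"
        };

        // Prints the first two URLs; the third does not start with the pattern.
        foreach (string url in FilterLinks(urls))
        {
            Console.WriteLine(url);
        }
    }
}
```

In the WebBrowser case, the strings passed to FilterLinks would presumably be the href values of the elements in WebBrowser1.Document.Links.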
Try this link for a complete overview of using regular expressions:

Hope this helps,
amit_upadhyay 18-Jul-10 12:46pm    
thanks works perfectly
There are quite a few ways to get the links.
You might use a Regular expression to search in DocumentText.

You can loop through the elements of the document (for example via Document.Forms) and keep only the anchor tags.

You can use GetElementById to do the same if the anchors you need have known ids.

Either way you can enumerate the links in your document.
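
As a rough sketch of the first option, the snippet below pulls href values out of an HTML string with a regular expression. In the WebBrowser case the string would presumably come from WebBrowser1.DocumentText. The names HrefExtractor and ExtractHrefs and the sample HTML are made up for illustration, and the pattern is deliberately simple: it only handles quoted href attributes inside <a> tags, so a real HTML parser would be more robust.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HrefExtractor
{
    // Simplified pattern: an <a ...> tag with href="..." or href='...'.
    // Unquoted attribute values are not handled.
    private static readonly Regex HrefPattern = new Regex(
        @"<a\b[^>]*?\bhref\s*=\s*[""']([^""']*)[""']",
        RegexOptions.IgnoreCase);

    // Returns the href value of every anchor tag found in the HTML text.
    public static List<string> ExtractHrefs(string html)
    {
        var hrefs = new List<string>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            hrefs.Add(m.Groups[1].Value);
        }
        return hrefs;
    }

    public static void Main()
    {
        string html =
            "<p><a href=\"http://my.thedomain.com/x\">x</a> " +
            "<a class='c' href='http://other.org/'>y</a></p>";

        foreach (string href in ExtractHrefs(html))
        {
            Console.WriteLine(href);
        }
    }
}
```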
amit_upadhyay 18-Jul-10 12:19pm    
I can get all the links, but how do I find the ones starting with a particular pattern? It is more of a string problem: find a substring starting with a given pattern.
Andrei Scripniciuc 18-Jul-10 12:24pm    
Reason for my vote of 4
Lists a lot of good solutions but no sample code.
For checking which strings start with a particular substring just use:
    //do something
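
One plausible way to do such a prefix check in C# is String.StartsWith. The sketch below is only an illustration, not necessarily the code originally posted: the names PrefixFilter and StartingWith are hypothetical, and http://my.thedomain.com stands in for the real prefix.

```csharp
using System;
using System.Collections.Generic;

public static class PrefixFilter
{
    // Returns the links that begin with the given prefix, ignoring case.
    public static List<string> StartingWith(IEnumerable<string> links, string prefix)
    {
        var result = new List<string>();
        foreach (string link in links)
        {
            if (link != null &&
                link.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            {
                result.Add(link);
            }
        }
        return result;
    }

    public static void Main()
    {
        var links = new[]
        {
            "http://my.thedomain.com/a",
            "http://other.org/b",
            "HTTP://MY.THEDOMAIN.COM/c"
        };

        // Prints the first and third links.
        foreach (string link in StartingWith(links, "http://my.thedomain.com"))
        {
            Console.WriteLine(link);
        }
    }
}
```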
amit_upadhyay 18-Jul-10 12:27pm    
Can you modify it a little to get this:
Suppose I know "" from "http:/" (I don't know "my."). How can I then get the unknown part? I suppose you remove the first string from the second, but how?
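
A minimal sketch of one way to do this, assuming the known pieces are a prefix such as "http://" and a suffix such as "thedomain.com" (both placeholders, as are the names MiddleExtractor and UnknownPart): check the prefix, locate the known suffix with IndexOf, and take the text in between with Substring.

```csharp
using System;

public static class MiddleExtractor
{
    // Returns the text between knownPrefix and the first occurrence of
    // knownSuffix after it, or null if the URL does not have that shape.
    public static string UnknownPart(string url, string knownPrefix, string knownSuffix)
    {
        if (!url.StartsWith(knownPrefix, StringComparison.OrdinalIgnoreCase))
        {
            return null;
        }

        int start = knownPrefix.Length;
        int end = url.IndexOf(knownSuffix, start, StringComparison.OrdinalIgnoreCase);
        if (end < 0)
        {
            return null;
        }

        return url.Substring(start, end - start);
    }

    public static void Main()
    {
        // Prints "my."
        Console.WriteLine(
            UnknownPart("http://my.thedomain.com/page", "http://", "thedomain.com"));
    }
}
```

The same result could be obtained from the capture group of a regular expression such as ^http://(\w+\.)?thedomain\.com, at the cost of a slightly less readable pattern.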

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
