Hi, I need help with a regular expression; I can't really figure it out myself.
I have a website, for example http://www.website.com, and I use this code to read the content of the page:

StreamReader webSource = new StreamReader(webResponse.GetResponseStream());
string source = webSource.ReadToEnd();

I need a regular expression that finds all the URLs contained in the page that point to the same server. So I need to find all links of this kind:

http://www.website.com/search/84f2fbfcf85129866221a71b7d48f2da/?sCat=124");
http://www.website.com/search/7569ac370abc2aa02cd3e0760c418cc9/?sCat=38");
http://www.website.com/show/?i=2368173&popup=1mp;search=bcc6928a29fe348a30cbfc2dc1aba4ab&place=1");

And I don't want to find links like this one:
http://www.OTHERwebsite.com/search/84f2fbfcf85129866221a71b7d48f2da/?sCat=124&
Can anyone help with this?
Comments
Zoltán Zörgő 26-Dec-12 15:21pm    
- "url`s that are within the server" makes no sense, since you fetch only a single page, without the pages it references.
- You have specified only absolute URLs. Don't you need the relative ones?

I get a syntax error somehow. What is wrong with this?

foreach (Match n in Regex.Matches(source, @(["'])(http://www.website.com/.*?)\1"))
{

Comments
Sergey Alexandrovich Kryukov 6-Nov-13 12:10pm    
Please don't post non-answers as "Solution"; it is considered abuse. You can only get down-votes and abuse reports. Use comments or "Improve question" instead.
—SA
This will find what you have specified:
(["'])(http://www.website.com/.*?)\1
More precisely, it searches for the URL between matching single or double quotes. Group 2 will contain the URL.
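
For use in C#, the pattern has to be written as a proper string literal. The following is only a minimal sketch, assuming the page text is already in a string: in a verbatim string the opening quote comes right after the @, and a literal double quote inside it is written as "". Leaving that opening quote out would explain a syntax error. The dots in the host name are also escaped here, which the pattern above does not do, so that they match only a literal dot. The names LinkFinder and FindSiteLinks are made up for this example, and the test input in Main is built from the URLs in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkFinder
{
    // Group 1 captures the opening quote (single or double),
    // group 2 captures the URL, and \1 requires the same quote to close it.
    private static readonly Regex SiteLink =
        new Regex(@"([""'])(http://www\.website\.com/.*?)\1");

    // Returns every quoted URL in the text that starts with http://www.website.com/
    public static List<string> FindSiteLinks(string source)
    {
        var links = new List<string>();
        foreach (Match m in SiteLink.Matches(source))
        {
            links.Add(m.Groups[2].Value);
        }
        return links;
    }
}

public static class Program
{
    public static void Main()
    {
        // Hypothetical page fragment: one link to the site, one to another host.
        string source =
            "open(\"http://www.website.com/search/84f2fbfcf85129866221a71b7d48f2da/?sCat=124\");" +
            "open('http://www.OTHERwebsite.com/search/84f2fbfcf85129866221a71b7d48f2da/?sCat=124');";

        foreach (string link in LinkFinder.FindSiteLinks(source))
        {
            Console.WriteLine(link);
        }
    }
}
```

With this input only the www.website.com link is printed, because the host name must follow the opening quote directly. Relative links and unquoted URLs are not covered by this sketch.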

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)