Hi,
I want to extract all the categories and subcategories from the page at this link. Can anyone tell me how to do this, or at least give me a hint?
http://www.codeproject.com/script/Content/SiteMap.aspx
I'm practising with WebClient, so please only give me a regex for it. All categories and subcategories should be gathered.
The following is the code I have written so far. I only want a regex to pass to it so that it gathers all the data:
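using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

public class ExtractHtml
{
    // Raw HTML of the downloaded page.
    private readonly string code;

    // Downloads the page at the given URL and stores its HTML in 'code'.
    public ExtractHtml(String url)
    {
        using (WebClient client = new WebClient())
        using (Stream strm = client.OpenRead(url))
        using (StreamReader strrdr = new StreamReader(strm, Encoding.ASCII))
        {
            code = strrdr.ReadToEnd();
        }
    }

    // Returns the full matched text of every match of 'regex' in the HTML.
    public List<string> Extract(String regex)
    {
        List<string> lines = new List<string>();
        Regex rgx = new Regex(regex, RegexOptions.IgnoreCase);
        MatchCollection cl = rgx.Matches(code);
        foreach (Match item in cl)
        {
            lines.Add(item.Value);
        }
        return lines;
    }
}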
I want to extract the link text from HTML like this:
<a id="ctl00_MC_TCRp_ctl00_TCNL" href="/Chapters/1/Desktop-Development.aspx">Desktop Development</a>
<li>
<a id="ctl00_MC_TCRp_ctl00_TSRp_ctl01_TSNL" href="/KB/buttons/">Button Controls</a>
</li>
<li>
<a id="ctl00_MC_TCRp_ctl00_TSRp_ctl02_TSNL" href="/KB/clipboard/">Clipboard</a>
</li>
<li>
<a id="ctl00_MC_TCRp_ctl00_TSRp_ctl03_TSNL" href="/KB/combobox/">Combo & List Boxes</a>
</li>
<li>
<a id="ctl00_MC_TCRp_ctl00_TSRp_ctl04_TSNL" href="/KB/dialog/">Dialogs and Windows</a>
</li>
<li>
<a id="ctl00_MC_TCRp_ctl00_TSRp_ctl05_TSNL" href="/KB/gadgets/">Desktop Gadgets</a>
</li>
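To make the goal concrete, here is a minimal sketch of the kind of pattern that might fit the sample above. It assumes, based only on this sample and not checked against the whole page, that category anchors have ids ending in TCNL and subcategory anchors have ids ending in TSNL, and that id comes before href in every tag. SiteMapPattern and Parse are hypothetical names used for illustration. It reads named capture groups instead of the whole match, because the whole match would be the entire <a> tag rather than just its text.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class SiteMapPattern
{
    // Assumption: ids ending in _TCNL mark categories, _TSNL mark subcategories.
    public const string Pattern =
        @"<a\s+id=""[^""]*_(?<kind>TCNL|TSNL)""\s+href=""(?<href>[^""]*)""\s*>(?<text>[^<]*)</a>";

    // Returns one "Category: ..." or "Subcategory: ..." entry per matching anchor.
    public static List<string> Parse(string html)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(html, Pattern, RegexOptions.IgnoreCase))
        {
            bool isCategory = m.Groups["kind"].Value.ToUpperInvariant() == "TCNL";
            // Decode entities such as &amp; in case the page source contains them.
            string text = WebUtility.HtmlDecode(m.Groups["text"].Value).Trim();
            result.Add((isCategory ? "Category: " : "Subcategory: ") + text);
        }
        return result;
    }
}
```

Calling SiteMapPattern.Parse on the downloaded HTML should list entries in page order, so each subcategory follows the category it belongs to. If the real markup differs from the sample (attribute order, single quotes, nested tags inside the link), the pattern would need adjusting.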