HtmlDocument Introspection in Treeview

zebulon75018, 8 Feb 2009, CPOL
HtmlDocument introspection in a TreeView, showing the HTML, forms, links, images and CSS of a web page.

HtmlIntrospection

Introduction

After my article XML Introspection and TreeView, I took a look at the WebBrowser component and discovered that it exposes an HtmlDocument through its Document property (webBrowser1.Document). This is a good way to get information about a web page without parsing the HTML yourself: the WebBrowser component does it for you.

Background 

Here I want to present a small application that displays a web page (the CodeProject page in the screenshot) and extracts information from its HtmlDocument (a tree of HtmlElement objects).

The elements are shown in a TreeView, with a preview of the selected element at the top right and its properties in a PropertyGrid at the bottom right.

Using the code 

Enter a URL in the text box and press the Go button.

When the web page has loaded, the event handler webBrowser1_DocumentCompleted is called.

There we collect all the HTML tags of the body, plus the forms, links, images and CSS.
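As a rough illustration of that flow, here is a minimal sketch of a form that navigates to a page and reads the HtmlDocument once loading completes. The class name IntrospectionForm and the idea of reporting the counts in the title bar are assumptions made for this example; the article's application fills a TreeView instead.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical minimal form; the article's real form also hosts a TreeView and a PropertyGrid.
class IntrospectionForm : Form
{
    private readonly WebBrowser webBrowser1 = new WebBrowser { Dock = DockStyle.Fill };

    public IntrospectionForm(string url)
    {
        Controls.Add(webBrowser1);
        webBrowser1.DocumentCompleted += webBrowser1_DocumentCompleted;
        webBrowser1.Navigate(url);
    }

    private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        HtmlDocument doc = webBrowser1.Document;
        if (doc == null) return;

        // The event can fire once per frame, so this line may run more than once per page.
        Text = string.Format("{0} forms, {1} links, {2} images",
            doc.Forms.Count, doc.Links.Count, doc.Images.Count);
    }

    [STAThread]
    static void Main()
    {
        Application.Run(new IntrospectionForm("http://www.codeproject.com/"));
    }
}
```

The Forms, Links and Images collections used here are the same ones the article's fill methods walk through.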

For each type there is a dedicated method:

private void FillTree(HtmlElement hElmFather, TreeNodeHtmlElm t, TreeNodeHtmlElm.TypeNode type)

private void FillTreeForm(HtmlDocument doc, TreeNodeHtmlElm t)
{
    // doc.Forms is an HtmlElementCollection, so foreach replaces the explicit enumerator
    foreach (HtmlElement form in doc.Forms)
    {
        FillTree(form, t, TreeNodeHtmlElm.TypeNode.Form);
    }
}

private void FillTreeLink(HtmlDocument doc, TreeNodeHtmlElm t)
// To find every link: string textToAdd = e.GetAttribute("href"); where e is an HtmlElement

private void FillTreeImage(HtmlDocument doc, TreeNodeHtmlElm t)
// To find every image: string textToAdd = e.GetAttribute("src");

In both cases a temporary array is used so that the same image or link is not added twice.

private void FillTreeCss(HtmlDocument doc, TreeNodeHtmlElm t)

For the CSS, the test is:

    if (e.TagName.ToLower() == "link")
    {
        if (e.GetAttribute("rel").ToLower() == "stylesheet")


The information is thus structured in a TreeView; each node is an instance of the class TreeNodeHtmlElm, which derives from TreeNode.
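The link, image and CSS lookups described above could be sketched as follows. This is not the article's code: the class HtmlDocumentQueries and its method names are hypothetical, and a HashSet is swapped in for the temporary array to skip duplicates.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

// Hypothetical helpers, written only to illustrate the lookups.
static class HtmlDocumentQueries
{
    // Returns each distinct, non-empty value of the given attribute, in document order.
    public static List<string> DistinctAttribute(HtmlElementCollection elements, string attribute)
    {
        var seen = new HashSet<string>();
        var result = new List<string>();
        foreach (HtmlElement e in elements)
        {
            string value = e.GetAttribute(attribute);
            // HashSet.Add returns false when the value was already present.
            if (!string.IsNullOrEmpty(value) && seen.Add(value))
            {
                result.Add(value);
            }
        }
        return result;
    }

    // Returns the href of every <link rel="stylesheet"> element, using the same test as the article.
    public static List<string> StylesheetUrls(HtmlDocument doc)
    {
        var result = new List<string>();
        foreach (HtmlElement e in doc.GetElementsByTagName("link"))
        {
            if (string.Equals(e.GetAttribute("rel"), "stylesheet", StringComparison.OrdinalIgnoreCase))
            {
                result.Add(e.GetAttribute("href"));
            }
        }
        return result;
    }
}
```

With a loaded document, DistinctAttribute(doc.Links, "href") and DistinctAttribute(doc.Images, "src") would give the unique link and image URLs. Like the original test, the stylesheet check only matches a rel value that is exactly "stylesheet".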

Points of Interest 

I found it interesting to explore a web page this way; it is a different way of looking at one.

I had a problem with the TreeView: the text of a node could be huge, and the application became really slow when tooltips appeared, so I limit the text to 100 characters:

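            public TreeNodeHtmlElm(HtmlElement elm, TypeNode t) : base()
            {
                type = t;
                mHtmlElement = elm;
                try
                {
                    if (string.IsNullOrEmpty(elm.OuterText))
                    {
                        // No visible text: fall back to the markup (not truncated here)
                        Text = elm.OuterHtml;
                    }
                    else if (elm.OuterText.Length > 100)
                    {
                        // Keep only the first 100 characters so tooltips stay fast
                        Text = elm.OuterText.Substring(0, 100);
                    }
                    else
                    {
                        Text = elm.OuterText;
                    }
                }
                catch (Exception)
                {
                    // If reading the element's text throws, use an empty label
                    Text = "";
                }
            }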
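The length limit can also be isolated in a small helper. The sketch below is an assumption for illustration, not part of the article's source: NodeText and Truncate are hypothetical names. Unlike the constructor, it would let the same limit apply to the OuterHtml fallback.

```csharp
using System;

// Hypothetical helper, not part of the article's source.
static class NodeText
{
    // Returns at most maxLength characters of value; null or empty input gives an empty string.
    public static string Truncate(string value, int maxLength)
    {
        if (string.IsNullOrEmpty(value))
        {
            return string.Empty;
        }
        return value.Length <= maxLength ? value : value.Substring(0, maxLength);
    }
}
```

Inside the constructor this could be used as Text = NodeText.Truncate(string.IsNullOrEmpty(elm.OuterText) ? elm.OuterHtml : elm.OuterText, 100);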

If you click on a tree node, the application shows a preview of that piece of HTML in the window at the top right.

You can also right-click to open a context menu and save the text of the subnodes (SaveTreeNodeHtml).

This does not work for images: only the URL of the image is saved, not the image itself. It could be interesting in another version to download and save the images, and likewise the CSS files.

Please take a look at my other pages:

http://www.cmb-soft.com/, a CSS editor

My homepage: http://vidalcharles.free.fr/

I'm looking for a job; if anybody has a job offer, please email me at charles.vidal(at)gmail.com. Thanks.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

zebulon75018
Software Developer (Senior) http://www.cmb-soft.com/
France

Last Updated 9 Feb 2009
Article Copyright 2009 by zebulon75018