Hello,

Please help me with code that uses HtmlAgilityPack to select all input elements on a form (including select, textarea, etc.) and extract each element's name and type.

VB
Dim htmldoc As New HtmlDocument()
htmldoc.LoadHtml(txtHtml.Text)
Dim root As HtmlNode = htmldoc.DocumentNode
If root Is Nothing Then
    tsslStatus.Text = "Error parsing html"
    Return
End If
' SelectNodes returns Nothing when no node matches the XPath
Dim inputTags As HtmlNodeCollection = root.SelectNodes("//input")
If inputTags Is Nothing Then
    Return
End If
' parse the page content
For Each inputTag As HtmlNode In inputTags
    Dim attName As String = Nothing
    Dim attType As String = Nothing
    ' collect the name and type attributes of this element
    For Each att As HtmlAttribute In inputTag.Attributes
        Select Case att.Name.ToLower()
            Case "name"
                attName = att.Value
            Case "type"
                attType = att.Value
        End Select
    Next
    ' skip elements that lack either attribute
    If attName Is Nothing OrElse attType Is Nothing Then
        Continue For
    End If
    Dim sResult As String = String.Format("Type={0},Name={1}", attType, attName).ToLower()

    ' append each distinct result only once
    If Not txtResult.Text.Contains(sResult) Then
        txtResult.Text &= sResult & vbCrLf
    End If
Next


thanks
Updated 24-Jun-11 0:41am

1 solution

C#
HtmlDocument doc = new HtmlDocument();
doc.Load("SomePathToAHTMLDocumentHere");
HtmlNode docNode = doc.DocumentNode;
HtmlNodeCollection nodes = docNode.SelectNodes("//input"); // SelectNodes takes an XPath expression
if (nodes != null) // SelectNodes returns null when nothing matches
{
    foreach (HtmlNode node in nodes)
    {
        // The second argument is the default returned when the attribute is missing
        string id   = node.GetAttributeValue("id", "");    // Fetch id of HTML element
        string name = node.GetAttributeValue("name", "");  // Fetch parameter name (GET/POST)
        string type = node.GetAttributeValue("type", "");  // Fetch type of input element
        // Do your processing now
    }
}


Go to the HtmlAgilityPack download site and fetch the CHM help file. That will help you.

Cheers!

--MRB
 
Comments
Cool Smith 24-Jun-11 4:55am    
Hello, thanks, but this only gets the input elements (text box and check box). What about combo boxes (select tag), textarea, list, etc.?
BobJanova 24-Jun-11 5:56am    
You said you wanted 'input elements', i.e. things represented by <input>. That should include check boxes, text entry, buttons and radio buttons. To get <textarea> or <select>, you will have to call SelectNodes again with the relevant element type.
Manfred Rudolf Bihy 24-Jun-11 6:12am    
Use docNode.SelectNodes("//select") or "//textarea" etc. etc.

Besides, you should consider voting on my solution and accepting it as the answer.
Cool Smith 24-Jun-11 6:36am    
Is there any way I can include an OR or AND in the XPath query? Something like:

docNode.SelectNodes("//select OR //textarea OR //input")

thanks
Manfred Rudolf Bihy 24-Jun-11 7:49am    
Use the | character to separate the expressions: "//input | //select | //textarea".
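
For reference, here is a minimal sketch of how the union expression might be put to use. It is an illustration under stated assumptions, not part of the original answer: the class `FormFieldLister`, the method `ListFields`, and the choice to report the tag name as the "type" of <select> and <textarea> (which have no type attribute) are all hypothetical. An <input> without a type attribute is reported as "text", the HTML default.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class FormFieldLister
{
    // Hypothetical helper: returns one (tag, type, name) tuple per form control.
    public static List<Tuple<string, string, string>> ListFields(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var fields = new List<Tuple<string, string, string>>();

        // The | operator forms the union of the three node sets.
        HtmlNodeCollection nodes =
            doc.DocumentNode.SelectNodes("//input | //select | //textarea");
        if (nodes == null)
        {
            return fields; // SelectNodes returns null when nothing matches
        }

        foreach (HtmlNode node in nodes)
        {
            string tag = node.Name; // "input", "select" or "textarea"

            // Assumption: fall back to "text" for <input> and to the tag name
            // for <select>/<textarea> when no type attribute is present.
            string fallbackType = tag == "input" ? "text" : tag;
            string type = node.GetAttributeValue("type", fallbackType);
            string name = node.GetAttributeValue("name", "");

            fields.Add(Tuple.Create(tag, type, name));
        }
        return fields;
    }
}
```

Calling `FormFieldLister.ListFields(txtHtml.Text)` would then give one entry per control in a single pass, instead of one SelectNodes call per element type.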

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


