See more: C# VB.NET WinForm
Hello,
 
Please help me with code that uses HtmlAgilityPack to select all input elements on a form (including select, textarea, etc.) and extract each element's name and type.
 
    Dim htmldoc As HtmlDocument = New HtmlDocument()
    htmldoc.LoadHtml(txtHtml.Text)
    Dim root As HtmlNode = htmldoc.DocumentNode
    If root Is Nothing Then
        tsslStatus.Text = "Error parsing html"
        Return
    End If
    ' SelectNodes returns Nothing when no element matches the XPath
    Dim inputTags As HtmlNodeCollection = root.SelectNodes("//input")
    If inputTags Is Nothing Then
        Return
    End If
    ' parse the page content
    For Each InputTag As HtmlNode In inputTags
        Dim attName As String = Nothing
        Dim attType As String = Nothing
        For Each att As HtmlAttribute In InputTag.Attributes
            Select Case att.Name.ToLower
                Case "name"
                    ' get the element name
                    attName = att.Value
                Case "type"
                    ' get the input type
                    attType = att.Value
            End Select
        Next
        ' skip elements that are missing either attribute
        If attName Is Nothing OrElse attType Is Nothing Then
            Continue For
        End If
        Dim sResult As String = String.Format("Type={0},Name={1}", attType, attName).ToLower
        ' add each type/name pair only once
        If txtResult.Text.Contains(sResult) = False Then
            txtResult.Text &= sResult & vbCrLf
        End If
    Next
 
thanks
Posted 23-Jun-11 9:44am
Edited 24-Jun-11 0:41am

1 solution


Solution 1

    HtmlDocument doc = new HtmlDocument();
    doc.Load("SomePathToAHTMLDocumentHere");
    HtmlNode docNode = doc.DocumentNode;
    HtmlNodeCollection nodes = docNode.SelectNodes("//input"); // SelectNodes takes an XPath expression
    if (nodes != null) // SelectNodes returns null when nothing matches
    {
        foreach (HtmlNode node in nodes)
        {
            // The second argument is the default returned when the attribute is missing
            string id   = node.GetAttributeValue("id", "");    // Fetch id of HTML element
            string name = node.GetAttributeValue("name", "");  // Fetch parameter name (GET/POST)
            string type = node.GetAttributeValue("type", "");  // Fetch type of input element
            // Do your processing now
        }
    }
 
Go to the download site and fetch the CHM file. That will help you.
 
Cheers!
 
--MRB
Comments
Cool Smith at 24-Jun-11 4:55am
   
Hello, thanks, but this only gets the input elements (text box and check box). What about combo boxes (select tag), textarea, list, etc.?
BobJanova at 24-Jun-11 5:56am
   
You said you wanted 'input elements', i.e. things represented by <input>. That should include check boxes, text entry, buttons and radio buttons. To get <textarea> or <select> you will have to call SelectNodes again with the relevant element type.
Manfred R. Bihy at 24-Jun-11 6:12am
   
Use docNode.SelectNodes("//select") or "//textarea" etc. etc.
 
Besides, you should consider voting on my solution and accepting it as the answer.
Cool Smith at 24-Jun-11 6:36am
   
Is there any way I can include an OR or AND in the XPath query? Something like:
 
docNode.SelectNodes("//select OR //textarea OR //input")
 
thanks
Manfred R. Bihy at 24-Jun-11 7:49am
   
Use the | character to separate the expressions: "//input | //select | //textarea".
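
For anyone landing here later, the union expression above can be put together with the loop from Solution 1 roughly as follows. This is only a sketch: it assumes the HtmlAgilityPack package is referenced, and the class name FormFieldLister, the method ListFields, and the "Type=...,Name=..." output format (borrowed from the question) are illustrative choices, not part of the library. Falling back to the tag name when there is no type attribute is also an assumption made here, since select and textarea do not carry one.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack.
public static class FormFieldLister
{
    // Returns one "type=...,name=..." line per distinct named form control.
    public static List<string> ListFields(string html)
    {
        var results = new List<string>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // The | operator is an XPath union: nodes matching any of the three paths.
        HtmlNodeCollection nodes =
            doc.DocumentNode.SelectNodes("//input | //select | //textarea");
        if (nodes == null)
        {
            return results; // SelectNodes returns null when nothing matches
        }

        foreach (HtmlNode node in nodes)
        {
            string name = node.GetAttributeValue("name", "");
            if (name.Length == 0)
            {
                continue; // assumption: controls without a name are skipped
            }

            // Assumption: when no type attribute is present, report the tag name.
            // This also means an <input> without a type is reported as "input".
            string type = node.GetAttributeValue("type", node.Name);

            string line = string.Format("Type={0},Name={1}", type, name).ToLowerInvariant();
            if (!results.Contains(line))
            {
                results.Add(line); // keep each type/name pair only once
            }
        }
        return results;
    }
}
```

In the WinForms code from the question this could be called as FormFieldLister.ListFields(txtHtml.Text), with the returned lines joined into txtResult.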

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



Last Updated 25 Jun 2011