I have a collection of web pages that I need to comb through to find values, and I have no idea where to start. I'm pretty new to all this :)

<td><button value="Right" action="Guard" width="80" height="20"></button></td>


From inside this I need to extract the values of the button's value and action attributes.

The approach I tried below works OK for one table on one web page, but the other web pages are structured differently :S

Constructive criticism is welcome :)

What I have tried:

C#
int _Counter1 = webBrowser1.Document.GetElementsByTagName("table")[14].GetElementsByTagName("td").Count;
if (_Counter1 > 0)
{
    for (int index1 = 0; index1 < _Counter1 - 1; index1++)
    {
        try
        {
            string one = webBrowser1.Document.GetElementsByTagName("table")[14].GetElementsByTagName("td")[1 + index1].InnerHtml;
            string two = one.Split('"', '"')[9];
            string three = one.Split('"', '"')[11];

            if (one.Contains("Right") && one.Contains("Guard"))
            {
                richTextBox1.AppendText(three + " " + two + Environment.NewLine);
            }
        }
        catch { }
    }
}
Updated 9-Oct-17 5:10am

First, store the addresses of all the web pages in an array, then visit each page in turn and collect the information you require. In Visual Studio, open the NuGet Package Manager and add the HTML Agility Pack. The links below are for your reference:
Scraping HTML DOM elements using HtmlAgilityPack (HAP) in ASP.NET
Getting Started With HTML Agility Pack
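
For illustration, here is a minimal sketch of how the button attributes might be read with the HTML Agility Pack, assuming the HtmlAgilityPack NuGet package is installed. The class name ButtonScraper, the method ExtractButtons, and the sample HTML string are made up for this example and do not come from the linked articles.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: installed via NuGet

public static class ButtonScraper
{
    // Returns one (value, action) pair for every <button> that has both attributes.
    public static List<KeyValuePair<string, string>> ExtractButtons(string html)
    {
        var results = new List<KeyValuePair<string, string>>();

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection buttons = doc.DocumentNode.SelectNodes("//button[@value and @action]");
        if (buttons == null)
        {
            return results;
        }

        foreach (HtmlNode button in buttons)
        {
            string value = button.GetAttributeValue("value", string.Empty);
            string action = button.GetAttributeValue("action", string.Empty);
            results.Add(new KeyValuePair<string, string>(value, action));
        }

        return results;
    }

    public static void Main()
    {
        // Sample markup taken from the question.
        string html = "<td><button value=\"Right\" action=\"Guard\" width=\"80\" height=\"20\"></button></td>";

        foreach (KeyValuePair<string, string> pair in ExtractButtons(html))
        {
            Console.WriteLine("Value=" + pair.Key + " Action=" + pair.Value);
        }
    }
}
```

Because the XPath looks for button elements anywhere in the document, it should not matter which table or cell they sit in, which may help with pages that are structured differently. To process several pages, each address could be loaded with HtmlWeb.Load(url), which returns an HtmlDocument directly.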
 
 
Try a Regex:
(?<=<td><button value=")(?<Value>.*?)" action="(?<Action>.*?)(?=".*?></button></td>)
That will give you two groups: "Value" and "Action" containing the info.
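
A rough sketch of how that pattern could be applied in C#, using the sample markup from the question. The class name ButtonRegexDemo and the method Extract are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ButtonRegexDemo
{
    // The pattern from above, written as a verbatim string ("" is an escaped quote).
    private static readonly Regex ButtonPattern = new Regex(
        @"(?<=<td><button value="")(?<Value>.*?)"" action=""(?<Action>.*?)(?="".*?></button></td>)");

    // Returns one (value, action) pair per match found in the HTML text.
    public static List<KeyValuePair<string, string>> Extract(string html)
    {
        var results = new List<KeyValuePair<string, string>>();
        foreach (Match match in ButtonPattern.Matches(html))
        {
            results.Add(new KeyValuePair<string, string>(
                match.Groups["Value"].Value,
                match.Groups["Action"].Value));
        }
        return results;
    }

    public static void Main()
    {
        string html = "<td><button value=\"Right\" action=\"Guard\" width=\"80\" height=\"20\"></button></td>";

        foreach (KeyValuePair<string, string> pair in Extract(html))
        {
            Console.WriteLine("Value=" + pair.Key + " Action=" + pair.Value);
            // prints "Value=Right Action=Guard"
        }
    }
}
```

Note that the pattern expects the button to follow the td tag directly and the value attribute to come immediately before action, so markup that orders or spaces the attributes differently will not match.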
 
 
As luck would have it, I solved it a few minutes after asking :S

C#
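// Look at every <button> on the page instead of the cells of one fixed table.
int _Counter1 = webBrowser1.Document.GetElementsByTagName("button").Count;
if (_Counter1 > 0)
{
    // Note: indexing starts at [1], so the first <button> (index 0) is skipped.
    for (int index1 = 0; index1 < _Counter1 - 1; index1++)
    {
        try
        {
            string one = webBrowser1.Document.GetElementsByTagName("button")[1 + index1].OuterHtml;
            // Split the markup on double quotes and pick the 10th and 12th pieces.
            string two = one.Split('"', '"')[9];
            string three = one.Split('"', '"')[11];
            if (three.Length != 0)
            {
                richTextBox1.AppendText(three + " " + two + Environment.NewLine);
            }
            //MessageBox.Show(one);
        }
        // Skip any button whose markup does not split into enough pieces (all exceptions are swallowed).
        catch { }
    }
}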
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


