I am pulling data from a website where it is organized in a table. The first two rows look like this (I deleted some style info):
<table id="loads">
<thead>
<tr class="tableHeading">
<th><a original='Load ID'></a></th>
<th><a original='# of cars'></a></th>
<th><a original='Year/Make/Model'></a></th>
<th><a original='Origin City'></a></th>
<th><a original='Origin State'></a></th>
<th><a original='Destination City'></a></th>
<th><a original='Destination State'></a></th>
<th><a original='Mileage'></a></th>
<th><a original='Price per Shipment'></a></th>
<th><a original='Price per Mile'></a></th>
<th>View</th>
<th><a original='Comments'></a></th>
</tr>
</thead>
<tbody>
<tr>
<td>123456789</td>
<td>1</td>
<td>2015 GMC TERRAIN SLE</td>
<td>Los Angeles</td>
<td>CA</td>
<td>San Francisco</td>
<td>CA</td>
<td>400</td>
<td>$400</td>
<td>$1</td>
<td>
<a href="/ViewLoad.asp?nload_id=123456789&npickup_code=">
<img src="/images/icons/view.gif" >
</a>
</td>
<td>Some Text</td>
</tr>
There are 12 cells per row. All of them contain plain text except the 11th, which holds a link (an <a> wrapping an <img>), and that cell is one of the main reasons I am posting this question.
What I have tried:
I created a class that has 13 string properties. The extra one (which I made the first) is a Status property, which will be New or Old. Later I am going to do some things with New rows, but that is not my issue right now.
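To make the shape of that class concrete, here is a trimmed-down sketch of it. Only Status, Comments and the string[] constructor match names that appear in my real code; every other property name is a placeholder I made up for this post, and comparing loads by Load ID is an assumption about what "the same row" should mean. I included Equals because, as far as I know, BindingList<T>.Contains and IndexOf fall back to Equals.

```csharp
using System;

// Sketch of the row model. Property names other than Status and Comments
// are placeholders, not the names used in the real project.
public class Load
{
    public string Status { get; set; }
    public string LoadId { get; set; }
    public string CarCount { get; set; }
    public string YearMakeModel { get; set; }
    public string OriginCity { get; set; }
    public string OriginState { get; set; }
    public string DestinationCity { get; set; }
    public string DestinationState { get; set; }
    public string Mileage { get; set; }
    public string PricePerShipment { get; set; }
    public string PricePerMile { get; set; }
    public string ViewLink { get; set; }
    public string Comments { get; set; }

    // Expects 13 strings: Status first, then the 12 cells in table order.
    public Load(string[] rowData)
    {
        if (rowData == null || rowData.Length != 13)
            throw new ArgumentException("Expected 13 values.", nameof(rowData));

        Status = rowData[0];
        LoadId = rowData[1];
        CarCount = rowData[2];
        YearMakeModel = rowData[3];
        OriginCity = rowData[4];
        OriginState = rowData[5];
        DestinationCity = rowData[6];
        DestinationState = rowData[7];
        Mileage = rowData[8];
        PricePerShipment = rowData[9];
        PricePerMile = rowData[10];
        ViewLink = rowData[11];
        Comments = rowData[12];
    }

    // Assumption: two loads are the same row when their Load IDs match.
    public override bool Equals(object obj)
    {
        return obj is Load other
            && string.Equals(LoadId, other.LoadId, StringComparison.Ordinal);
    }

    // Keep the hash code consistent with Equals.
    public override int GetHashCode()
    {
        return LoadId == null ? 0 : LoadId.GetHashCode();
    }
}
```

With equality defined this way, a row whose comment changed still compares equal to the stored one, which is what the Contains and IndexOf calls further down depend on.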
So now I want to grab the InnerText of each cell (except the 11th) and put the strings into an array. Here are my steps:
string collect = webBrowser1.Document.Body.InnerHtml;
string data = WebUtility.HtmlDecode(collect);
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
htmlDoc.LoadHtml(data);
HtmlNodeCollection rows = htmlDoc.DocumentNode.SelectNodes("//table[@id='loads']//tbody//tr");
Note: I have verified everything up to this point. All of it works, and the rows collection contains every row in the table except the header (I only showed one non-header row above, but there are many).
On the next step I get lost. I am trying to get the cell strings into a string array, and then into a BindingList that is declared at the form level:
BindingSource source = new BindingSource();
BindingList<Load> list = new BindingList<Load>();
BindingList<Load> listDeleted = new BindingList<Load>();
List<Load> sortList = new List<Load>();
Here is my code:
int rowIndex = 0;
foreach (HtmlNode row in rows)
{
    int columnIndex = 0;
    string[] rowData = new string[13];
    foreach (HtmlNode cell in row.ChildNodes)
    {
        if (columnIndex != 0 && columnIndex != 11)
        {
            rowData[columnIndex - 1] = cell.InnerText;
        }
        rowData[11] = cell.FirstChild.Attributes["href"].Value;
        MessageBox.Show(rowData[11]);
        columnIndex++;
    }
    Load newLoad = new Load(rowData);
    if (!list.Contains(newLoad) && !listDeleted.Contains(newLoad))
    {
        list.Add(newLoad);
        updated = true;
    }
    else
    {
        int itemIndex = list.IndexOf(newLoad);
        if (itemIndex > 0)
        {
            if (!list[itemIndex].Comments.Equals(newLoad.Comments))
            {
                list[itemIndex].Comments = newLoad.Comments;
                list[itemIndex].Status = "MODIFIED";
                updated = true;
            }
        }
    }
    rowIndex++;
}
I am not sure what I am doing wrong in this last code block, and I would greatly appreciate any help.
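In case it helps narrow things down, here is a minimal sketch of the behaviour I am after, written as a standalone helper. I suspect part of my problem is that row.ChildNodes also returns the whitespace text nodes between the td elements, and that cell.FirstChild is not an <a> element for most cells, so the sketch selects the td elements explicitly and only looks for an href in the 11th cell. LoadTableReader and ReadRows are names I made up for this post, the array layout (Status at index 0, cells at indexes 1 to 12) is my assumption, and "New" is just a placeholder status.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper, not part of my real project.
public static class LoadTableReader
{
    // Returns one 13-element array per body row: index 0 is the status,
    // indexes 1-12 hold the 12 cells, and index 11 holds the link's href.
    public static List<string[]> ReadRows(string html)
    {
        var result = new List<string[]>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection rows =
            doc.DocumentNode.SelectNodes("//table[@id='loads']//tbody//tr");
        if (rows == null) return result;

        foreach (HtmlNode row in rows)
        {
            // "td" is relative to the row, so text nodes are never returned.
            HtmlNodeCollection cells = row.SelectNodes("td");
            if (cells == null || cells.Count != 12) continue; // skip unexpected rows

            var rowData = new string[13];
            rowData[0] = "New"; // placeholder status
            for (int i = 0; i < cells.Count; i++)
            {
                if (i == 10)
                {
                    // 11th cell: keep the link target instead of the cell text.
                    HtmlNode link = cells[i].SelectSingleNode(".//a[@href]");
                    rowData[i + 1] = link == null
                        ? ""
                        : link.GetAttributeValue("href", "");
                }
                else
                {
                    rowData[i + 1] = HtmlEntity.DeEntitize(cells[i].InnerText).Trim();
                }
            }
            result.Add(rowData);
        }
        return result;
    }
}
```

Each returned array could then be passed to new Load(rowData) as in my loop above. Separately, I notice that IndexOf returns -1 when nothing is found, so my itemIndex > 0 check probably skips a match at index 0 as well.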