Hello dear friends,
I have a problem with HtmlAgilityPack. I have a page like this:
HTML
<div class="bd"><h3 class=""><a class="title main-headline" href="test">abolfazl jason</a></h3></div>


Now I need to access the text inside the a href ("abolfazl jason"). My code is:
VB.NET
Dim doc As New HtmlAgilityPack.HtmlDocument()
doc.LoadHtml("linkurl...")
Dim nodes As HtmlNode = doc.DocumentNode.SelectSingleNode("//div[@class='bd']//a")
Response.Write(nodes.InnerText)


But when the application runs, it throws this error:
Object reference not set to an instance of an object.

Can you help me?

What I have tried:

Problem with HtmlAgilityPack when trying to get the content of an a href
Posted
Updated 11-Sep-16 22:00pm
Comments
[no name] 11-Sep-16 11:23am    
What is it that you need help with? A simple google search would tell you what the error means and how to fix it. A site search would tell you what the error means and how to fix it. Running your code through the debugger would tell you what the error means and how to fix it.
abolfazl133 11-Sep-16 11:27am    
No, it is not simple! I have a serious problem. If you know the answer, please write the correct code!
[no name] 11-Sep-16 13:19pm    
Yes, actually it is that simple.

Your code works. I took your HTML string and saved it to c:\temp\test.html on my local machine.

C#
var html = @"c:\temp\test.html";

var doc = new HtmlDocument();
doc.Load(html);

var nodes = doc.DocumentNode.SelectSingleNode("//div[@class='bd']//a");

Console.WriteLine(nodes.InnerText);


However, I see that in your code you have

C#
doc.LoadHtml("linkurl...")


Are you trying to download the HTML from a URL and then parse it? This may be your problem.
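
To illustrate the question above: LoadHtml parses whatever string it receives as markup and never fetches anything, so a URL passed to it becomes a one-line text document. The following C# sketch (the class and method names are made up for this illustration, and example.com is a placeholder) shows that SelectSingleNode then returns null, which is where the "Object reference not set" error comes from once .InnerText is read:

```csharp
using System;
using HtmlAgilityPack;

public static class LoadHtmlDemo
{
    // Hypothetical helper: returns the link text, or null when the XPath matches nothing.
    public static string ExtractLinkText(string markup)
    {
        var doc = new HtmlDocument();

        // LoadHtml treats its argument as HTML source; it does not download a URL.
        doc.LoadHtml(markup);

        HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@class='bd']//a");

        // SelectSingleNode returns null on no match, so guard before reading InnerText.
        return node == null ? null : node.InnerText;
    }
}
```

Calling ExtractLinkText with the markup from the question should return the link text, while calling it with a URL string such as "http://example.com/page.html" should return null instead of throwing.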

You should use WebClient to download the page first. Something like this:

C#
var client = new WebClient();
client.DownloadFile("http://link/to/your/stuff", @"c:\local\filename.html");


You could also just use DownloadData, turn those bytes into a stream (or stream reader), and use the HtmlDocument Load overload that loads HTML from a Stream.

But given that the code works from a local file, I believe your null reference is due to passing a URL where the HTML content itself is expected.
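
A minimal sketch of the DownloadData route mentioned above, in C#. The StreamLoadDemo class and its method names are hypothetical; WebClient.DownloadData and the HtmlDocument.Load(Stream) overload are the only library calls it relies on, and the URL is whatever page you are trying to read:

```csharp
using System;
using System.IO;
using System.Net;
using HtmlAgilityPack;

public static class StreamLoadDemo
{
    // Parses raw bytes (for example, the result of WebClient.DownloadData)
    // by wrapping them in a MemoryStream and using the Stream overload of Load.
    public static HtmlDocument ParseBytes(byte[] data)
    {
        var doc = new HtmlDocument();
        using (var stream = new MemoryStream(data))
        {
            doc.Load(stream);
        }
        return doc;
    }

    // Downloads the page at the given URL into memory and parses it,
    // without writing a temporary file to disk.
    public static HtmlDocument Fetch(string url)
    {
        using (var client = new WebClient())
        {
            byte[] data = client.DownloadData(url);
            return ParseBytes(data);
        }
    }
}
```

The document returned by Fetch can then be queried with the same SelectSingleNode call as the local-file version. Keeping ParseBytes separate makes it possible to try the parsing step on saved bytes without any network access.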
 
 
Dear David, I tested your solution and changed my code a little. I think it is better if you see my code and the actual URL:

VB.NET
Dim Scr1 As New HtmlWeb()
Dim Url1 = Scr1.Load("https://www.linkedin.com/vsearch/p?openAdvancedForm=true&locationType=Y&f_I=47&rsid=4367283831473569842080&orig=ADVS")
Dim ournone As HtmlNode = Url1.DocumentNode.SelectSingleNode("//div[@class='bd']//a")
Response.Write(ournone.InnerHtml)


I still have the problem! Please help me.
 
 
Comments
David_Wimbley 11-Sep-16 12:09pm    
This should have been left as a comment, not posted as another solution.

So your first problem is what I told you in the solution provided: .Load() doesn't have an overload for parsing HTML from a URL. It looks like you have now changed from HtmlDocument to HtmlWeb.

Now your code won't work, period, as HtmlWeb doesn't have a property called DocumentNode.

You need to spend some time googling HtmlAgilityPack examples. In the solution I provided, I told you how to work with URLs if that is what you are using.
abolfazl133 11-Sep-16 12:16pm    
Yes, dear David.
Please see my code now. It works:
Dim Scr1 As New HtmlWeb()
Dim Url1 = Scr1.Load("https://www.linkedin.com/vsearch/p?openAdvancedForm=true&locationType=Y&f_I=47&rsid=4367283831473569842080&orig=ADVS")
Dim ournone As HtmlNode = Url1.DocumentNode.SelectSingleNode("//div[@class='wrapper']")
Response.Write(ournone.InnerHtml)

Its response: LinkedIn Corporation © 2016

But I need to get the "abolfazl json" value from this LinkedIn markup:
<h3 class=""><a class="title main-headline" href="https://www.linkedin.com/profile/">abolfazl json</a><span class="badges"><span><abbr aria-hidden="true" title="abolfazl json is a 2nd degree contact" class="degree-icon ">2<sup>nd</sup></abbr></span></span></h3>
So I must write the following code, which does not work. I think the problem is in the expression inside SelectSingleNode:

Dim Scr1 As New HtmlWeb()
Dim Url1 = Scr1.Load("https://www.linkedin.com/vsearch/p?openAdvancedForm=true&locationType=Y&f_I=47&rsid=4367283831473569842080&orig=ADVS")
Dim ournone As HtmlNode = Url1.DocumentNode.SelectSingleNode("//div[@class='bd']//a")
Response.Write(ournone.InnerHtml)
David_Wimbley 11-Sep-16 12:20pm    
This is a serious question but are you reading my replies? What you need to do to download from a URL is posted in my solution.
abolfazl133 11-Sep-16 12:24pm    
Dear David, I think I have not explained my problem clearly. This is my problem:

a) I have a URL:
https://www.linkedin.com/vsearch/p?openAdvancedForm=true&locationType=Y&f_I=47&rsid=4367283831473569842080&orig=ADVS

b) Inside the page at this URL there is HTML like this:
<h3 class=""><a class="title main-headline" href="https://www.linkedin.com/profile/">abolfazl json</a><span class="badges"><span><abbr aria-hidden="true" title="abolfazl json is a 2nd degree contact" class="degree-icon ">2<sup>nd</sup></abbr></span></span></h3>

c) I need .NET code using HtmlAgilityPack that returns the "abolfazl json" text inside the a href tag.

Hello dear friend. I have a serious problem with Html Agility Pack. Please see my code:

The code works correctly when run against a file saved on my system, but with the same content at an online URL it does not work and returns empty values:

This returns an empty value!
VB.NET
Dim Scr1 As New HtmlWeb()
Dim Url1 = Scr1.Load("https://www.linkedin.com/in/ladan-sahraei-14461b34?authType=OUT_OF_NETWORK&authToken=tK-W&locale=en_US&srchid=4367283831473659871577&srchindex=1&srchtotal=6693647&trk=vsrp_people_res_name&trkInfo=VSRPsearchId%3A4367283831473659871577%2CVSRPtargetId%3A120972292%2CVSRPcmpt%3Aprimary%2CVSRPnm%3Afalse%2CauthType%3AOUT_OF_NETWORK")
Dim ournone As HtmlNode = Url1.DocumentNode.SelectSingleNode("//span[@class='full-name']")
Response.Write(ournone.InnerHtml)


This works correctly!
VB.NET
Dim Scr1 As New HtmlWeb()
Dim Url1 = Scr1.Load("http://localhost:21374/HtmlPage.html")
Dim ournone As HtmlNode = Url1.DocumentNode.SelectSingleNode("//span[@class='full-name']")
Response.Write(ournone.InnerHtml)
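
One way to narrow this down is to stop dereferencing the result of SelectSingleNode directly and look at what the server actually returned. HtmlWeb.Load returns an HtmlDocument, so the same XPath call applies to both the local and the remote page. The C# sketch below (NodeLookup and DescribeMatch are hypothetical names) reports a miss instead of throwing. It is only an assumption that the remote site serves different markup to an anonymous request, for example a login page, but a helper like this makes that easy to check:

```csharp
using System;
using HtmlAgilityPack;

public static class NodeLookup
{
    // Returns the trimmed text of the first node matching the XPath,
    // or a short diagnostic message when nothing matches.
    public static string DescribeMatch(HtmlDocument doc, string xpath)
    {
        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);

        if (node == null)
        {
            // No match: report how much markup was received so it can be inspected.
            int length = doc.DocumentNode.OuterHtml.Length;
            return "No match for " + xpath + " (document length: " + length + " characters)";
        }

        // Decode entities such as &amp; before returning the text.
        return HtmlEntity.DeEntitize(node.InnerText).Trim();
    }
}
```

For example, passing new HtmlWeb().Load(url) and "//span[@class='full-name']" to DescribeMatch gives either the name or the size of the document that came back. Saving doc.DocumentNode.OuterHtml to a file and comparing it with the locally saved page would show whether the span is present in the remote response at all.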
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


