Hi there,

I have downloaded an HTML page and am trying to replace all of its href and src values so that they point to a certain domain. I tried doing this with jQuery, but search engines have indexed the source with the original href links, which is bad for SEO. Therefore, I have to do it on the back end.

My question is: after I download the HTML page as a string, how can I replace all the href and src values with links to my domain? I was thinking about doing it with LINQ. Any ideas would be helpful. Thanks.

What I have tried:

I tried
XDocument doc = XDocument.Parse(html);
but it throws an exception. I think the tags in the page are not well formed.
Posted
Updated 30-Jan-18 5:26am
Comments
F-ES Sitecore 30-Jan-18 11:27am    
As suggested, use the Html Agility Pack or use a regex. If you google "regex to replace href" you'll find lots of examples. Don't use an XML parser, and you don't really need LINQ either.
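
For the regex route, here is a minimal sketch, not a definitive implementation. The names RegexLinkRewriter, Rewrite, originalBase and newBase are hypothetical, and it assumes that "replace to my domain" means keeping each link's path and query while swapping the scheme and host. It only handles quoted attribute values, and it will also match attributes whose names merely end in href or src (for example data-src), so treat it as a starting point.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexLinkRewriter
{
    // Captures the attribute name, the opening quote, and the quoted value.
    private static readonly Regex LinkAttribute = new Regex(
        @"\b(?<name>href|src)\s*=\s*(?<quote>[""'])(?<url>.*?)\k<quote>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // originalBase: URL the page was downloaded from (used to resolve relative links).
    // newBase: root of the domain the links should point to (assumption).
    public static string Rewrite(string html, Uri originalBase, Uri newBase)
    {
        return LinkAttribute.Replace(html, match =>
        {
            Uri absolute;
            if (!Uri.TryCreate(originalBase, match.Groups["url"].Value, out absolute))
                return match.Value; // not parseable as a URL: leave untouched

            // Leave mailto:, javascript:, data: and similar links alone.
            if (absolute.Scheme != Uri.UriSchemeHttp && absolute.Scheme != Uri.UriSchemeHttps)
                return match.Value;

            // Keep path, query and fragment; take scheme and host from newBase.
            string rewritten =
                new Uri(newBase, absolute.PathAndQuery + absolute.Fragment).AbsoluteUri;

            string quote = match.Groups["quote"].Value;
            return match.Groups["name"].Value + "=" + quote + rewritten + quote;
        });
    }
}
```

Call it as RegexLinkRewriter.Rewrite(html, new Uri("http://old.example/"), new Uri("https://my.example/")), where both URLs are placeholders.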

1 solution

You shouldn't use XML parsers for reading/writing HTML, because valid HTML is not necessarily valid XML. Use an actual HTML parser instead, for example the Html Agility Pack (HAP).
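
As a rough sketch of what that could look like with the Html Agility Pack (installed from the HtmlAgilityPack NuGet package): the names HapLinkRewriter, RewriteLinks, originalBase and newBase are hypothetical, and the code assumes the goal is to keep each link's path and query while pointing it at your own domain. Adjust the rewriting rule to whatever mapping you actually need.

```csharp
using System;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class HapLinkRewriter
{
    // originalBase: URL the page was downloaded from (used to resolve relative links).
    // newBase: root of the domain the links should point to (assumption).
    public static string RewriteLinks(string html, Uri originalBase, Uri newBase)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // tolerates markup that XDocument.Parse rejects

        // SelectNodes returns null when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes("//*[@href or @src]");
        if (nodes == null)
            return html;

        foreach (HtmlNode node in nodes)
        {
            foreach (string name in new[] { "href", "src" })
            {
                string value = node.GetAttributeValue(name, "");
                if (value.Length == 0)
                    continue;

                Uri absolute;
                if (!Uri.TryCreate(originalBase, value, out absolute))
                    continue;

                // Leave mailto:, javascript:, data: and similar links alone.
                if (absolute.Scheme != Uri.UriSchemeHttp && absolute.Scheme != Uri.UriSchemeHttps)
                    continue;

                // Keep path, query and fragment; take scheme and host from newBase.
                var rewritten = new Uri(newBase, absolute.PathAndQuery + absolute.Fragment);
                node.SetAttributeValue(name, rewritten.AbsoluteUri);
            }
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

Because the rewriting happens on the server before the page is sent out, crawlers see the new links in the delivered source, which is the point of moving away from jQuery.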
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


