Hi
I am writing a web crawler. It works correctly except on websites that use JavaScript for their links instead of HTML anchor tags (a href). How can I solve this problem so that I can access all the URLs in a website?

Thank you very much in advance.
Updated 17-Jun-10 10:45am

1 solution

You may need to parse the JavaScript source to get the variable values and string literals that contain URLs, or use a regex to search for URLs in those strings.
This won't be perfect if the URLs are encoded using JavaScript, but in that case you can decode them yourself. =)

Try: http://timwhitlock.info/plug/examples/JavaScript/JParser.php
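
For the regex approach, here is a rough sketch in Python (an assumption, since the question does not say which language the crawler is written in). It pulls quoted string literals out of the JavaScript source, keeps the ones that look like URLs, and resolves them against the page address. The function name extract_urls_from_js, the two patterns, and the list of page extensions are all made up for illustration, and the only "decoding" it attempts is undoing the common \/ escape. It is a heuristic, not a real JavaScript parser, so it will miss URLs that are assembled at runtime.

```python
import re
from urllib.parse import urljoin

# Single- or double-quoted JavaScript string literals (template literals are ignored).
STRING_LITERAL = re.compile(r"""(["'])((?:\\.|(?!\1)[^\\\n])*)\1""")

# Heuristic for "looks like a URL": absolute, protocol-relative, root-relative,
# or a relative path ending in a common page extension (hypothetical list).
URL_LIKE = re.compile(
    r"^(?:(?:https?:)?//|/)[^\s<>]+$"
    r"|^[\w./-]+\.(?:html?|php|aspx?|jsp)(?:[?#]\S*)?$",
    re.IGNORECASE,
)


def extract_urls_from_js(js_source, base_url):
    """Return absolute URLs found in string literals, in order, without duplicates."""
    urls = []
    for match in STRING_LITERAL.finditer(js_source):
        # Undo the "\/" escape that is often used for slashes inside scripts.
        literal = match.group(2).replace("\\/", "/")
        if URL_LIKE.match(literal):
            absolute = urljoin(base_url, literal)
            if absolute not in urls:
                urls.append(absolute)
    return urls
```

You would call it on the body of each script tag and on each downloaded .js file, for example extract_urls_from_js(script_text, page_url), and feed the result into the crawler's queue. Strings that are concatenated from several pieces or decoded by the script itself still need a real parser such as the one linked above, or a browser engine that actually runs the script.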

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


