Please guide me on how to crawl websites using C# and then store the data in a SQL Server database.

I need to crawl Arabic-language websites so that I can start applying some data mining techniques.

Thanks.

Have you looked at the several open-source crawlers written in C#, which can easily be found with Google? No? Well, you should...
You could start here:
https://github.com/sjdirect/abot/
A Simple Crawler Using C# Sockets
http://ericsowell.com/blog/2007/8/14/how-to-write-a-web-crawler-in-csharp
and so on...
 
Comments
Sergey Alexandrovich Kryukov 28-Feb-15 20:30pm    
Makes sense, a 5.
—SA
There are some open-source crawlers written in C# on the net, for example:

A Simple Crawler Using C# Sockets
https://abot.codeplex.com/
https://code.google.com/p/abot/

But if you want to learn by coding one yourself, you should:

1- Study how web requests and responses work.
2- Get the HTML source of the first URL.
3- Search the HTML for tags that contain links, for example a tags with an href attribute.
4- Parse them, select the ones you need, and save them in the DB.
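
To make steps 1 to 3 a bit more concrete, here is a minimal sketch, not a finished crawler. The class and method names (`SimpleCrawler`, `DownloadHtmlAsync`, `ExtractLinks`) are made up for illustration. It assumes the server declares its charset or serves UTF-8, which matters for Arabic pages, and it uses a regular expression as a rough stand-in for a real HTML parser. Saving to SQL Server (step 4) is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

// Hypothetical helper class, for illustration only.
public static class SimpleCrawler
{
    // Captures the value of href="..." or href='...' inside an <a> tag.
    // This is only an approximation: it ignores unquoted attributes and
    // can be fooled by malformed markup.
    private static readonly Regex HrefPattern = new Regex(
        @"<a\b[^>]*?\bhref\s*=\s*[""']([^""']+)[""']",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Steps 1 and 2: send the request and read the response body as text.
    public static async Task<string> DownloadHtmlAsync(HttpClient client, Uri url)
    {
        return await client.GetStringAsync(url);
    }

    // Step 3: collect the distinct http/https links found in the page,
    // resolved against the URL the page was downloaded from.
    public static List<Uri> ExtractLinks(string html, Uri pageUrl)
    {
        var seen = new HashSet<string>();
        var links = new List<Uri>();

        foreach (Match match in HrefPattern.Matches(html))
        {
            // Decode entities such as &amp; that appear in attribute values.
            string raw = WebUtility.HtmlDecode(match.Groups[1].Value.Trim());

            // Skip links that only point to an anchor on the same page.
            if (raw.StartsWith("#")) continue;

            // Turn relative links into absolute ones; skip anything unparsable.
            if (!Uri.TryCreate(pageUrl, raw, out Uri absolute)) continue;

            // Ignore mailto:, javascript: and other non-web schemes.
            if (absolute.Scheme != Uri.UriSchemeHttp &&
                absolute.Scheme != Uri.UriSchemeHttps) continue;

            // Drop the #fragment so the same page is not queued twice.
            string clean = absolute.GetLeftPart(UriPartial.Query);
            if (seen.Add(clean))
            {
                links.Add(new Uri(clean));
            }
        }

        return links;
    }
}
```

For example, calling `ExtractLinks` on markup containing `<a href="/news/1">` with a page URL of `http://example.com/index.html` resolves the link to `http://example.com/news/1`. The returned list is what you would then filter and write to your database.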

Finally, I suggest studying sample code after you have written your own.
 
But I need to extract only some parts of the page, such as:
news title
news image
news details
news date and time

not the whole page.
 
Comments
RedDk 1-Mar-15 20:11pm    
Sorry about the downvote, but editing your post or replying to the comment is, as you know, the proper way to proceed here at CP.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


