Hi, I am writing a web crawler that will crawl websites and selectively parse different sections of each site.
I am a .NET developer, so the obvious choice was to build it in .NET, but it was very slow. That time included both downloading and parsing the HTML pages.
Then I tried just downloading the contents, first using .NET and then the same domains using Python, and Python was impressively fast at downloading the data. I now have the download working in Python, but the later part (the parsing) is not that easy to code in Python, and I would obviously rather not do it there.
The same batch of domains that took 100 seconds in Python took 20 minutes in the .NET-based crawler.
I also tried downloading http://www.regexhacks.com/ on its own: it took 10 seconds in Python, and the same site took 2 minutes in the .NET crawler.
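For reference, the download-only timing on the Python side can be measured with something like the sketch below. It is not my exact script. The names `fetch` and `time_batch` are made up for illustration, and it assumes plain sequential requests through the standard library's `urllib.request` with no parsing at all.

```python
import time
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one page and return its raw bytes (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def time_batch(urls, fetcher=fetch):
    """Download each URL in turn and time the whole batch.

    Returns (elapsed_seconds, {url: number_of_bytes_downloaded}).
    Only the download is timed; no HTML parsing happens here.
    """
    sizes = {}
    start = time.perf_counter()
    for url in urls:
        sizes[url] = len(fetcher(url))
    elapsed = time.perf_counter() - start
    return elapsed, sizes


if __name__ == "__main__":
    # The single site mentioned above; a real run would pass the whole batch.
    elapsed, sizes = time_batch(["http://www.regexhacks.com/"])
    print(f"downloaded {sum(sizes.values())} bytes in {elapsed:.1f} s")
```

Timing the download separately like this is what showed that the gap is already there before any parsing happens.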
Does anyone have any idea why this is slow in .NET but fast in Python?