I've been rewriting my code, moving work that is not time-sensitive onto background threads.
My problem is with search bots such as Yandex, which rip through about 4 pages in less than 1 second, and my threads can't keep up.
Is there a way to check whether a thread with the same name is already running, so that the next thread can wait a couple of seconds?
Dim updateXML As New Thread(New ThreadStart(AddressOf uX_Container.update_SearchBot_Record_XML_ThreadProc))
updateXML.Name = "update_SearchBotRecord"
updateXML.Priority = ThreadPriority.Lowest
updateXML.IsBackground = True
updateXML.Start()
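As far as I know, .NET has no built-in way to look up a managed thread by its Name, so one common alternative is to let every thread start but make them take turns on a shared lock. The sketch below is only an illustration under assumptions: `UpdateSearchBotRecord` is a hypothetical stand-in for `uX_Container.update_SearchBot_Record_XML_ThreadProc`, and `updateLock`, `SerializedProc`, and the `completed` counter are names invented for the example.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Threading

Module SerializedUpdateSketch
    ' Hypothetical shared lock: every update thread must acquire it before writing.
    Private ReadOnly updateLock As New Object()
    Private completed As Integer = 0

    ' Stand-in for uX_Container.update_SearchBot_Record_XML_ThreadProc (assumption).
    Private Sub UpdateSearchBotRecord()
        Thread.Sleep(50)   ' simulate a slow XML write
        completed += 1     ' only ever touched while the lock is held
    End Sub

    ' Thread entry point: waits for its turn instead of checking for a named thread.
    Private Sub SerializedProc()
        SyncLock updateLock
            UpdateSearchBotRecord()
        End SyncLock
    End Sub

    Sub Main()
        Dim workers As New List(Of Thread)()

        ' Simulate a bot requesting 4 pages almost at once.
        For i As Integer = 1 To 4
            Dim updateXML As New Thread(New ThreadStart(AddressOf SerializedProc))
            updateXML.Name = "update_SearchBotRecord"
            updateXML.Priority = ThreadPriority.Lowest
            updateXML.IsBackground = True
            updateXML.Start()
            workers.Add(updateXML)
        Next

        ' Wait for all updates so the demo does not exit early.
        For Each worker As Thread In workers
            worker.Join()
        Next

        Console.WriteLine("Completed updates: " & completed)
    End Sub
End Module
```

With this arrangement the threads queue up on the lock rather than sleeping for a fixed couple of seconds, so each write starts as soon as the previous one finishes. It does not reduce the number of threads a fast crawler can create, only the number writing at once.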
|
I've had problems with Yandex (and Baidu) in the past; they both hit harder than Google. Given that I had 0% interest in the Russian/Chinese market where I was, I blocked them. I tried using robots.txt, but they didn't play well with it.
You might find this[^] discussion helpful. I went down the URLRewrite route, which worked pretty well but needed IIS7 plus the module installed.
You might also like to read Bye bye Crawler Blocking the Parasites[^].
Of course, if you want to be listed by Yandex, the above is really bad advice.
|
Thanks for the article; I need to know that stuff anyway. Google is actually pretty courteous, crawling at a rate of about 2 minutes per page.
I think my strategy is wrong here. Perhaps for crawlers and bots, I should make them wait for the whole process to complete on a single thread. For human users, I can put data writes on background threads so their experience is faster.
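A rough sketch of that strategy might look like the following. Everything here is an assumption for illustration: `IsCrawler`, `RunUpdate`, and the `crawlerMarkers` list are hypothetical names, the list of user-agent fragments is deliberately incomplete, and in an ASP.NET page the user-agent string would typically come from `Request.UserAgent`.

```vb
Imports System
Imports System.Threading

Module CrawlerAwareUpdateSketch
    ' Hypothetical, incomplete list of user-agent fragments that identify bots.
    Private ReadOnly crawlerMarkers As String() = {"yandex", "baiduspider", "googlebot"}

    ' Returns True when the user-agent string contains a known crawler fragment.
    Function IsCrawler(userAgent As String) As Boolean
        If String.IsNullOrEmpty(userAgent) Then Return False
        For Each marker As String In crawlerMarkers
            If userAgent.IndexOf(marker, StringComparison.OrdinalIgnoreCase) >= 0 Then
                Return True
            End If
        Next
        Return False
    End Function

    ' Crawlers run the work inline, so they wait for it to finish;
    ' everyone else gets a low-priority background thread.
    Sub RunUpdate(userAgent As String, work As ThreadStart)
        If IsCrawler(userAgent) Then
            work.Invoke()
        Else
            Dim updateXML As New Thread(work)
            updateXML.Name = "update_SearchBotRecord"
            updateXML.Priority = ThreadPriority.Lowest
            updateXML.IsBackground = True
            updateXML.Start()
        End If
    End Sub
End Module
```

Running the write inline for bots means each crawler request cannot complete until its own record is written, which naturally throttles how fast a single bot connection can generate work.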
|