Background
This article continues the topic of image-based CAPTCHAs started in my previous article, The Image-Based CAPTCHA. Please read it first to get an idea of the subject.
Image-Based Bot Detector
I made a control, named ImageBasedBotDetector, that implements the idea described in the previous article. Its code differs from the example posted before mainly in that the code that renders the control is now separated from the code that renders the CAPTCHA image (the latter has been moved into an HttpHandler). To use the control, set TemplateImageFolder to the path where the template images are located, and register the HttpHandler in web.config:
<httpHandlers>
  <add verb="GET" path="ImageBasedBotDetector.ashx"
       type="Marss.Web.UI.Controls.ImageBasedBotDetector, Marss.Web"/>
</httpHandlers>
ImageBasedBotDetector can be used either as a finished control or as a base class if you want to add your own functionality: the GetTemplateImage and DrawCustom methods are virtual and can be overridden. A usage example can be found in demo project #1.
A few answers to your comments
After the previous article was published, I received many comments about the reliability of this type of CAPTCHA. I'll try to systematize and answer them.
- It is possible to make a system that can recognize images and find the distorted part.
Quite possible, no doubt. If it is possible to distinguish a real person from a fake one in a photo, then the same can be done for specific distortions in a picture. Most text-based CAPTCHAs can also be cracked in 90-99% of cases, but that does not prevent their widespread use. The fact is that no system can crack everything; each concrete case requires a specific implementation, and a specific implementation costs resources and money. So don't worry if your site is not as popular as Yahoo! or Google.
- A solution that relies on JavaScript is not reliable because scripts can be disabled.
Sure, this applies to about 4% of visitors. But it is hard to imagine a present-day web application that offers site-to-visitor interaction and does not use JavaScript. Besides, if you use ASP.NET controls, you have most probably excluded these visitors already.
- Insufficient accessibility.
The control now offers three types of distortion: Stretched, Random, and Volute (see picture).
You can also implement your own distortion by setting DistortionType to Custom and overriding the DrawCustom method.
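To give an idea of what a custom distortion might compute, here is a minimal sketch of a swirl-like coordinate mapping. It is not the control's own code: the signature of DrawCustom is not shown in this article, so the sketch covers only the geometry, and the SwirlSketch class, the MapToSource method, and all of their parameters are hypothetical names made up for the illustration.

```csharp
using System;

// Hypothetical helper, not part of ImageBasedBotDetector.
public static class SwirlSketch
{
    // For a destination pixel (x, y), computes the source coordinates to
    // sample from. Pixels outside the circle (cx, cy, radius) are unchanged.
    public static void MapToSource(
        double x, double y,
        double cx, double cy,
        double radius, double maxAngle,
        out double srcX, out double srcY)
    {
        double dx = x - cx;
        double dy = y - cy;
        double dist = Math.Sqrt(dx * dx + dy * dy);

        if (dist >= radius)
        {
            srcX = x;
            srcY = y;
            return;
        }

        // Rotation is strongest at the centre and fades to zero at the edge.
        double angle = maxAngle * (1.0 - dist / radius);
        double cos = Math.Cos(angle);
        double sin = Math.Sin(angle);

        // Rotate the offset vector around the centre by the computed angle.
        srcX = cx + dx * cos - dy * sin;
        srcY = cy + dx * sin + dy * cos;
    }
}
```

An override of DrawCustom could, presumably, loop over the pixels of the distorted region and copy each one from the source coordinates returned by such a mapping.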
- It is possible to download the template images and compare them with the image generated by the CAPTCHA.
To prevent pixel-by-pixel comparison, the original image is slightly distorted too. As for more sophisticated comparisons, see the first answer above.
Besides, reconstruction of the template image set can be avoided altogether if you don't keep a fixed image set. It sounds strange, but it is actually rather simple. The basic principle is that, although images of a specific theme are required, there is no need to select them manually, so you can use a search engine, for example Google Images. Select a theme, for example "Landscape", choose appropriate search keywords, and then override the GetTemplateImage method:
using System;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

public class AdvancedImageBasedBotDetector :
    Marss.Web.UI.Controls.ImageBasedBotDetector
{
    private const string requestPattern =
        "http://images.google.com/images?hl=en&q=" +
        "{0}&gbv=2&svnum=10&start={1}&sa=N&ndsp=20";

    protected override System.Drawing.Image GetTemplateImage()
    {
        string[] keywords = new string[] { "forest", "mountains", "waterfall",
            "falls", "hills", "lake" };
        Random r = new Random();
        string url = string.Format(requestPattern,
            keywords[r.Next(0, keywords.Length)], r.Next(0, 50));
        try
        {
            // Download a page of search results for a random keyword.
            string html;
            HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
            using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
            using (StreamReader sr = new StreamReader(resp.GetResponseStream()))
            {
                html = sr.ReadToEnd();
            }

            // Each dyn.Img(...) call in the page describes one thumbnail;
            // the repeated group captures every quoted argument of the call.
            Regex re = new Regex(@"dyn\.Img[(](?:""([^""]*)""[,)])*",
                RegexOptions.Multiline);
            MatchCollection matches = re.Matches(html);
            if (matches.Count > 0)
            {
                CaptureCollection args =
                    matches[r.Next(0, matches.Count)].Groups[1].Captures;
                // Guard against a page layout with fewer arguments than expected.
                if (args.Count > 14)
                {
                    string imageUrl = string.Format("{0}?q=tbn:{1}{2}",
                        args[14], args[2], args[3]);
                    HttpWebRequest req2 = (HttpWebRequest)WebRequest.Create(imageUrl);
                    using (HttpWebResponse resp2 = (HttpWebResponse)req2.GetResponse())
                    using (System.Drawing.Image im =
                        System.Drawing.Image.FromStream(resp2.GetResponseStream()))
                    {
                        // Copy the image so that it does not depend on the
                        // response stream, which is closed when this block exits.
                        return new System.Drawing.Bitmap(im);
                    }
                }
            }
        }
        catch (WebException)
        {
            // GetResponse throws on network errors and non-success status
            // codes; fall through to the local template images.
        }
        return base.GetTemplateImage();
    }
}
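The code above depends on a feature of .NET regular expressions that is easy to miss: when a capturing group sits inside a repeated construct, Groups[1].Captures keeps every value the group matched, not just the last one. The following minimal sketch shows that behaviour in isolation. The DynImgCaptureDemo class and its GetArguments method are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class DynImgCaptureDemo
{
    // Same pattern as in GetTemplateImage.
    private static readonly Regex DynImg =
        new Regex(@"dyn\.Img[(](?:""([^""]*)""[,)])*");

    // Returns every quoted argument of the first dyn.Img(...) call in the input.
    public static string[] GetArguments(string html)
    {
        Match match = DynImg.Match(html);
        if (!match.Success)
            return new string[0];

        // Captures holds one entry per repetition of the group.
        CaptureCollection captures = match.Groups[1].Captures;
        string[] args = new string[captures.Count];
        for (int i = 0; i < captures.Count; i++)
            args[i] = captures[i].Value;
        return args;
    }
}
```

For a made-up input such as dyn.Img("first","second","third"); the method returns the three strings first, second, and third, in that order. The indexes 14, 2, and 3 used in GetTemplateImage are positions in this same list, so they depend entirely on the layout of the search results page and will stop working if that layout changes.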
A working example can be seen in the demo project #2 or on my site.
- What is this needed for?
If the answer "pleasant variety" does not suit you, then I would say that this type of CAPTCHA is more user-friendly (one mouse click vs. typing 5-6 letters).
That is all. Feel free to criticize :).
Other Links