Hi All,

I am working on my final-year project, which is about CAPTCHA breaking, i.e. showing that CAPTCHAs are easily broken.
For that, the following steps need to be followed:

1. Trace the position of characters in the input image.
2. Extract each character. This is also called character segmentation, i.e. separating each character individually.

I have completed my project up to this step.

Now we have:
1. The segmented images.
2. A training set of images, i.e. a set of images each representing an individual character.

What I want to do now:

1. Maintain a database containing two columns: one holding the individual character (a to z), the other holding the value obtained by applying some function to the corresponding training image.

2. Apply the same function to each segmented image obtained in step 2 above to get its value. That value is then looked up in the database and the corresponding character is fetched. In this way we can get the text corresponding to the CAPTCHA image.

I want to know which function I should apply to the training images to get a single, unique value for each training image.
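The two-column lookup described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not necessarily the method the poster has in mind: each image is assumed to be already binarized into rows of 0/1 pixels, the "function" is assumed to be a small vector of ink densities per zone rather than a single number, and the lookup picks the nearest stored entry instead of requiring an exact match, since two renderings of the same letter rarely produce identical values. The names `zone_features`, `build_table` and `classify` are hypothetical.

```python
from typing import Dict, List, Sequence

# Assumption: a binary image is a list of rows, 1 = ink, 0 = background.
Image = Sequence[Sequence[int]]


def zone_features(img: Image, grid: int = 4) -> List[float]:
    """Split the image into grid x grid zones; return the ink fraction of each."""
    h, w = len(img), len(img[0])
    feats: List[float] = []
    for gy in range(grid):
        y0, y1 = gy * h // grid, (gy + 1) * h // grid
        for gx in range(grid):
            x0, x1 = gx * w // grid, (gx + 1) * w // grid
            cells = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            # A zone can be empty if the image is smaller than the grid.
            feats.append(sum(cells) / len(cells) if cells else 0.0)
    return feats


def build_table(training: Dict[str, Image], grid: int = 4) -> Dict[str, List[float]]:
    """The 'database': character -> feature value of its training image."""
    return {ch: zone_features(im, grid) for ch, im in training.items()}


def classify(img: Image, table: Dict[str, List[float]], grid: int = 4) -> str:
    """Return the character whose stored features are closest to img's."""
    f = zone_features(img, grid)

    def sq_dist(g: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(f, g))

    return min(table, key=lambda ch: sq_dist(table[ch]))
```

With a table built once from the training set, each segmented character would be passed to `classify` in turn and the results joined into the final text.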
Updated 1-May-11 9:55am
HimanshuJoshi 1-May-11 14:56pm    
Edited to correct grammar and punctuation.

I don't think CAPTCHA is easily breakable. At least, it looks completely unbreakable for you at this moment.

I think this is because you're trying to apply a training-based approach not to a regular OCR problem but to CAPTCHA. What training can you apply if CAPTCHA can mangle the characters in different ways on every update? In my opinion, you underestimate the complexity of your problem.

incredible me 1-May-11 15:46pm    
You are right that a more cluttered CAPTCHA can't be easily broken, but we are basically implementing a research paper that shows a "typical" mathematical approach to breaking CAPTCHA. We are only interested in breaking very low-level CAPTCHAs.
Basically, I just wanted to know whether there is any function that can be applied to an image or its pixel array, so that we can use the value returned by that function to compare the segmented target image with the set of training images.
Sergey Alexandrovich Kryukov 1-May-11 15:53pm    
I'm not even talking about clutter; I'm talking about the shape of the letters. It is usually at least slightly mangled. Well, I have never seen even a "low-level CAPTCHA" that does not do this. So I don't understand how training can be applied. Perhaps I'm wrong...
incredible me 1-May-11 15:58pm    
No, you are not getting what I am trying to say. In very simple words, I just wanted to know how to compare two images using... that's it.
I am not asking how to break CAPTCHA; that was just to give a brief intro to the project, nothing more than that.
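For the narrower question of comparing two images, one simple possibility is a pixel-by-pixel agreement score. The sketch below assumes both images are binarized (rows of 0/1 pixels) and have already been scaled to the same dimensions; the function name `similarity` is hypothetical, and the score is only a rough measure that will be sensitive to shifts and distortion of the glyph.

```python
from typing import Sequence

# Assumption: a binary image is a list of rows, 1 = ink, 0 = background.
Image = Sequence[Sequence[int]]


def similarity(a: Image, b: Image) -> float:
    """Fraction of pixel positions where the two images agree (0.0 to 1.0)."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same dimensions")
    total = sum(len(row) for row in a)
    if total == 0:
        raise ValueError("images must not be empty")
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total
```

A segmented character could then be scored against every training image, keeping the character with the highest score.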
Sergey Alexandrovich Kryukov 1-May-11 16:03pm    
I understand that, but I cannot answer how to compare, as I don't have enough information on what you are trying to do or on what the difficulty is in your code, which you did not show.
I am having the same problem as you mentioned. I have determined where each character is in the image and specified its height and width; all that remains is how to extract the characters and recognize them. If you have found a solution, please share it with me.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
