Hi everyone,

I want to develop my own OCR software, and I would like to know which programming language is suitable for it. Is Java a good choice? I also need to know what I should learn in order to do this.

Thanks in advance,
wannasakthi
Updated 29-Jan-11 18:30pm

Instead of writing your own OCR software, you can use one of the open source OCR tools that are already available. Check this list of OCR software:
http://en.wikipedia.org/wiki/List_of_optical_character_recognition_software[^]
Comments
Sergey Alexandrovich Kryukov 31-Jan-11 0:41am    
Om, this is a good reference - my 5;

However, in practice none of them is good enough. For a developer, I think it is better to learn some advanced techniques from the CodeProject works I tried in the past; please see my answer.
--SA
I tried to find something workable in the list provided by Om. Everything I tried had big problems, and I could not find a good solution that works with Unicode. In my findings, Tesseract comes closest to the Truth: http://en.wikipedia.org/wiki/Tesseract_(software)[^] :).
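
If you want to try Tesseract from Java before writing anything of your own, one option is to call its command-line tool. The following is only a minimal sketch under some assumptions: a reasonably recent `tesseract` executable is installed and on the PATH, it accepts `stdout` as the output base, and it writes UTF-8 text. The class and method names (`TesseractCliSketch`, `buildCommand`, `recognize`) are made up for illustration and are not part of Tesseract.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical helper: runs the Tesseract command-line tool on one image. */
class TesseractCliSketch {

    /** Builds the command line: tesseract <image> stdout -l <language>. */
    static List<String> buildCommand(String imagePath, String language) {
        return List.of("tesseract", imagePath, "stdout", "-l", language);
    }

    /** Runs Tesseract and returns the recognized text, decoded as UTF-8 (assumption). */
    static String recognize(String imagePath, String language)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(imagePath, language))
                .redirectError(ProcessBuilder.Redirect.DISCARD) // ignore diagnostics
                .start();
        // Read everything the tool prints to standard output.
        String text = new String(process.getInputStream().readAllBytes(),
                StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("tesseract exited with code " + exitCode);
        }
        return text;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: java TesseractCliSketch <image> [language]");
            return;
        }
        String language = args.length > 1 ? args[1] : "eng";
        System.out.println(recognize(args[0], language));
    }
}
```

The language code (`eng` here) must match a language data file installed with Tesseract. How well the result handles Unicode text is something you would have to check on your own samples.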

The best works with full source code I have seen were written in C# and published on CodeProject, but probably none of them has been finished into a ready-to-use application. Perhaps this is what suits your purposes best. (I don't count works that come without full source code or rely on commercial products; I was not interested in those.)

Please look:
Neural Network OCR[^], Creating Optical Character Recognition (OCR) applications using Neural Networks[^], OCR Line Detection[^], Unicode Optical Character Recognition[^]. Out of some 60 CodeProject search results, only these are real original works that you can develop further. You will need to select a base method, add or improve the training, perform the training and store the training results, add recognition and alignment of lines, put it all together, etc.
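
To give an idea of how the "train, store training results, recognize" steps fit together, here is a minimal Java sketch. It deliberately swaps in the simplest base method I can think of, nearest-template matching with Hamming distance on tiny black-and-white glyphs, not the neural networks used in the articles above. Every name in it (`TemplateOcrSketch`, `parse`, `train`, `save`, `load`, `recognize`) is hypothetical, and it assumes the glyphs are already segmented and scaled to the same size.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: nearest-template recognition of fixed-size binary glyphs. */
class TemplateOcrSketch {

    /** One stored training sample: a label and its flattened pixels. */
    static final class Template {
        final char label;
        final boolean[] pixels;
        Template(char label, boolean[] pixels) { this.label = label; this.pixels = pixels; }
    }

    private final List<Template> templates = new ArrayList<>();

    /** Converts rows of '#' (ink) and any other character (background) into flat pixels. */
    static boolean[] parse(String... rows) {
        int width = rows[0].length();
        boolean[] pixels = new boolean[rows.length * width];
        for (int y = 0; y < rows.length; y++)
            for (int x = 0; x < width; x++)
                pixels[y * width + x] = rows[y].charAt(x) == '#';
        return pixels;
    }

    /** "Training" in this sketch just stores a labeled example. */
    void train(char label, boolean[] pixels) {
        templates.add(new Template(label, pixels.clone()));
    }

    /** Stores the training results as text, one line per template, e.g. "I:010010010". */
    String save() {
        StringBuilder sb = new StringBuilder();
        for (Template t : templates) {
            sb.append(t.label).append(':');
            for (boolean p : t.pixels) sb.append(p ? '1' : '0');
            sb.append('\n');
        }
        return sb.toString();
    }

    /** Restores a recognizer from text produced by save(). */
    static TemplateOcrSketch load(String data) {
        TemplateOcrSketch ocr = new TemplateOcrSketch();
        for (String line : data.split("\n")) {
            if (line.length() < 2) continue;        // skip empty lines
            String bits = line.substring(2);        // text after "<label>:"
            boolean[] pixels = new boolean[bits.length()];
            for (int i = 0; i < pixels.length; i++) pixels[i] = bits.charAt(i) == '1';
            ocr.train(line.charAt(0), pixels);
        }
        return ocr;
    }

    /** Returns the label of the stored template that differs in the fewest pixels. */
    char recognize(boolean[] pixels) {
        Template best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (Template t : templates) {
            if (t.pixels.length != pixels.length) continue; // only compare equal sizes
            int distance = 0;
            for (int i = 0; i < pixels.length; i++)
                if (t.pixels[i] != pixels[i]) distance++;
            if (distance < bestDistance) { bestDistance = distance; best = t; }
        }
        if (best == null) throw new IllegalStateException("no template of matching size");
        return best.label;
    }

    public static void main(String[] args) {
        TemplateOcrSketch ocr = new TemplateOcrSketch();
        ocr.train('I', parse(".#.", ".#.", ".#."));
        ocr.train('L', parse("#..", "#..", "###"));
        ocr.train('T', parse("###", ".#.", ".#."));
        // A noisy 'I' with one extra ink pixel in the bottom-left corner.
        System.out.println(ocr.recognize(parse(".#.", ".#.", "##."))); // prints I
    }
}
```

A `char` label only covers the Basic Multilingual Plane, so real Unicode support would need a `String` or code point label. Line detection, alignment and a better base method still have to be added on top of something like this.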

It would be great if you have success and share your results.

Good luck.
—SA
Comments
Espen Harlinn 31-Jan-11 15:12pm    
Good links, 5+
Sergey Alexandrovich Kryukov 31-Jan-11 15:58pm    
Thank you, Espen,
I found it difficult to obtain reasonably good OCR. The developers of OCR seem to be good specialists in some narrow field of applied mathematics who have almost no idea about programming. In particular, I failed to find anything good that works properly with Unicode. Some of the basic CodeProject works turned out to be better, but all of them need completion; I never got the time for that.
--SA
hind5 3-Mar-11 6:56am    
Thanks, my friend.
wannasakthi 2-Feb-11 7:38am    
Thanks for your good reply.
Sergey Alexandrovich Kryukov 3-Mar-11 14:18pm    
You're welcome.
Thank you for accepting my Answer.
Good luck, call again,
--SA

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)