I tried to find something workable in the list provided by Om, but everything I tried has big problems; I could not find a good solution that works with Unicode. From what I found, Tesseract comes closest to the Truth:
http://en.wikipedia.org/wiki/Tesseract_(software) :).
The best code with source I saw was written in C# and published on CodeProject, but probably none of it is finished into a ready-to-use application. Perhaps this is best for your purposes. (I don't count works without full source code that rely on commercial products; I was not interested in those.)
Please look:
Neural Network OCR,
Creating Optical Character Recognition (OCR) applications using Neural Networks,
OCR Line Detection,
Unicode Optical Character Recognition.
Out of some 60 CodeProject search results, only these are real original works you can develop further. You will need to select a base method, add or improve training, perform the training and store its results, add recognition and alignment of lines, put it all together, etc.
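To give an idea of the line-detection step, here is a minimal sketch in Python (not taken from any of the articles above). It assumes the page is already binarized into rows of 0/1 pixels and uses a plain horizontal projection profile, which is only one possible approach; the names `find_text_lines` and `min_ink` are made up for this illustration.

```python
from typing import List, Tuple


def find_text_lines(image: List[List[int]], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges, bottom exclusive, one per text line.

    `image` is an assumed binarized page: 1 = ink, 0 = background.
    A row belongs to a text line if it holds at least `min_ink` ink pixels.
    """
    lines: List[Tuple[int, int]] = []
    start = None  # first row of the line currently being scanned, if any
    for y, row in enumerate(image):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y                  # a new text line begins here
        elif not has_ink and start is not None:
            lines.append((start, y))   # a blank row closes the line
            start = None
    if start is not None:              # the last line touches the page bottom
        lines.append((start, len(image)))
    return lines


# Tiny hand-made "page": two text lines separated by a blank row.
page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
print(find_text_lines(page))  # [(1, 3), (4, 5)]
```

A real scan would need deskewing and noise filtering first (raising `min_ink` helps a little with speckles), and each detected row range would then be cut into glyphs and passed to whatever recognizer you train.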
It would be great if you succeed and share your results.
Good luck.
—SA