Introduction
Optical Character Recognition (OCR) is a technology that most programmers
are familiar with: take a picture containing words and convert it to text.
It sounds simple, but implementing it well is often much harder than it
looks. Much like watching a professional surfer and then trying it yourself,
developers who build their own OCR get bruised, tired and nearly drown in
endless waves of images whose varying fonts, bad scans, dust speckles and
paper crinkles keep exposing new problems in their algorithms.
Save yourself the headache and use LEADTOOLS,
the most accurate, fastest and easiest-to-use OCR SDK on the market! With over
twenty years of programming experience, a powerful and extensive set of
document image cleanup functions, thread-safe OCR for over thirty languages and
the time and resources to test millions of images, LEADTOOLS has earned the
trust of Fortune 500 companies and individual contractors alike.
Programming with LEADTOOLS couldn’t be easier: its high-level
interfaces can convert an image to a searchable PDF in only three
lines of code. For those who need additional control or wish to do more
advanced tasks, such as using zones to read words and characters from specific
sections of a form, LEADTOOLS provides low-level control over every aspect of
your OCR application.
Key Features in LEADTOOLS OCR SDKs
- Fast and Accurate OCR with multithreaded support
- Broad OCR language character set support including Latin,
  Cyrillic, East Asian and Arabic
- Save OCR results to over 40 output formats including searchable PDF,
  PDF/A, Word and XML
- Full page and zonal OCR
- Built-in and custom spelling dictionaries to improve OCR results
- Powerful document image cleanup and preprocessing functions to
  improve OCR results of scanned images
- 32-bit and 64-bit OCR binaries
The OCR Code
One of the most important characteristics of any SDK is ease
of use. This is a foundational concept for the developers of LEADTOOLS.
Here you can see how to convert an image to a searchable PDF in only three
lines of code:
// Create the engine, start it with default settings, then convert the image to a searchable PDF.
IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false);
ocrEngine.Startup(null, null, null, null);
ocrEngine.AutoRecognizeManager.Run(_strInputFile, _strOutputFile, DocumentFormat.Pdf,
    null, null);
The null arguments are placeholders for objects you can use to customize
the output and processing, such as file format settings, document cleanup,
callbacks and more. Passing null uses default values, which are optimized for
the majority of scanned documents. Here’s a screenshot of the original TIFF
image and searchable PDF created by this code:

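The three calls above convert a single file. As a minimal sketch of how that same call might be applied to a whole folder, the helper below walks a directory of .tif files and hands each input/output pair to a caller-supplied delegate. BatchOcr, ConvertFolder and convertOne are hypothetical names introduced for illustration and are not part of LEADTOOLS; the LEADTOOLS call appears only in the trailing comment, copied from the sample above, so the block compiles without the SDK.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not a LEADTOOLS class.
public static class BatchOcr
{
    // Calls convertOne(inputPath, outputPath) for every file in inputDir
    // that matches *.tif, and returns the output paths that were requested.
    public static List<string> ConvertFolder(
        string inputDir, string outputDir, Action<string, string> convertOne)
    {
        Directory.CreateDirectory(outputDir);
        var outputs = new List<string>();

        foreach (string input in Directory.GetFiles(inputDir, "*.tif"))
        {
            // Same base name as the input, with a .pdf extension.
            string name = Path.GetFileNameWithoutExtension(input) + ".pdf";
            string output = Path.Combine(outputDir, name);

            convertOne(input, output);
            outputs.Add(output);
        }
        return outputs;
    }
}

// Assumed hookup, using an engine created and started as in the sample above
// (requires the LEADTOOLS assemblies; inDir and outDir are placeholder paths):
//
// BatchOcr.ConvertFolder(inDir, outDir, (src, dst) =>
//     ocrEngine.AutoRecognizeManager.Run(src, dst, DocumentFormat.Pdf, null, null));
```

Passing the conversion step as a delegate lets one started engine serve every file rather than creating and starting an engine per image. Whether that is the best arrangement for a given workload is something to measure rather than assume.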
Simple and fast solutions are great, but LEADTOOLS doesn’t
stop there, because we understand that many projects also require customization
and more complex tasks. The LEADTOOLS OCR interface is granular enough to
give control over every detail of the process, including zones, individual
words and characters, spell checking and even modifying the recognition
results if necessary. Below, LEADTOOLS is used to recognize the text from a
specific rectangle drawn on the image by the user:
// Require the user to draw a region on the image first.
if (!rasterImageViewer1.Image.HasRegion)
{
    MessageBox.Show("Select a zone in the viewer using the mouse.");
    return;
}

using (IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false))
{
    ocrEngine.Startup(null, null, null, null);
    using (IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument())
    {
        // Add the viewer's image as a page, then restrict recognition to the selected region.
        IOcrPage ocrPage = ocrDocument.Pages.AddPage(rasterImageViewer1.Image, null);
        OcrZone ocrZone = new OcrZone();
        ocrZone.Bounds = new LogicalRectangle(
            rasterImageViewer1.Image.GetRegionBounds(null));
        ocrZone.ZoneType = OcrZoneType.Text;
        ocrPage.Zones.Add(ocrZone);

        // Recognize only the zone and show the resulting text.
        MessageBox.Show(ocrPage.RecognizeText(null));
    }
}

Conclusion
LEADTOOLS provides developers with access to the world’s
best performing and most stable imaging libraries in an easy-to-use, high-level
programming interface enabling rapid development of business-critical
applications.
OCR is only one of the many technologies LEADTOOLS has to
offer. For more information on our other products, visit our home
page, download a free, fully functional evaluation SDK, and take advantage
of our free technical support during your evaluation.
Download the Full OCR Example
You can download a fully functional demo which includes the
features discussed above. To run this example you will need the following:
Support
Need help getting this sample up and running? Contact
our support team for free technical support! For pricing or licensing
questions, you can contact our sales team (sales@leadtools.com)
or call us at 704-332-5532.
About LEADTOOLS
LEAD Technologies has been a prominent provider of digital
imaging tools since 1990. Its award-winning LEADTOOLS family of toolkits helps
developers integrate raster, document, medical, multimedia, vector and Internet
imaging into their applications quickly and easily. Using LEADTOOLS for your
imaging requirements allows you to spend more time on user interface and
application-specific code, expediting your development cycle and increasing
your return on investment.
With a rich history of over twenty years, LEAD has established itself as the world's leading provider of software development toolkits for document, medical, multimedia, raster and vector imaging. LEAD's flagship product, LEADTOOLS, holds the top position in every major country throughout the world and boasts a healthy, diverse customer base and strong list of corporate partners including some of the largest and most influential organizations from around the globe. For more information, contact sales@leadtools.com or support@leadtools.com.