Introduction

LEADTOOLS is the #1 imaging toolkit in the world and has earned its place on top by consistently delivering imaging components of the highest quality, performance and stability in a format that is “programmer friendly”. Developers are able to significantly reduce time-to-market for their applications, thereby maximizing productivity and ensuring the greatest possible return on investment.

LEADTOOLS has an all-new design that greatly simplifies development without sacrificing control. One important enhancement is the set of high-level .NET classes for Optical Character Recognition (OCR) of scanned images. This new architecture is intuitive, flexible and easy to follow. A programmer can enable image OCR functionality in as little as three lines of code, while maintaining the level of control required by the specific application or workflow.

In this article, we introduce the key features of the new .NET OCR classes, walk through a step-by-step approach for creating an OCR application, and provide sample code. Feel free to try it out for yourself by downloading a fully functional evaluation SDK from the links provided below.

Key Features

LEADTOOLS provides methods to:

Recognize and export text, choosing from a variety of text, word processing, database, or spreadsheet file formats.
Perform OCR processes in a single or multi-threaded environment with optimization for server-based operations.
Use multiple OCR engines through a common .NET class library that abstracts the engine from the application. Switching between engines requires virtually no changes in the application code.
Select the language of documents to be recognized. Choose from English, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Russian, Spanish, or Swedish.
Segment complex pages manually or automatically into text zones, image zones, table zones, lines, headers and footers.
Set accuracy thresholds prior to recognition to control how strictly results are accepted.
Learn, save, and load character recognition data for similar documents. The software learns as a result of normal recognition, and acquires additional information by using the OCR’s text verification system.
Recognize text from 5 to 72 points in virtually any typeface.
Increase recognition accuracy with built-in and user dictionaries.
Automatically detect fax, dot matrix, and other degraded documents and compensate accordingly.
Process both text and graphics. The recognition software's ability to distinguish halftone graphics from text can provide the basis of a compound document processing system.
Save the document in any of 40 formats, including Adobe PDF and PDF/A, MS Word, and MS Excel, as well as various flavors of ASCII and UNICODE text.

Environment

The LEADTOOLS OCR .NET class library comes in Win32 and x64 editions that can support development of software applications for any of the following environments:

Windows 8 (32 and 64-bit editions)
Windows 7 (32 and 64-bit editions)
Windows 2008 (32 and 64-bit editions)
Windows Vista (32 and 64-bit editions)
Windows XP (32 and 64-bit editions)
Windows 2000

Samples provided will work in Visual Studio 2005 or Visual Studio 2008.

How LEADTOOLS OCR Works

LEADTOOLS uses an OCR handle to interact with the OCR engine and the OCR document containing the list of pages. The OCR handle is a communication session between LEADTOOLS OCR and an OCR engine installed on the system. This OCR handle is an internal structure that contains all the necessary information for recognition, getting and setting information, and text verification.

The following is an outline of the general steps involved in recognizing one or more pages. For a more detailed explanation, download the LEADTOOLS evaluation and refer to the “Programming with LEADTOOLS .NET OCR” topic in the .NET help:

Select the engine type you wish to use and create an instance of the IOcrEngine interface.
Start up the OCR engine with the IOcrEngine.Startup method.
Establish an OCR document with one or more pages.
Establish zones on the page(s), either manually or automatically. (This is optional. A page can be recognized with or without zones.)
Optional. Set the active languages to be used by the OCR engine. (The default is English.)
Optional. Set the spell checking language. (The default is English.)
Optional. Set any special recognition module options. This applies only if the page contains zones, created either automatically or manually.
Recognize.
Save recognition results, if desired. The results can be saved to either a file or to memory.
Shut down the OCR engine when finished.

Steps 4, 5, 6, and 7 can be done in any order, as long as they are carried out after starting up the OCR engine and before recognizing a page.

You can start using LEADTOOLS for .NET OCR in your application by adding a reference to the Leadtools.Forms.Ocr.dll assembly into your .NET application. This assembly contains the various interfaces, classes, structures and delegates used to program with LEADTOOLS OCR.

Since the toolkit supports multiple engines, the actual code that interfaces with the engine is stored in a separate assembly that will be loaded dynamically once an instance of the IOcrEngine interface is created. Hence, you must make sure the engine assembly you are planning to use resides next to the Leadtools.Forms.Ocr.dll assembly. You can add the engine assembly as a reference to your project if desired to automatically detect dependencies, even though this is not required by LEADTOOLS.
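As a rough illustration of the deployment requirement above, the following C# sketch checks that a set of assembly files is present in a given folder before the engine is created. The OcrDeploymentCheck class and its FindMissingFiles method are hypothetical helpers written for this article, not part of LEADTOOLS. The engine assembly file name is left to the caller because it depends on the engine you choose.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical pre-flight helper (an assumption for illustration, not a LEADTOOLS class).
public static class OcrDeploymentCheck
{
    // Returns the entries of requiredFiles that do not exist in the given directory.
    // An empty list means every required file was found.
    public static List<string> FindMissingFiles(string directory, params string[] requiredFiles)
    {
        List<string> missing = new List<string>();
        foreach (string name in requiredFiles)
        {
            if (!File.Exists(Path.Combine(directory, name)))
            {
                missing.Add(name);
            }
        }
        return missing;
    }
}
```

For example, an application might call OcrDeploymentCheck.FindMissingFiles(AppDomain.CurrentDomain.BaseDirectory, "Leadtools.Forms.Ocr.dll", engineAssemblyName) at startup, where engineAssemblyName is a placeholder for the file name of your engine assembly, and report any names returned before calling OcrEngineManager.CreateEngine. This assumes the LEADTOOLS assemblies are deployed to the application's base directory.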

The Code

The following example shows how to perform the above steps in code:

Visual Basic

' Requires: Imports Leadtools.Forms.Ocr
' (namespace name assumed to match the Leadtools.Forms.Ocr.dll assembly)

' *** Step 1: Select the engine type and
' create an instance of the IOcrEngine interface.
' We will use the LEADTOOLS OCR Plus engine and use it in the same process
Dim ocrEngine As IOcrEngine = _
    OcrEngineManager.CreateEngine(OcrEngineType.Plus, False)

' *** Step 2: Start up the engine.
' Use the default parameters
ocrEngine.Startup(Nothing, Nothing, Nothing)


' *** Step 3: Create an OCR document with one or more pages.
Dim ocrDocument As IOcrDocument = _ 
    ocrEngine.DocumentManager.CreateDocument()

' Add all the pages of a multi-page TIF image to the document
ocrDocument.Pages.AddPages("C:\Images\Ocr.tif", 1, -1, Nothing)

' *** Step 4: Establish zones on the page(s), either manually or automatically
' Automatic zoning
ocrDocument.Pages.AutoZone(Nothing)

' *** Step 5: (Optional) Set the active languages to be used by the OCR engine
' Enable English and German languages
ocrEngine.LanguageManager.EnableLanguages(New String() {"en", "de"})

' *** Step 6: (Optional) Set the spell checking language
' Enable the spell checking system and set English as the spell language
ocrEngine.SpellCheckManager.Enabled = True
ocrEngine.SpellCheckManager.SpellLanguage = "en"
 
' *** Step 7: (Optional) Set any special recognition module options
' Change the fill method for the first zone in the first page to be default
Dim ocrZone As OcrZone = ocrDocument.Pages(0).Zones(0)
ocrZone.FillMethod = OcrZoneFillMethod.Default
ocrDocument.Pages(0).Zones(0) = ocrZone

' *** Step 8: Recognize
ocrDocument.Pages.Recognize(Nothing)

' *** Step 9: Save recognition results
' Save the results to a PDF file
ocrDocument.Save("C:\Images\Document.pdf", OcrDocumentFormat.PdfA, Nothing)
ocrDocument.Dispose()

' *** Step 10: Shut down the OCR engine when finished
ocrEngine.Shutdown()
ocrEngine.Dispose()

C#

// Requires: using Leadtools.Forms.Ocr;
// (namespace name assumed to match the Leadtools.Forms.Ocr.dll assembly)

// *** Step 1: Select the engine type and
// create an instance of the IOcrEngine interface.
// We will use the LEADTOOLS OCR Plus engine and use it in the same process
IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Plus, false);

// *** Step 2: Start up the engine.
// Use the default parameters
ocrEngine.Startup(null, null, null);

// *** Step 3: Create an OCR document with one or more pages.
IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument();

// Add all the pages of a multi-page TIF image to the document
ocrDocument.Pages.AddPages(@"C:\Images\Ocr.tif", 1, -1, null);

// *** Step 4: Establish zones on the page(s), either manually or automatically
// Automatic zoning
ocrDocument.Pages.AutoZone(null);

// *** Step 5: (Optional) Set the active languages to be used by the OCR engine
// Enable English and German languages
ocrEngine.LanguageManager.EnableLanguages(new string[] { "en", "de"});

// *** Step 6: (Optional) Set the spell checking language
// Enable the spell checking system and set English as the spell language
ocrEngine.SpellCheckManager.Enabled = true;
ocrEngine.SpellCheckManager.SpellLanguage = "en";

// *** Step 7: (Optional) Set any special recognition module options
// Change the fill method for the first zone in the first page to be default
OcrZone ocrZone = ocrDocument.Pages[0].Zones[0];
ocrZone.FillMethod = OcrZoneFillMethod.Default;
ocrDocument.Pages[0].Zones[0] = ocrZone;

// *** Step 8: Recognize
ocrDocument.Pages.Recognize(null);

// *** Step 9: Save recognition results
// Save the results to a PDF file
ocrDocument.Save(@"C:\Images\Document.pdf", OcrDocumentFormat.PdfA, null);
ocrDocument.Dispose();

// *** Step 10: Shut down the OCR engine when finished
ocrEngine.Shutdown();
ocrEngine.Dispose();

Finally, the following sample shows how to perform the same task as above using the one-shot "fire and forget" IOcrAutoRecognizeManager interface:

Visual Basic

' Create the engine instance
Using ocrEngine As IOcrEngine = _ 
    OcrEngineManager.CreateEngine(OcrEngineType.Plus, False)
    ' Start up the engine
    ocrEngine.Startup(Nothing, Nothing, Nothing)
    ' Convert the multi-page TIF image to a PDF document
    ocrEngine.AutoRecognizeManager.Run( _
        "C:\Images\Ocr.tif", _
        "C:\Images\Document.pdf", _
        Nothing, _
        OcrDocumentFormat.PdfA, _
        Nothing)
End Using

C#

// Create the engine instance
using (IOcrEngine ocrEngine = 
    OcrEngineManager.CreateEngine(OcrEngineType.Plus, false))
{
    // Start up the engine
    ocrEngine.Startup(null, null, null);

    // Convert the multi-page TIF image to a PDF document
    ocrEngine.AutoRecognizeManager.Run(
        @"C:\Images\Ocr.tif",
        @"C:\Images\Document.pdf",
        null,
        OcrDocumentFormat.PdfA,
        null);
}
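The one-shot call above converts a single file. As a minimal sketch of how it might be extended to a folder of images, the following C# helper maps each input file to a PDF path and hands the pair to a caller-supplied delegate. The BatchOcr class, its ConvertFile delegate and both methods are hypothetical names introduced for illustration, not part of LEADTOOLS. The only LEADTOOLS call assumed is the AutoRecognizeManager.Run overload already shown in the sample above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical batch helper (an assumption for illustration, not a LEADTOOLS class).
public static class BatchOcr
{
    // Callback that converts one input file to one output file.
    public delegate void ConvertFile(string inputFile, string outputFile);

    // Builds the output path: same base name as the input, ".pdf" extension,
    // placed in outputDirectory.
    public static string GetOutputPath(string inputFile, string outputDirectory)
    {
        string name = Path.GetFileNameWithoutExtension(inputFile) + ".pdf";
        return Path.Combine(outputDirectory, name);
    }

    // Invokes convert once per input file and returns the number of files processed.
    public static int ConvertAll(IEnumerable<string> inputFiles, string outputDirectory, ConvertFile convert)
    {
        int count = 0;
        foreach (string inputFile in inputFiles)
        {
            convert(inputFile, GetOutputPath(inputFile, outputDirectory));
            count++;
        }
        return count;
    }
}

// Possible usage inside the using block shown above (requires the LEADTOOLS SDK):
// BatchOcr.ConvertAll(Directory.GetFiles(@"C:\Images", "*.tif"), @"C:\Images",
//     delegate(string input, string output)
//     {
//         ocrEngine.AutoRecognizeManager.Run(input, output, null, OcrDocumentFormat.PdfA, null);
//     });
```

Passing the conversion as a delegate keeps the path handling separate from the OCR call, so the engine is started once and reused for every file.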

Conclusion

LEADTOOLS provides developers with access to the world’s best performing and most stable imaging libraries in an easy-to-use, high-level programming interface enabling rapid development of business-critical applications. The new design will simplify the development effort, without sacrificing the level of control dictated by the specific application.

As demonstrated by the samples above, LEAD’s new high-level OCR interface and design provide a logical and flexible approach to converting scanned images to editable and searchable documents. Classes are provided to allow you to control the entire process, or you can simply start the engine and convert any of the 150+ supported image formats to all common document formats with a single method call.

OCR is one of the many things LEADTOOLS has to offer. For more information, be sure to visit our home page and download a free, fully functional evaluation SDK.

Required Software to Build this Sample

LEADTOOLS provides several toolkits, add-ons and cost-saving product bundles that provide its award-winning OCR technology. We recommend either Recognition Imaging or Document Imaging Suite, which include the Document Imaging SDK and all the required add-ons for OCR and searchable PDF output. For more options, please contact our sales department.

Or, if you want to try it before making a purchasing decision, you can download the free, fully functional 60-day evaluation of LEADTOOLS.

Support

Need help getting this sample up and going? Contact our support team for free evaluation support!