OCR with LEADTOOLS: The Better Choice

23 Sep 2014 | CPOL | 4 min read
Why implementing Optical Character Recognition with LEADTOOLS OCR SDKs is the faster, more accurate choice.

This article is in the Product Showcase section for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers.

Introduction

Optical Character Recognition (OCR) is a technology familiar to most programmers: take a picture that contains words and convert it to text. It sounds simple, but implementing it well is much harder than it looks. Much like someone who watches a professional surfer and then tries it themselves, developers get bruised and tired, and nearly drown in endless waves of images whose varying fonts, bad scans, dust speckles and paper crinkles keep exposing new problems in their algorithms.

Save yourself the headache and use LEADTOOLS, the most accurate, fastest and easiest-to-use OCR SDK on the market! With over twenty years of programming experience, a powerful and extensive set of document image cleanup functions, thread-safe OCR for over thirty languages, and the time and resources to test millions of images, LEADTOOLS has earned the trust of Fortune 500 companies and individual contractors alike.

Programming with LEADTOOLS couldn’t be easier: its high-level interfaces can convert an image to a searchable PDF in only three lines of code. For those who need additional control or wish to do more advanced tasks, such as using zones to read words and characters from specific sections of a form, LEADTOOLS provides low-level control over every aspect of your OCR application.

Key Features in LEADTOOLS OCR SDKs

  • Fast and Accurate OCR with multithreaded support
  • Broad OCR language character set support including Latin, Cyrillic, East Asian and Arabic
  • Save OCR results to over 40 output formats including searchable PDF, PDF/A, Word and XML
  • Full page and zonal OCR
  • Built-in and custom spelling dictionaries to improve OCR results
  • Powerful document image cleanup and preprocessing functions to improve OCR results of scanned images
  • 32-bit and 64-bit OCR binaries

SDK Products that Include OCR Technology

The OCR Code

One of the most important characteristics of any SDK is ease of use. This is a foundational concept for the developers of LEADTOOLS. Here you can see how to convert an image to a searchable PDF in only three lines of code:

IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false);
ocrEngine.Startup(null, null, null, null);
ocrEngine.AutoRecognizeManager.Run(_strInputFile, _strOutputFile, DocumentFormat.Pdf, 
    null, null);

The null arguments stand in for objects you can use to customize the output and processing, such as file format settings, document cleanup and callbacks. Passing null uses default values, which are optimized for the majority of scanned documents. Here’s a screenshot of the original TIFF image and the searchable PDF created by this code:

OCR-Leadtools/image001.jpg
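
In a longer-running application it may be worth wrapping the same three calls so that the engine is always released. The sketch below is one possible way to do that, not the official sample. It reuses only the calls shown above and relies on IOcrEngine being disposable, as the zonal example later in this article suggests. The class name OcrToPdf, the method name Convert and the two namespace imports are assumptions for illustration and may differ in your SDK version. It needs the LEADTOOLS assemblies to compile.

```csharp
using System.IO;
using Leadtools.Forms.DocumentWriters; // assumed namespace for DocumentFormat
using Leadtools.Forms.Ocr;             // assumed namespace for IOcrEngine

static class OcrToPdf
{
    // Hypothetical helper: name and parameters are illustrative only.
    public static void Convert(string inputFile, string outputFile)
    {
        // Fail early with a clear message instead of an engine error.
        if (!File.Exists(inputFile))
            throw new FileNotFoundException("Input image not found.", inputFile);

        // 'using' disposes the engine even if recognition throws.
        using (IOcrEngine ocrEngine =
            OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false))
        {
            // Same default (null) arguments as the three-line version.
            ocrEngine.Startup(null, null, null, null);
            ocrEngine.AutoRecognizeManager.Run(
                inputFile, outputFile, DocumentFormat.Pdf, null, null);
        }
    }
}
```

A call such as OcrToPdf.Convert(_strInputFile, _strOutputFile) would then replace the three lines, at the cost of starting the engine on every call.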

Simple and fast solutions are great, but LEADTOOLS doesn’t stop there because we understand that many projects also require customization and more complex tasks. The LEADTOOLS OCR interface is also granular enough to give control over every detail of the process including zones, processing words and characters, even spell checking and modifying the recognition results if necessary. Below, LEADTOOLS is used to recognize the text from a specific rectangle drawn on the image by the user:

// make sure the region isn't empty or the size of the entire image
if (!rasterImageViewer1.Image.HasRegion)
{
   MessageBox.Show("Select a zone in the viewer using the mouse.");
   return;
}
 
// Create OCR Engine
using (IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage,
   false))
{
   // Start the engine using default parameters
   ocrEngine.Startup(null, null, null, null);
 
   // Create OCR Document
   using (IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument())
   {
      // Add image from the viewer as a page in this document
      IOcrPage ocrPage = ocrDocument.Pages.AddPage(rasterImageViewer1.Image, null);
 
      // Create a zone for the selected region
      OcrZone ocrZone = new OcrZone();
      ocrZone.Bounds = new LogicalRectangle(
          rasterImageViewer1.Image.GetRegionBounds(null));
      ocrZone.ZoneType = OcrZoneType.Text;
      ocrPage.Zones.Add(ocrZone);
 
      // OCR the image and display text in a MessageBox
      MessageBox.Show(ocrPage.RecognizeText(null));
   }
}

OCR-Leadtools/image002.jpg
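
The first comment in the zonal example mentions rejecting a selection that is empty or the size of the entire image, while the code itself only tests HasRegion. If an application wants to enforce the stricter check, a small helper along the following lines could be called before the zone is created. This is a sketch that does not use the LEADTOOLS API: the class ZoneCheck and the method IsUsableZone are hypothetical names, and it assumes the selection and image size are available as System.Drawing values.

```csharp
using System.Drawing;

static class ZoneCheck
{
    // Hypothetical helper: returns true only when the selection overlaps
    // the image, has a non-zero area, and does not cover the whole image.
    public static bool IsUsableZone(Rectangle zone, Size imageSize)
    {
        var imageBounds = new Rectangle(Point.Empty, imageSize);

        // Clip the selection to the image so off-image parts are ignored.
        Rectangle clipped = Rectangle.Intersect(zone, imageBounds);

        // Empty or fully off-image selections leave nothing to recognize.
        if (clipped.Width <= 0 || clipped.Height <= 0)
            return false;

        // A selection covering the whole image is full-page OCR, not a zone.
        return clipped != imageBounds;
    }
}
```

For a 100 x 100 image, a selection of (10, 10, 50, 20) passes the check, while an empty rectangle, a rectangle lying entirely outside the image and a rectangle covering the full image are all rejected.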

Conclusion

LEADTOOLS provides developers with access to the world’s best performing and most stable imaging libraries in an easy-to-use, high-level programming interface enabling rapid development of business-critical applications.

OCR is only one of the many technologies LEADTOOLS has to offer. For more information on our other products, be sure to visit our home page, download a free fully functioning evaluation SDK, and take advantage of our free technical support during your evaluation.

Download the Full OCR Example

You can download a fully functional demo which includes the features discussed above. To run this example you will need the following:

Support

Need help getting this sample up and going? Contact our support team for free technical support! For pricing or licensing questions, you can contact our sales team (sales@leadtools.com) or call us at 704-332-5532.

About LEADTOOLS

LEAD Technologies has been the prominent provider of digital imaging tools since 1990. Its award-winning LEADTOOLS family of toolkits helps developers integrate raster, document, medical, multimedia, vector and Internet imaging into their applications quickly and easily. Using LEADTOOLS for your imaging requirements allows you to spend more time on user interface and application-specific code, expediting your development cycle and increasing your return on investment.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Help desk / Support LEAD Technologies, Inc.
United States
Since 1990, LEAD has established itself as the world's leading provider of software development toolkits for document, medical, multimedia, raster and vector imaging. LEAD's flagship product, LEADTOOLS, holds the top position in every major country throughout the world and boasts a healthy, diverse customer base and strong list of corporate partners including some of the largest and most influential organizations from around the globe. For more information, contact sales@leadtools.com or support@leadtools.com.