
OCR with Microsoft® Office

By Martin Welker, 26 Oct 2007
 

Introduction

Optical Character Recognition (OCR) extracts text and layout information from document images. The Microsoft Office Document Imaging Library (MODI), which ships with Office 2003, lets you integrate OCR functionality into your own applications. Combined with the MODI Document Viewer control, you get complete OCR support with only a few lines of code.

Important note: MS Office XP does not contain MODI. MS Office 2003 or Office 2007 is required.

Getting Started

Adding the Library

First of all, you need to add the library's reference to your project: Microsoft Office Document Imaging 11.0 Type Library (located in MDIVWCTL.DLL).

Create a Document Instance and Assign an Image File

Supported image formats are TIFF, multi-page TIFF, and BMP.

_MODIDocument = new MODI.Document(); 
_MODIDocument.Create(filename);

Call the OCR Method

The OCR process is started by the Document.OCR method.

// The MODI call for OCR 
_MODIDocument.OCR(_MODIParameters.Language, 
                  _MODIParameters.WithAutoRotation, 
                  _MODIParameters.WithStraightenImage);

With the Document.OCR call, all pages contained in the document are processed. You can also run OCR on each page separately by calling the Image.OCR method in the very same way. As you can see, the OCR method has three parameters:

  • Language
  • AutoRotation
  • StraightenImages

The use of these parameters depends on your specific imaging scenario.

Screenshot - modiSettings.JPG
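
As a rough sketch of the per-page variant, the loop below runs OCR on every page separately, so that one unreadable page does not abort the whole document. The class and method names (PerPageOcr, OcrPerPage) and the choice of English with both options switched on are assumptions for illustration only. The code needs the MODI reference described above.

```csharp
using System;

public static class PerPageOcr
{
    // Hypothetical helper (not part of MODI): runs OCR page by page and
    // returns the number of pages that were processed without an exception.
    public static int OcrPerPage(MODI.Document document)
    {
        int succeeded = 0;
        for (int i = 0; i < document.Images.Count; i++)
        {
            MODI.Image page = (MODI.Image)document.Images[i];
            try
            {
                // same three arguments as Document.OCR:
                // language, auto rotation, straighten image
                page.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);
                succeeded++;
            }
            catch (Exception ex)
            {
                // assumption: a page that cannot be recognized raises an exception
                Console.WriteLine("Page " + i + " could not be processed: " + ex.Message);
            }
        }
        return succeeded;
    }
}
```

Whether a failed page should be skipped, retried with other settings, or reported to the user depends on the application.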

Tracking the OCR Progress

Since the whole recognition process can take a few seconds, you may want to keep an eye on the progress. The OnOCRProgress event serves this purpose.

// add event handler for progress visualisation
_MODIDocument.OnOCRProgress += 
  new MODI._IDocumentEvents_OnOCRProgressEventHandler(this.ShowProgress);
public void ShowProgress(int progress, ref bool cancel)
{
    statusBar1.Text = progress.ToString() + "% processed.";
}
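
The handler's second parameter, cancel, is passed by reference, which suggests that the handler can use it to stop a running recognition. The following sketch wraps that idea in a small class. OcrProgressTracker and its members are hypothetical names, and the assumption that MODI aborts the OCR once cancel is set to true should be verified against your MODI version.

```csharp
using System;

// Hypothetical helper, not part of MODI.
public class OcrProgressTracker
{
    // Set to true (for example from a Cancel button) to request an abort.
    public volatile bool CancelRequested;

    // Last progress value reported to the handler, in percent.
    public int LastProgress { get; private set; }

    // Same signature as the ShowProgress handler above, so it can be
    // registered with the OnOCRProgress event.
    public void OnProgress(int progress, ref bool cancel)
    {
        LastProgress = progress;
        if (CancelRequested)
        {
            cancel = true; // assumption: MODI stops the OCR when this is set
        }
    }
}

// Registration (requires the MODI reference):
// OcrProgressTracker tracker = new OcrProgressTracker();
// _MODIDocument.OnOCRProgress +=
//     new MODI._IDocumentEvents_OnOCRProgressEventHandler(tracker.OnProgress);
```

Keeping the state in a separate object instead of the form makes the handler easy to exercise without a running OCR.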

The Document Viewer

Together with the MODI document model comes the MODI viewer component AxMODI.AxMiDocView. The viewer is contained in the same library as the document model (MDIVWCTL.DLL). With a single statement, you can assign the document to the viewer. The viewer offers many operations such as selection and panning.

axMiDocView1.Document = _MODIDocument;

To make the component available in Visual Studio, open the Toolbox's context menu, select Add/Delete Elements..., and choose the COM Controls tab. Then search for Microsoft Office Document Imaging Viewer 11.0 and enable it.

Processing the Recognition Result

Working with the result structure is pretty straightforward. If you just want to use the full text, you simply need the image's Layout.Text property. As an example of further processing, here is a small statistics method:

private void Statistic()
{    
    // iterating through the document's structure doing some statistics.
    string statistic = "";
    for (int i = 0 ; i < _MODIDocument.Images.Count; i++)
    {
        int numOfCharacters = 0;
        int charactersHeights = 0;
        MODI.Image image = (MODI.Image)_MODIDocument.Images[i];
        MODI.Layout layout = image.Layout;
        // getting the page's words
        for (int j= 0; j< layout.Words.Count; j++)
        {
            MODI.Word word = (MODI.Word) layout.Words[j];
            // getting the word's characters
            for (int k = 0; k < word.Rects.Count; k++)
            {
                MODI.MiRect rect = (MODI.MiRect) word.Rects[k];
                charactersHeights  += rect.Bottom-rect.Top;
                numOfCharacters++;                        
            }
        }
        // guard against pages without any recognized characters
        float avHeight = numOfCharacters > 0
            ? (float)charactersHeights / numOfCharacters : 0f;
        statistic += "Page " + i + ": Average character height is: " +
                         avHeight.ToString("0.00") + " pixel!" + "\r\n";
    }
    MessageBox.Show("Document Statistic:\r\n"+statistic);
}
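
If only the plain text is needed, the page loop gets much shorter. The following sketch collects Layout.Text page by page and writes it to a text file. OcrTextExport, SaveText and the UTF-8 encoding are illustrative choices, not part of MODI, and the method assumes that OCR has already been run on the document.

```csharp
using System.IO;
using System.Text;

public static class OcrTextExport
{
    // Hypothetical helper: concatenates the recognized text of all pages
    // and writes it to a UTF-8 text file. Call it after OCR has finished.
    public static void SaveText(MODI.Document document, string path)
    {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < document.Images.Count; i++)
        {
            MODI.Image page = (MODI.Image)document.Images[i];
            // Layout.Text holds the full recognized text of one page
            text.AppendLine(page.Layout.Text);
        }
        File.WriteAllText(path, text.ToString(), Encoding.UTF8);
    }
}
```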

Searching

MODI also offers a full-featured built-in search. Since a document may contain several pages, you can use the search method to browse through the pages.

Screenshot - modiSearch.JPG

MODI offers several arguments to customize your search.

// convert our search dialog properties to corresponding MODI arguments
object PageNum = _DialogSearch.Properties.PageNum;
object WordIndex = _DialogSearch.Properties.WordIndex;
object StartAfterIndex = _DialogSearch.Properties.StartAfterIndex;
object Backward = _DialogSearch.Properties.Backward;
bool MatchMinus = _DialogSearch.Properties.MatchMinus;
bool MatchFullHalfWidthForm = _DialogSearch.Properties.MatchFullHalfWidthForm;
bool MatchHiraganaKatakana = _DialogSearch.Properties.MatchHiraganaKatakana;
bool IgnoreSpace =_DialogSearch.Properties.IgnoreSpace;

To use the search function, you need to create an instance of the type MiDocSearchClass and pass all search arguments to its Initialize method:

// initialize MODI search
MODI.MiDocSearchClass search = new MODI.MiDocSearchClass();
search.Initialize(
    _MODIDocument,
    _DialogSearch.Properties.Pattern,
    ref PageNum,
    ref WordIndex,
    ref StartAfterIndex,
    ref Backward,
    MatchMinus,
    MatchFullHalfWidthForm,
    MatchHiraganaKatakana,
    IgnoreSpace);

After the initialization call of the search instance, the process call itself is simple:

MODI.IMiSelectableItem SelectableItem = null;
// the one and only search call
search.Search(null,ref SelectableItem);

You will find the search results in the referenced SelectableItem argument. The MODI search has impressive features and works very well. Admittedly, it is restricted to plain-text search. In most real-world applications, you will need some kind of fuzzy searching, since your text results may be corrupted by individual OCR errors. But for a few lines of integration code, it is impressive functionality.
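
MODI itself does not offer fuzzy matching. One common way to tolerate single OCR errors is to compare words by their Levenshtein edit distance. The sketch below is a generic illustration of that technique and not part of the MODI API; FuzzyMatch, Distance, IsFuzzyMatch and the maxErrors threshold are names chosen here for illustration.

```csharp
using System;

public static class FuzzyMatch
{
    // Levenshtein edit distance: the minimal number of single-character
    // insertions, deletions and substitutions that turn a into b.
    public static int Distance(string a, string b)
    {
        int[] prev = new int[b.Length + 1];
        int[] curr = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) prev[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                curr[j] = Math.Min(Math.Min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            // reuse the two rows instead of allocating a full matrix
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.Length];
    }

    // True if word matches pattern with at most maxErrors edits, ignoring case.
    public static bool IsFuzzyMatch(string word, string pattern, int maxErrors)
    {
        return Distance(word.ToLowerInvariant(),
                        pattern.ToLowerInvariant()) <= maxErrors;
    }
}
```

Applied to each recognized word (for example the Text of every MODI.Word), IsFuzzyMatch("lnvoice", "Invoice", 1) would accept a word in which the OCR confused a capital I with a lowercase l.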

MODI, Office 2007 and Vista

Good news: both Office 2007 and Vista support MODI. It is not installed by default, but you can add the package through the installation options of Office 2007. Just rerun the setup.exe of your Office installation and choose the package as shown in the screenshot below.

Screenshot - modi_vista.jpg

About Document Processing

OCR is only one step in document processing. To make the information in paper-based documents properly accessible, several steps and techniques are usually required:

Scanning

Before documents are available as images, they have to be digitized. This process is called 'scanning.' There are two important standards used for interacting with the scanning hardware: TWAIN and WIA. There are (at least) two good articles on CodeProject on how to use these APIs.

Image Processing

Although the scanning devices are getting better, a couple of methods can be used to increase the image quality. These pre-processing functions include noise reduction and angle correction, for instance.
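
To give an idea of what such a pre-processing step looks like, here is a minimal sketch of one classic noise-reduction technique, a 3x3 median filter on a grayscale image. It is independent of MODI; the Preprocess class, the method name and the representation of the image as a byte[row, column] array are assumptions made for this illustration.

```csharp
using System;

public static class Preprocess
{
    // 3x3 median filter: every inner pixel is replaced by the median of its
    // neighbourhood, which removes isolated speckles. Border pixels are
    // copied unchanged. The source array is not modified.
    public static byte[,] Median3x3(byte[,] src)
    {
        int height = src.GetLength(0);
        int width = src.GetLength(1);
        byte[,] dst = (byte[,])src.Clone();
        byte[] window = new byte[9];

        for (int y = 1; y < height - 1; y++)
        {
            for (int x = 1; x < width - 1; x++)
            {
                int n = 0;
                for (int dy = -1; dy <= 1; dy++)
                    for (int dx = -1; dx <= 1; dx++)
                        window[n++] = src[y + dy, x + dx];
                Array.Sort(window);
                dst[y, x] = window[4]; // middle of the nine sorted values
            }
        }
        return dst;
    }
}
```

A single dark pixel on a white background disappears after one pass, while larger structures such as character strokes largely survive.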

OCR Itself

As a next step, OCR itself converts pixel-based images into layout and text elements. OCR can be called the 'highest' bottom-up technology, in which the system has little or no knowledge about the business context. Recognizing handwritten documents is often called ICR (Intelligent Character Recognition).

Document Classification

In most business cases, you have certain target structures you want to fill with the document information. That is called 'Document Classification and Detail Extraction.' For instance, you might want to process invoices, or you have certain table structures to fill. In Document Processing Part II, you can see how this kind of content knowledge can be used.

Beyond

After that, you might have an address database you want to match the document addresses with. Due to 'noisy' environments or disordered information, you need more sophisticated techniques than simple SQL. In the last step, the extracted information is given to the client application (like an ERP backbone) where customized workflow activities are triggered. The sector creates new names for that every couple of months: ECM (Enterprise Content Management), DMS (Document Management System), IDP (Intelligent Document Processing), DLC (Document Life Cycle).

Versions

  • 3 Apr 2007: Added Vista hints
  • 29 Sep 2006: Added search functions
  • 31 May 2005: Added references
  • 15 Apr 2005: Initial version

License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)

About the Author

Martin Welker
CEO Axonic Informationssysteme GmbH, Germany


Comments and Discussions

 
Question: axmodi reference (member ToN.FiER, 12 Dec '12 - 21:23)
I tried:
1. Delete reference
2. Add the Microsoft Office Document Imaging Viewer Control to the Toolbox. (c:\Program Files\Common Files\microsoft shared\MODI\12.0\MDIVWCTL.DLL)
3. Done
But AxMODI have not added
How to add him? :(
Question: win 7 and office 2010 (member kiquenet.com, 4 Oct '12 - 4:06)
I have vs 2010, using in win 7 and office 2010.
 
I have not found Microsoft Office Document Imaging 11.0 in COM references tab in VS.
 

I execute the demo but I get a "class not registered" error about AxMODI: HRESULT: 0x80040154 (REGDB_E_CLASSNOTREG)
 
any suggestions?
kiquenet.com

Answer: Re: win 7 and office 2010 (member hedgehoginahaze, 7 Nov '12 - 21:38)
Check http://support.microsoft.com/kb/982760[^].
 
Regards.
Question: Attempted to read or write protected memory. This is often an indication that other memory is corrupt. (member Vitaliy.NET, 26 Jul '12 - 1:51)
Form1.cs
 
line - ((System.ComponentModel.ISupportInitialize)(this.axMiDocView1)).EndInit();
 
Attempted to read or write protected memory. This is often an indication that other memory is corrupt. ??
Question: C# Simple Sample (member ZamirF, 22 Sep '11 - 8:16)
A very simple example can be found at:
http://zamirsblog.blogspot.com/2010/12/ocr-using-ms-office.html[^]
Question: unable to load in win7 64bit (member Member 4180550, 4 Sep '11 - 3:19)
The image viewer is unable to load in win7 64 bit. Please suggest a suitable solution.
Answer: Re: unable to load in win7 64bit (member hedgehoginahaze, 7 Nov '12 - 21:36)
In the project properties, set the target platform to x86. MODI is a 32-bit component.
Question: windows 7 unable to load Interop.modi.dll (member Member 4180550, 23 Aug '11 - 4:44)
tif image unable to load in win 7.
 
regards
Answer: Re: windows 7 unable to load Interop.modi.dll (member sefarkas, 25 Aug '11 - 7:57)
Remove the references to MODI that comes from the ZIP file associated with this article. Then, VS2010 menu Project --> References --> Add Reference and point to the DLLs in the BIN\DEbug folder that comes from the ZIP file. You only need two, the one for AxInterop.MODI and Interop.MODI. You can erase the third one.
General: Re: windows 7 unable to load Interop.modi.dll (member Member 4180550, 26 Aug '11 - 3:48)
COM exception was unhandled
 
Class not registered (Exception from HRESULT: 0x80040154 (REGDB_E_CLASSNOTREG))
 

The above exception is raised when I add the two dll only i.e. Axinterop.modi and Interop.modi.
General: Re: windows 7 unable to load Interop.modi.dll (member kiquenet.com, 4 Oct '12 - 4:11)
any solution about it ?
kiquenet.com

Question: Cannot run in Windows 7 and Vista! (member eric_klyuen, 1 Aug '11 - 22:05)
Hi Guy,
 
MODI object & You tools example is excellent to give us idea of sparing alot of time for typing!!
 
However, i find that it can not run in Window 7 or Vista, even i can debug with it on Visual Studio 2005. XP is fine yet it is not a commmon machine in our company.
 
Is it related to DEP? I try to add it into the exception list of running but fail.
Do you have any idea or work around on this?
Cheers
Thank you very much .
 
EricLun
Question: Loading images (member Member 4180550, 1 Aug '11 - 8:51)
Language: C#
How can I load images to the viewer of this control. Please suggest.
Question: Converting document to grayscale before OCR (member eljainc, 21 Jul '11 - 10:07)
Hello,
 
I'm using the MODI component for data extraction. I am able to do the OCR by loading
a TIFF image. However the image is in color and the extraction works better when the
numbers are in grayscale, not color. How can I (using the MODI library) convert the
image to grayscale before the OCR?
 
Thanks
Mike
Question: Hi Martin [modified] (member gokuldas, 7 Jul '11 - 1:39)
I tried your sample using VS 2010 and I am getting error that "Object not yet initialised and can't be used yet", when I call md.OCR() method.
try
{
MODI.Document d = new MODI.Document();
d.Create(@"c:\temp\sometext.tiff");

d.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, false, false);
d.Close();
MODI.Image image = (MODI.Image)d.Images[0];
MODI.Layout layout = image.Layout;
MODI.Word word = (MODI.Word)layout.Words[0];
Console.WriteLine(word.Text);
}
catch (Exception ex)
{
Console.WriteLine(ex.ToString());

}
Please advise what may be causing this error.
Regards,
Gokul

modified on Thursday, July 7, 2011 8:35 AM

Question: Zonal Recognition (member devange2, 20 Apr '11 - 15:17)
Hi,
I'm tring to solve this problem:
I have an array of areas (x,y,width,height) associated with an array of textBox by the same index, is it possible to make ocr on each area of the array and store the output text in the relative textbox?
 
Thank's
 
Best regards
 
Davide
General: A simple example (member ZamirF, 30 Dec '10 - 8:49)
For an extremely simple example check my blog at:
http://zamirsblog.blogspot.com/2010/12/ocr-using-ms-office.html[^]
Question: How can we perform OCR on specific page out of the multipage tiff image using MODI (member Member 2440175, 1 Nov '10 - 22:36)
Hi, I am having a multipage tiff image, around 31 pages, and I want to perform OCR on this tiff file, page wise. i.e. User opens particular page (say 5th page) and performs OCR on that page only. Anyone please let me know if this is possible using MODI ?
 
Thanks & Regards.
Answer: Re: How can we perform OCR on specific page out of the multipage tiff image using MODI (member sefarkas, 25 Aug '11 - 7:55)
Reprint the page of interest with the Microsoft Document Image Writer selecting only the page(s) you are interested in. Use the file that print driver saves as your source file.
General: using MODI OCR in custom application [modified] (member Member 2440175, 26 Oct '10 - 20:19)
Hi,
I am having my application in which I have to do some manual data entry from scanned images. Now I want to automate this using some OCR engine. So I wanted to know whether I can use MODI in my custom application ? Do I need to take care of any licensing issues ?
Waiting for response.
Thanks.

modified on Friday, October 29, 2010 4:13 AM

Question: How can I send text to word, using a MODI module? (member HotShot427 Sep '10 - 0:47)
I'm from russia. Sorry for my English.
How can I send text to MS Word, using a MODI module, can't find command.
MS Imaging interface has button send text to word.
General: Set X and Y rectangle coordinate location (member monkeynote, 18 Sep '10 - 9:01)
Hi!
 
I would like to ask if how can i set the X and Y rectangular coordinate? Is that possible?
General: MODI Required (member mmalkar, 3 Sep '10 - 2:48)
Is it required to have Microsoft Office Document Imaging installed? The server might not have MS Office installed. Isn't it possible to package MODI within the installer? What can we do to handle this problem?
Question: Exception HRESULT: 0x80040154 (REGDB_E_CLASSNOTREG) (member tirex200929 Jan '10 - 5:55)
Hello!
I have a problem .
throw Exception HRESULT: 0x80040154 (REGDB_E_CLASSNOTREG) in this line
 
((System.ComponentModel.ISupportInitialize)(this.axMiDocView1)).EndInit();
 

I have Windows 7 and Office 2007. Help me pleace.
Answer: Re: Exception HRESULT: 0x80040154 (REGDB_E_CLASSNOTREG) (member jgricci, 21 Oct '10 - 9:37)
I too am having this issue in Windows 7 w/ Visual Studio 2010 and office 2007 Document image viewer installed...
 
I tried regsvr32 the dll and still no luck.
 
Have you figured this out?
General: Re: Exception HRESULT: 0x80040154 (REGDB_E_CLASSNOTREG) (member jgricci, 21 Oct '10 - 10:10)
Figured out I needed to:
regsvr32 "C:\Program Files (x86)\Common Files\microsoft shared\MODI\12.0\MDIVWCTL.DLL"
 
and then set the project to 32-bit only (Ie, configure it to build for x86 instead of ANY CPU) then it works great!
 
Thanks!
Question: MODI 2003 and 2007 (member lmontanez, 29 Nov '09 - 22:15)
Is it possible avoid the incompatibility problems with MODI 2003 and 2007?
I mean, I am developing a small tool using MODI 2003.
But when I install the tool in a computer with MS 2007, it doesn't work, because the version are differents, how can avoid that, I mean, it works with ms 2003 and 2007
 
thanks a lot
General: MODI (member yerroju, 18 Nov '09 - 18:10)
Can we use MODI.dll without Office installed? Is there any way? If so, please suggest.
 
Thankyou
General: Re: MODI (member windrago, 2 Dec '09 - 13:47)
the EULA says that you can't. Legally.
 
amok

General: Security Patch (member PatilVL, 11 Nov '09 - 21:11)
Thanks for the information about the security patch (KB973507) that prevents loading the COM component in the VB toolbox.
But if I remove the said patch and use the MODI component, should the user of the software package also remove the security patch?
General: MS Security patch KB973507 blocks MODI Viewer DLL in COM environment (member olaf rappe, 20 Aug '09 - 0:47)
Hello,
 
we are developing applications using the preview/ocr functionality of MODI in VBA (MSAccess /VB6 enviroment) and have started to redesign these applications using VB dotnet.
Now we were informed by our customers that our application no longer works. We discovered that the MODI viewer control that was placed on a access form generates a runtime error: 'this control doesn't contain an Automation object'. It seemed that the control was no longer registered. But a reregistering with regsvr32 did not help.
When we tested a simple VB6 form with the viewer control on it we got the same result - runtime error.
I discovered that the problem was caused by MS Security Update KB973507 which provides a new atl.dll.
Workstations which received this security patch via autoupdate failed to run our application.
 
The only workaround we found was to uninstall the security update. The problem was immediately solved.
 
First tests in DOTNET endviroment show that the control seems to work there even with Security Update KB973507 installed.
 
So if you see strange behavior of your MODI application maybe you should have to uninstall KB973507 security patch.
 
Regards
 
Olaf Rappe
Visibelle IT Services
www.visibelle.de
General: Re: MS Security patch KB973507 blocks MODI Viewer DLL in COM environment (member PatilVL, 11 Nov '09 - 21:14)
Thanks for the information about the security patch (KB973507) that prevents loading the COM component in the VB toolbox.
But if I remove the said patch and use the MODI component, should the user of the software package also remove the security patch?
General: Re: MS Security patch KB973507 blocks MODI Viewer DLL in COM environment (member olaf rappe, 22 Nov '09 - 23:03)
The security patch has to be uninstalled on each client on which MODI is intended to run.
For a possible workaround and for the announcement of a hotfix for the security patch see:
http://social.msdn.microsoft.com/Forums/en-US/securelm/thread/4a33bf3e-096d-48bc-b716-b58c011faf58
 
Also note that MODI is deprecated for Office 2010 (32 and 64 bit) and is already missing in the Office 2010 beta:
http://bhandler.spaces.live.com/Blog/cns!70F64BC910C9F7F3!7040.entry
 
Regards
 
Olaf Rappe
Visibelle IT Services
General: More generic version... (member jarek.lukaszewicz, 19 Aug '09 - 7:15)
Hi!
 
I'd like to thank Martin for writing such a good article on MODI. There are currently two MS Office versions which provide MODI components and in my organization we use both msoffice versions. So I started to wondering if is it possible to write a generic version of this application which would "find" - using some kind of magic - MODI v11(MS office 2003) or v12 office 2007, without fixing it in project/solution code? What I need to do to get it work? Any help would be appreciated...
 
Regards,
Jarek
Answer: Re: More generic version... (member ondrejb, 26 Aug '09 - 22:25)
This should work:
As the reference is checked at the time it is used, you may create two custom libraries, one referencing v11 and the other referencing v12. These libraries would just provide a wrapper for the OCR functions; the first use of each of the library should be inside a try-catch block to detect whether it works (if not, you will just switch to use the other library).
 
An easier solution would be distributing the MODI v11 library with your app (this works), but there might be an issue with the license (you must insure that the app won't run on a machine without Office).
General: Demo DocumentProcessing1 doesn't give consistent results (member eljainc, 7 May '09 - 7:22)
Hello,
 
I have tried the DocumentProcessing1 demo from CodeProject. I have tried with some bitmaps that have the same general information in them. However sometimes when I try to define a region of interest that surrounds the characters, it fails to recognize anything.
 
Is there any setting that can be changed so that the OCR works more reliably. I wanted to post a couple of samples but there is not an option to add attachments to the posting.
 
Thanks
Mike
Question: Error Releasing COM object & RPC_E_SERVERFAULT (member foxbat_vv, 1 Apr '09 - 19:38)
Hi,
 
I am developing an automated application that uses Microsoft Office Document
Imaging (MODI) to perform OCR on a large number of TIFF files.
 
In the code, I have a loop that goes through all the tiff files that need to
be processed. On each run of the loop I call a function (see below) which
creates a new MODI.Document COM object to OCR the TIFF File. At the end of
the function I release the COM object.
 
Occasionally the automated application encounters one of the following issues:
 
1.Sometimes the TIFF file is not released by the OCR process, thus it can’t
be moved or deleted until the application closes.
 
2.When performing OCR on an image object, sometimes a RPC_E_SERVERFAULT
(code: -2147417851) exception is thrown.
 
When the application performs OCR on 60 TIFF files, usually 1 random file
will not be released when Marshal.FinalReleaseComObject is called.
 

How can I modify the code so that the errors mentioned above do not occur?
 
private void SimpleOCR(FileInfo objFileName)
{
try
{
//SetImage
m_objMODIDocument = new MODI.Document();
m_objMODIDocument.Create(objFileName.FullName);
 
try //Perform OCR
{
for (int i = 0; i < m_objMODIDocument.Images.Count; ++i)
{
MODI.IImage objImage = null;
try
{
objImage =
(MODI.IImage)(m_objMODIDocument.Images[i]);
objImage.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH,
true, true);
}
catch
{
LogFile.WriteLine("Cannot OCR page " + (i + 1));
}
finally
{
 
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(objImage);
objImage = null;
}
}
 
LogFile.WriteLine(objFileName.FullName + ", OCR OK! ");
}
catch (Exception ex)
{
LogFile.WriteLine(objFileName.FullName + ", OCR FAIL! EX = " + ex.Message);
}
 
m_objMODIDocument.Save(); // (This causes the object not to get released sometimes)
System.Threading.Thread.Sleep(1000);
 
m_objMODIDocument.Close(false);
 
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(m_objMODIDocument);
 
for (int i = 0; i < 5; ++i)
{
if (move(objFileName))
break;
else
{
System.Threading.Thread.Sleep(1000); // if cannot move the document sleep and try again
 
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(m_objMODIDocument);
GC.Collect();
GC.WaitForPendingFinalizers();
}
if (i == 4)
{
LogFile.WriteLine("The file was not released.");
}
}
 
m_objMODIDocument = null;
}
catch (Exception ex)
{
LogFile.WriteLine("The file was skipped. EX: " + ex.Message);
}
 

}
 

public bool move(FileInfo objFileName)
{
try //DeleteMoveFiles(objFileName, true);
{
System.IO.File.Move(objFileName.FullName,
DocScan.Properties.Settings.Default.OCRMoveTifsFolder + "\\" +
objFileName.Name);
LogFile.WriteLine(objFileName.FullName + " was successfully moved");
return true;
}
catch (Exception ex)
{
LogFile.WriteLine("*** WARNING: Could not Move or Delete file. (" + ex.Message + ") ");
return false;
}
}
 

Thanks,
 
Vagram
Answer: Re: Error Releasing COM object & RPC_E_SERVERFAULT (member foxbat_vv, 14 Apr '09 - 20:34)
After some further investigation I found that another component (not mentioned here) was accessing the TIFF files. That control's behavior caused the errors described above.
General: Re: Error Releasing COM object & RPC_E_SERVERFAULT (member hero8219 Aug '09 - 5:43)
Hi,
 
I have the same issue. Randomly, a RPC_E_SERVERFAULT (code: -2147417851) exception is thrown.
 
I already looked up if another process has a handle on the file, but there is none.
 
Has anyone experienced the same problem? (I use VS2008 and MODI12.0 on Win XP SP3)
 
Many thanks
Hendrik
General: Re: Error Releasing COM object & RPC_E_SERVERFAULT (member ondrejb, 26 Aug '09 - 22:04)
I've solved this by ensuring to properly release all references to COM objects:
 
Dim doc As MODI.Document = Nothing
Dim layout As MODI.ILayout = Nothing
Dim page As MODI.IImage = Nothing
Dim pages As MODI.IImages = Nothing
 
doc = New MODI.Document
doc.Create(TiffFile)
 
pages = doc.Images
For x = 0 To pages.Count - 1
 
   page = DirectCast(pages(x), MODI.IImage)
   page.OCR()
 
   layout = page.Layout
   ' work with the layout.Text property
   System.Runtime.InteropServices.Marshal.FinalReleaseComObject(layout)
 
   System.Runtime.InteropServices.Marshal.FinalReleaseComObject(page)
 
Next
 
doc.Close()
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(pages)
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(doc)

General: character recognition (member chandupatel, 5 Feb '09 - 23:34)
hello...
 
We would like to carry on our project regarding the character recognition. Could anybody get me the concerned details of techniques used for character recognition...plz.....
Question: MODI 2007 better than MODI 2003 for OCR?? (member ChadFolden129 Oct '08 - 11:43)
Anyone have any insight as to if MODI 2007 is any better than MODI 2003 for OCR? From what I can tell it almost seems like they are the same, but curious if anyone has any thoughts as to whether or not it's worth upgrading or not
Answer: Re: MODI 2007 better than MODI 2003 for OCR?? (member Sike Mullivan, 20 Feb '09 - 14:21)
MODI 2007 is the exact same thing as MODI 2003 with the exception of the version.
Question: Is it possible to OCR read only a zone of the picture?? (member andredani, 16 Oct '08 - 7:30)
Hi all!
Is it possible to only read for example in the right corner of the picture?
I'm trying to make a pattern of words. Or can MODI tell me where it found a word, as pixel height and width within the picture?
 
Waiting for an answer!!
 
Take care // André
Answer: Re: Is it possible to OCR read only a zone of the picture?? (member Sike Mullivan, 20 Feb '09 - 14:20)
Yes, MODI provides you with a rectangle object that describes its location.
General: Re: Is it possible to OCR read only a zone of the picture?? (member marionny, 27 May '10 - 3:32)
But it is possible to draw multiple rectangles like a template. I want only to ocr some areas of the page. Can I mark those with rectangles for example and then ocr on that areas? I want then to vocaly read with sapi those areas of the picture.
General: Re: Is it possible to OCR read only a zone of the picture?? (member Sike Mullivan, 27 May '10 - 3:54)
No... MODI will OCR the whole image. The only way to get around that is to crop the image and then run MODI against the cropped image.
General: Re: Is it possible to OCR read only a zone of the picture?? [modified] (member marionny, 27 May '10 - 7:00)
And can I get more portions of the picture assambled in one picture with blank parts and image parts? Or is there another way doing a document template? Or can I make white portions in the picture (and remain only what i need to ocr)?

modified on Thursday, May 27, 2010 3:13 PM

Question: Can I deal with the text OCRed? (member shkirin, 10 Sep '08 - 16:39)
Excuse me, can I deal with the OCRed text one by one according to this program.
Actually I want translate each piece of word recognized from English to Chinese, that is one part of my task
Can you help me with this problem?
Thank you!
 
It's me, Frank
modified on Wednesday, September 10, 2008 10:51 PM

Question: How do i get all the OCR reading to a txt file, without marking the picture? (member andredani, 20 Jul '08 - 21:26)
My question is: after I start the OCR on a picture, I have to select text in the picture to copy it to the clipboard.
I want it to be automatic, using a StreamWriter right after the OCR has finished, so that all the text found in the picture is saved to a file.
 
Please help me with this!!
 
//Thanks!!

Article Copyright 2005 by Martin Welker
Everything else Copyright © CodeProject, 1999-2013