How to read and write text from an image without using MODI and OCR

Question

Hi,

I have written code that extracts the text from an image and writes it to a text file using MODI and OCR. The problem is that it requires installing Microsoft Office 2007 and adding MDIVWCTL.DLL. I don't want to use OCR or MODI. Please help me extract the text lines from a given image directly, without using an OCR method, and write them to a .txt file.

Code below:

using System;
using System.IO;

public static void CheckFileType(string directoryPath)
{
    foreach (string file in Directory.GetFiles(directoryPath))
    {
        // process .jpg files only; compare the extension case-insensitively
        string fileExtension = Path.GetExtension(file);
        if (!string.Equals(fileExtension, ".jpg", StringComparison.OrdinalIgnoreCase))
            continue;

        try
        {
            MODI.Document md = new MODI.Document();
            md.Create(file);
            md.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);
            MODI.Image image = (MODI.Image)md.Images[0];

            //create text file with the same Image file name
            //(FileMode.CreateNew throws if the .txt file already exists)
            string textPath = Path.ChangeExtension(file, ".txt");
            using (FileStream createFile = new FileStream(textPath, FileMode.CreateNew))
            using (StreamWriter writeFile = new StreamWriter(createFile))
            {
                //save the image text in the text file
                writeFile.Write(image.Layout.Text);
            }
        }
        catch (Exception exc)
        {
            //uncomment the below code to see the expected errors
            //MessageBox.Show(exc.Message,
            //"OCR Exception",
            //MessageBoxButtons.OK, MessageBoxIcon.Information);
        }
    }
}

Posted 3-Apr-15 0:44am

Kamal Kannan

Comments

Maciej Los 3-Apr-15 6:58am

What is MDIVWCTL.DLL? It uses MODI. So, how do you want to use it without MODI?

Sascha Lefèvre 3-Apr-15 7:03am

Look for an alternative OCR solution.

Kamal Kannan 3-Apr-15 7:05am

MODI only becomes available after installing Microsoft Office 2007, and that is the only way I get the DLL (MDIVWCTL.DLL). I don't want to install anything on my client's machine. The application itself should read the text from the image.

1 solution

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

OriginalGriff · Answer 1 · 2015-04-03T00:56:00

You cannot extract text as text from an image without using OCR in some form. OCR looks at the pixels of the image, determines which ones are parts of letters, then works out which letters they form, and finally which words. Without some form of OCR, all you have is individual pixels with no textual meaning at all.
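
To make that concrete, here is a toy sketch of the pixels-to-letters step, written in C# to match the question. Everything in it is an assumption made for illustration: the ToyOcr class, the three hand-drawn 3x5 letter templates, and the fixed layout of one blank column between letters. A real OCR engine has to cope with noise, fonts, scaling and skew, so treat this only as a demonstration that any code turning pixels into characters is already OCR in some form.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ToyOcr
{
    // Hypothetical 3x5 letter templates: '#' is an ink pixel, '.' is background.
    static readonly Dictionary<char, string[]> Templates = new Dictionary<char, string[]>
    {
        { 'I', new[] { "###", ".#.", ".#.", ".#.", "###" } },
        { 'L', new[] { "#..", "#..", "#..", "#..", "###" } },
        { 'T', new[] { "###", ".#.", ".#.", ".#.", ".#." } },
    };

    // Count the pixels where a glyph and a template agree.
    static int Score(string[] glyph, string[] template)
    {
        int score = 0;
        for (int y = 0; y < template.Length; y++)
            for (int x = 0; x < template[y].Length; x++)
                if (glyph[y][x] == template[y][x]) score++;
        return score;
    }

    // Return the template letter that agrees with the glyph on the most pixels.
    public static char Recognize(string[] glyph)
    {
        char best = '?';
        int bestScore = -1;
        foreach (KeyValuePair<char, string[]> entry in Templates)
        {
            int score = Score(glyph, entry.Value);
            if (score > bestScore) { bestScore = score; best = entry.Key; }
        }
        return best;
    }

    // Cut the pixel grid into 3-pixel-wide glyphs (one blank column between
    // them) and classify each glyph in turn.
    public static string Read(string[] image)
    {
        StringBuilder text = new StringBuilder();
        for (int x = 0; x + 3 <= image[0].Length; x += 4)
        {
            string[] glyph = new string[image.Length];
            for (int y = 0; y < image.Length; y++)
                glyph[y] = image[y].Substring(x, 3);
            text.Append(Recognize(glyph));
        }
        return text.ToString();
    }

    public static void Main()
    {
        // An "image" that is nothing but pixels until it is classified.
        string[] image =
        {
            "#...###.###",
            "#....#...#.",
            "#....#...#.",
            "#....#...#.",
            "###.###..#.",
        };
        Console.WriteLine(Read(image)); // prints "LIT"
    }
}
```

Until Recognize runs, the grid is just rows of on/off pixels. The text only exists once something compares those pixels against known letter shapes, which is exactly the job MODI's OCR call does in the question's code.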