Hi,

I have written code that extracts the text from an image and writes it to a text file using MODI's OCR. The problem is that it requires installing Microsoft Office 2007 to get MDIVWCTL.DLL. I don't want to use OCR or MODI. Please help me extract the lines of text from a given image directly, without using an OCR method, and write them to a .txt file.

Code below:

using System;
using System.IO;

public static void CheckFileType(string directoryPath)
{
    foreach (string filePath in Directory.GetFiles(directoryPath))
    {
        // Only process JPEG files; compare the extension case-insensitively.
        string fileExtension = Path.GetExtension(filePath);
        if (!string.Equals(fileExtension, ".jpg", StringComparison.OrdinalIgnoreCase))
        {
            continue;
        }

        try
        {
            MODI.Document md = new MODI.Document();
            md.Create(filePath);
            md.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);
            MODI.Image image = (MODI.Image)md.Images[0];

            // Create a text file with the same name as the image file.
            // Path.ChangeExtension only touches the trailing extension, unlike
            // string.Replace, which would also alter ".jpg" elsewhere in the path.
            string textPath = Path.ChangeExtension(filePath, ".txt");
            using (FileStream createFile = new FileStream(textPath, FileMode.CreateNew))
            using (StreamWriter writeFile = new StreamWriter(createFile))
            {
                // Save the recognized text in the text file.
                writeFile.Write(image.Layout.Text);
            }
        }
        catch (Exception exc)
        {
            // Errors are silently ignored here.
            // Uncomment the code below to see the expected errors.
            //MessageBox.Show(exc.Message,
            //    "OCR Exception",
            //    MessageBoxButtons.OK, MessageBoxIcon.Information);
        }
    }
}
Comments
Maciej Los 3-Apr-15 6:58am    
What is MDIVWCTL.DLL? It uses MODI. So, how do you want to use it without MODI?
Sascha Lefèvre 3-Apr-15 7:03am    
Look for an alternative OCR solution.
Kamal Kannan 3-Apr-15 7:05am    
I only get MODI, and with it the DLL (MDIVWCTL.DLL), after installing Microsoft Office 2007. I don't want to install anything on my client's machine. The application itself should read the text from the image.

1 solution

You cannot extract text as text from an image without using OCR in some form. OCR looks at the pixels of the image, determines which ones are parts of letters, works out which letters they are, and finally assembles them into words. Without some form of OCR, all you have is individual pixels with no textual meaning at all.
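
To illustrate the point, here is a minimal toy sketch of that pixel-to-letter step. Everything in it is hypothetical and made up for this example: the `ToyOcr` class, the three 3x5 letter templates, and the assumption that the "image" is a grid of `#` (ink) and `.` (background) characters with fixed-width letters separated by one blank column. It is not a usable OCR engine, only a demonstration that any code turning pixels into characters is already doing OCR.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class ToyOcr
{
    // Hypothetical 3x5 letter templates: '#' is an ink pixel, '.' is background.
    private static readonly Dictionary<char, string[]> Templates =
        new Dictionary<char, string[]>
        {
            { 'H', new[] { "#.#", "#.#", "###", "#.#", "#.#" } },
            { 'I', new[] { "###", ".#.", ".#.", ".#.", "###" } },
            { 'L', new[] { "#..", "#..", "#..", "#..", "###" } },
        };

    // Assumes 5 rows of equal length holding 3-pixel-wide glyphs,
    // each separated from the next by one blank column.
    public static string Recognize(string[] rows)
    {
        const int glyphWidth = 3;
        var text = new StringBuilder();
        for (int x = 0; x + glyphWidth <= rows[0].Length; x += glyphWidth + 1)
        {
            // Cut one glyph-sized window out of the pixel grid.
            string[] glyph = rows.Select(r => r.Substring(x, glyphWidth)).ToArray();

            // Pick the template that agrees with the window on the most pixels.
            char best = Templates
                .OrderByDescending(t => Score(t.Value, glyph))
                .First().Key;
            text.Append(best);
        }
        return text.ToString();
    }

    // Counts the pixels on which the template and the glyph window agree.
    private static int Score(string[] template, string[] glyph)
    {
        int same = 0;
        for (int y = 0; y < template.Length; y++)
            for (int x = 0; x < template[y].Length; x++)
                if (template[y][x] == glyph[y][x]) same++;
        return same;
    }

    public static void Main()
    {
        string[] image =
        {
            "#.#.###",
            "#.#..#.",
            "###..#.",
            "#.#..#.",
            "#.#.###",
        };
        Console.WriteLine(Recognize(image)); // prints "HI"
    }
}
```

Real images add noise, varying fonts, skew, and unknown letter positions, which is why a full OCR engine is needed in practice. If the goal is only to avoid installing Office on the client machine, the option is an OCR component that can be shipped with the application, not avoiding OCR altogether.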
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


