Recently I have been working on a project that interoperates with the COM interfaces provided by the Microsoft Office Document Imaging (MODI) component. (Yes, I know it is deprecated, but nothing can replace it at the moment.)
Several articles on CodeProject have discussed this component (a good one is OCR with Microsoft® Office), yet they leave gaps between a simple introduction and practical use.
To make MODI work, we have to accept that it only processes TIFF images, so the image has to be converted to a temporary TIFF file and then processed by MODI's Optical Character Recognition (OCR) engine. Afterwards, the temporary TIFF file should be removed from the hard drive.
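The conversion step itself is not shown in this article. A minimal sketch of one way to do it, assuming the source image is in a format that System.Drawing can load; the class name TiffConverter and the method name SaveAsTempTiff are made up for illustration:

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

// Hypothetical helper, not part of MODI or of this article's original code.
static class TiffConverter
{
    // Saves a copy of the source image as a TIFF file in the user's
    // temporary folder and returns the path of the new file.
    public static string SaveAsTempTiff(string sourceImagePath)
    {
        string tiffPath = Path.Combine(
            Path.GetTempPath(),
            Path.GetRandomFileName() + ".tif");

        // Dispose the loaded image so that .NET itself holds no handle
        // on the source file after the conversion.
        using (Image source = Image.FromFile(sourceImagePath))
        {
            source.Save(tiffPath, ImageFormat.Tiff);
        }

        return tiffPath;
    }
}
```

The returned path is what would be passed to MODI and deleted once the OCR is finished.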
The cleanup step is where you can lose some time, and that is what this article is about. The following code demonstrates the problem, which is very common among programmers working with MODI or other COM objects for the first time.
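// Open the temporary TIFF file and run OCR on it.
MODI.Document doc = new MODI.Document();
doc.Create("temp.tif");
doc.OCR(MODI.MiLANGUAGES.miLANG_SYSDEFAULT, true, true);
MODI.Image image = (MODI.Image)doc.Images[0];
MODI.Layout layout = image.Layout;
// Try to release the document and reclaim its resources.
doc.Close(false);
Marshal.FinalReleaseComObject(doc);
doc = null;
GC.Collect();
// Throws: the file is still reported as being in use.
File.Delete("temp.tif");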
The code above instantiates a COM object, doc, works with its properties, and finally tries to release it by calling Marshal.FinalReleaseComObject on the MODI instance (doc), setting the reference to null, and even calling GC.Collect, hoping that the resources it consumed will be fully released.
However, the last line of code still throws an exception telling you that "temp.tif" is being used by "another process" and cannot be deleted. Obviously, no other process is working with that darned file.
The problem is that the two objects image and layout, of type MODI.Image and MODI.Layout, look like mere properties of the doc object but are actually COM objects in their own right. Releasing only the instantiated COM object doc is not enough; we also have to release the other COM objects obtained from its property accessors. So the way to resolve the file-lock issue is to release those two objects as well, as the following code shows.
doc.Close(false);
Marshal.FinalReleaseComObject(doc);
// Also release the COM objects obtained through property access.
Marshal.FinalReleaseComObject(image);
Marshal.FinalReleaseComObject(layout);
doc = null;
File.Delete("temp.tif");
After that, the file handle to "temp.tif" held by MODI is released, and the File.Delete method runs happily without complaining that the file is in use.
In production code, the COM object release code can be placed in the finally section of a try...catch...finally block to ensure that the resources are released even when an exception is thrown.
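A sketch of what that might look like, assuming the project references the MODI interop assembly as the snippets above do. The class name ModiOcr, the method RecognizeAndDelete and the Release helper are invented for illustration. The helper skips null references because a variable may still be unassigned when the finally block runs, and Marshal.FinalReleaseComObject throws when it is given null.

```csharp
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical wrapper; requires a reference to the MODI type library.
static class ModiOcr
{
    // Hypothetical helper: releases the object only if it is a live COM reference.
    static void Release(object comObject)
    {
        if (comObject != null && Marshal.IsComObject(comObject))
        {
            Marshal.FinalReleaseComObject(comObject);
        }
    }

    // Runs OCR on the first page of the TIFF file, then deletes the file.
    public static string RecognizeAndDelete(string tiffPath)
    {
        MODI.Document doc = null;
        MODI.Image image = null;
        MODI.Layout layout = null;
        try
        {
            doc = new MODI.Document();
            doc.Create(tiffPath);
            doc.OCR(MODI.MiLANGUAGES.miLANG_SYSDEFAULT, true, true);
            image = (MODI.Image)doc.Images[0];
            layout = image.Layout;
            return layout.Text;
        }
        finally
        {
            try
            {
                if (doc != null) doc.Close(false);
            }
            finally
            {
                // Release every COM reference, even if Close itself failed.
                Release(layout);
                Release(image);
                Release(doc);
                // Only now is the file expected to be free for deletion.
                File.Delete(tiffPath);
            }
        }
    }
}
```

A catch section can be added between try and finally if the caller wants to log or translate the exception. The cleanup order stays the same either way.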
A side note:
If you are used to chained property access in .NET, such as obj.PropertyA.PropertyB, you had better avoid it when interoperating with COM. The intermediate object obj.PropertyA is "hidden" as an anonymous reference to a COM object, and it might never get a chance to be released. The workaround is to declare a variable explicitly to hold the reference to the intermediate object and call Marshal.FinalReleaseComObject on that reference when the object is no longer needed.
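To illustrate with MODI, here is a sketch that splits a chain into named steps. By the same reasoning, the collection returned by doc.Images is itself an intermediate COM object, so this version holds it in a variable too. The class and method names are hypothetical, and the type name MODI.Images is assumed to be what the interop assembly exposes for that collection.

```csharp
using System.Runtime.InteropServices;

// Hypothetical example; requires a reference to the MODI type library.
static class ChainedAccessExample
{
    // Reads the recognized text of the first page of an already OCR'ed document.
    public static string ReadFirstPageText(MODI.Document doc)
    {
        // Chained form to avoid:
        //   string text = ((MODI.Image)doc.Images[0]).Layout.Text;
        // It leaves three COM references with no variable to release them by.
        MODI.Images images = null;   // assumed interop type name
        MODI.Image image = null;
        MODI.Layout layout = null;
        try
        {
            images = doc.Images;
            image = (MODI.Image)images[0];
            layout = image.Layout;
            return layout.Text;
        }
        finally
        {
            // Release each intermediate reference that was actually obtained.
            if (layout != null) Marshal.FinalReleaseComObject(layout);
            if (image != null) Marshal.FinalReleaseComObject(image);
            if (images != null) Marshal.FinalReleaseComObject(images);
        }
    }
}
```

The caller still owns doc and remains responsible for closing and releasing it.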
P.S. I am also new to COM inter-operation in .NET. Please correct me if anything above is wrong.