Parse, understand and demystify Enhanced Meta Files (EMF) with C#





A simple approach to inspect Enhanced Meta File (EMF) content and find/fix inconsistencies. Easy to adapt to your specific purpose.
Download EmfProcessor.zip Visual Studio C# solution, including source and debug binary
Introduction
I was searching for an inconsistency in EMF display between two files. After a lot of testing I found the surprising reason behind the obscure behaviour - and it was very easy to fix.
With this article I want to share my findings regarding the EMF/EMF+ format and provide an application that can be a very good starting point for your EMF inspection too.
Background
My starting point
My problem was "missing text" in one of two very similar EMF files. Both files were generated with Windows drawing commands.
The next two images illustrate the problem - some text is missing within the second image. The images are displayed by Windows Paint.
An import of the second EMF file to Microsoft PowerPoint shows the same result - missing texts.
The next two images illustrate the same two files - this time displayed by Microsoft Internet Explorer 11.
An import of the second EMF file to Microsoft PowerPoint and subsequent un-grouping of the EMF drawing elements shows the same result - visible texts.
First finding
Windows Paint just plays EMF/EMF+ files with GDI+. Microsoft Internet Explorer 11 does something different.
In the course of many further experiments, the different display quality led to the assumption that
- GDI+ ignores the actual EMF drawing commands and plays the EMF+ comment records while
- IE ignores the EMF+ comment records and plays the actual EMF drawing commands.
The assumption is mainly based on the display quality of text - specifically the missing kerning when displayed by Microsoft Internet Explorer 11.
Using the code
EMF/EMF+ is a binary representation of a Windows drawing command sequence. Every drawing command is represented by a record.
Microsoft provides good documentation of EMF and EMF+. There are alternative sources of information too.
The sample application is a Windows.Forms .NET application that implements only a part of the EMF specification.
- But this was enough to find/fix the inconsistencies I was searching for.
- And it is easy to expand.
The next two images illustrate the sample application in action.
The sample application can
- read an EMF/EMF+ file (via menu: File | Open),
- present the drawing command sequence (EMF/EMF+ records) on tab Data and
- draw the image on tab Image.
The sample application consists of seven source code files:
- Program.cs - the application initialization code (nothing of interest - created by the wizard)
- Form1.resx - the GUI design (nothing of interest - created by the wizard and maintained by the forms designer)
- Form1.Designer.cs - the GUI initialization code (nothing of interest - maintained by the forms designer)
- Form1.cs - the GUI callbacks and subordinated methods
- EmfStructures.cs - ready-to-use EMF declarations
- EmfPlusStructures.cs - ready-to-use EMF+ declarations
- EmfByteConverters.cs - helper class similar to System.BitConverter
Form1.cs
The sample application does not use the .NET System.Drawing.Graphics.EnumerateMetafileProc delegate, because it doesn't provide all the information that is required to demystify the EMF/EMF+ format. Instead the sample application recreates the Win32 API int CALLBACK EnumMetaFileProc(...).
To realize this, a binary reader is used and ParseMetaCB(...) is called for every record (see menu item callback openToolStripMenuItem_Click(...)).
...
using (FileStream readerFS = File.Open(ofd.FileName, FileMode.Open))
{
    using (BinaryReader br = new BinaryReader(readerFS))
    {
        ...
        br.ReadBytes(...);
        ...
        ParseMetaCB(...);
        ...
    }
}
...
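The elided reading loop can be sketched as follows. This is a minimal sketch, not the code from the download: the names RawEmfRecord and EmfRecordReader are hypothetical. It relies only on the fact that every EMF record starts with a 4-byte type and a 4-byte total size (both little-endian, the size including these 8 bytes), which matches the TotalLength/DataLength pairs in the decoded output shown later in this article.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical container for one raw record (not part of the sample application).
public sealed class RawEmfRecord
{
    public uint Type;      // record type, e.g. 1 = EMR_HEADER, 70 = EMR_GDICOMMENT
    public uint Size;      // total record size in bytes, including the 8-byte prefix
    public long Position;  // stream offset of the record (informative only)
    public byte[] Data;    // payload, Size - 8 bytes
}

public static class EmfRecordReader
{
    private const uint EmrEof = 14; // EMR_EOF terminates the record sequence

    // Reads all records from a seekable stream; the stream is left open.
    public static List<RawEmfRecord> ReadAll(Stream input)
    {
        var records = new List<RawEmfRecord>();
        var br = new BinaryReader(input, Encoding.UTF8, true); // BinaryReader reads little-endian

        while (input.Position + 8 <= input.Length)
        {
            long pos = input.Position;
            uint type = br.ReadUInt32();
            uint size = br.ReadUInt32();

            // A record can never be smaller than its own prefix or run past the end.
            if (size < 8 || pos + size > input.Length)
                throw new InvalidDataException("Implausible record size at offset " + pos);

            byte[] data = br.ReadBytes((int)size - 8);
            records.Add(new RawEmfRecord { Type = type, Size = size, Position = pos, Data = data });

            if (type == EmrEof)
                break;
        }
        return records;
    }
}
```

A call such as EmfRecordReader.ReadAll(File.OpenRead(path)) then yields one entry per record, which is the point where a callback like ParseMetaCB(...) would be invoked.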
The drawing command sequence (EMF/EMF+ records) is written to a StringBuilder and displayed by a TextBox on tab Data, as well as drawn as an image and displayed by a PictureBox on tab Image.
Besides this, the sample application can also write the drawing command sequence (EMF/EMF+ records), unmodified or modified, to another file.
...
using (FileStream writerFS = File.Open(ofd.FileName.Replace(".", "_."), FileMode.Create))
{
    using (BinaryWriter bw = new BinaryWriter(writerFS))
    {
        ...
        bw.Write(...);
        ...
    }
}
...
There are three ways to modify the drawing command sequence (EMF/EMF+ records) that is written:
1. Skip a complete record. This is realized by:
List<int> _recordsToSuppress;
2. Set the byte[] returned from ParseMetaCB(...) to null.
byte[] response = ParseMetaCB(recordCount, type, pos, length, /* is header */ 1, recordData, log);
if (response != null && (recordData.Length == 0 || response.Length > 0))
...
3. Manipulate the byte[] returned from ParseMetaCB(...) within the method. E.g.:
BitConverterWithLimitCheck.SetBytesFromUInt32(data,
    emfPlusRecordStart + /* comment identifier length */ 4 + emfPlusDrawString.Index_BrushIDOrColor,
    (UInt32)Color.FromArgb(0xFF, Color.FromArgb((int)plusDrawString.BrushIDOrColor)).ToArgb());
This means the ParseMetaCB(...) method realizes parsing/interpretation of the EMF/EMF+ records (written to a StringBuilder) as well as an optional manipulation of the EMF/EMF+ records.
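The first way (skipping complete records) can be illustrated in isolation. The following is a rough sketch under assumptions: EmfRecordFilter and its Copy method are hypothetical names, and the record index is assumed to be zero-based like the record numbers in the decoded output. The sketch copies the 8-byte prefix and payload of every record that is not suppressed. It does not touch the byte count and record count fields stored in the EMF header, so whether a given viewer accepts the result is not verified here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not part of the sample application).
public static class EmfRecordFilter
{
    private const uint EmrEof = 14; // EMR_EOF terminates the record sequence

    // Copies records from input to output and drops every record whose
    // zero-based index is listed in recordsToSuppress.
    // Returns the number of records written.
    public static int Copy(Stream input, Stream output, ICollection<int> recordsToSuppress)
    {
        var br = new BinaryReader(input);
        var bw = new BinaryWriter(output);
        int index = 0;
        int written = 0;

        while (input.Position + 8 <= input.Length)
        {
            long pos = input.Position;
            uint type = br.ReadUInt32();
            uint size = br.ReadUInt32();
            if (size < 8 || pos + size > input.Length)
                throw new InvalidDataException("Implausible record size at offset " + pos);

            byte[] data = br.ReadBytes((int)size - 8);

            if (!recordsToSuppress.Contains(index))
            {
                // Unmodified pass-through: type, total size, payload.
                bw.Write(type);
                bw.Write(size);
                bw.Write(data);
                written++;
            }

            index++;
            if (type == EmrEof)
                break;
        }

        bw.Flush();
        return written;
    }
}
```

Passing an empty collection produces a byte-identical copy, which is a convenient sanity check before any record is suppressed or manipulated.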
/// <summary>Parse EMF binary record.</summary>
/// <param name="recordCount">The count of the EMF record currently to parse.</param>
/// <param name="recordType">The type of EMF record currently to parse.</param>
/// <param name="filePos">The file position. Informative only! Do not use as indexer.</param>
/// <param name="length">The length in Byte of the EMF record currently to parse.</param>
/// <param name="flags">Flags to determine different parsing behavior for the same record type.</param>
/// <param name="data">The <see cref="Byte"/> array of the EMF record currently to parse.</param>
/// <param name="log">The <see cref="StringBuilder"/> to write log messages to.</param>
/// <returns>Returns <see cref="Byte"/> array to accept for the output stream.</returns>
private byte[] ParseMetaCB(int recordCount, EmfPlusRecordType recordType, Int64 filePos,
Int32 length, Int32 flags, byte[] data, StringBuilder log)
{
...
}
The ParseMetaCB(...) method currently handles this subset of EMF record types:
EmfPlusRecordType.EmfHeader
EmfPlusRecordType.EmfEof
EmfPlusRecordType.EmfSetBkMode
EmfPlusRecordType.EmfSetPolyFillMode
EmfPlusRecordType.EmfSetTextAlign
EmfPlusRecordType.EmfSetTextColor
EmfPlusRecordType.EmfSaveDC
EmfPlusRecordType.EmfRestoreDC
EmfPlusRecordType.EmfModifyWorldTransform
EmfPlusRecordType.EmfSelectObject
EmfPlusRecordType.EmfCreateBrushIndirect
EmfPlusRecordType.EmfDeleteObject
EmfPlusRecordType.EmfSetMiterLimit
EmfPlusRecordType.EmfGdiComment
EmfPlusRecordType.EmfBitBlt
EmfPlusRecordType.EmfExtCreateFontIndirect
EmfPlusRecordType.EmfExtTextOutW
EmfPlusRecordType.EmfPolyBezier16
EmfPlusRecordType.EmfPolygon16
EmfPlusRecordType.EmfExtCreatePen
EmfPlusRecordType.EmfSetIcmMode
EmfPlusRecordType.EmfMax
After a lot of testing I found out that EmfPlusRecordType.EmfGdiComment is of special interest. This record type can be divided into these GDI comment record sub-types:
- EmfPlus
- EmfSpool
- EmfPublic
The GDI comment record sub-type EmfPlus can itself contain many EMF+ comment records. Currently the ParseMetaCB(...) method handles:
RecordType.EmfPlusHeader
RecordType.EmfPlusObject
RecordType.EmfPlusDrawRects
RecordType.EmfPlusFillRects
RecordType.EmfPlusDrawLines
RecordType.EmfPlusFillPolygon
RecordType.EmfPlusDrawEllipse
RecordType.EmfPlusFillEllipse
RecordType.EmfPlusDrawString
RecordType.EmfPlusEndOfFile
RecordType.EmfPlusTranslateWorldTransform
RecordType.EmfPlusScaleWorldTransform
RecordType.EmfPlusSetClipRect
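How several EMF+ comment records sit inside one EmfGdiComment record can be sketched as follows. This is an illustrative sketch, not the parsing code of the sample application; EmfPlusRecordInfo and EmfPlusCommentParser are hypothetical names. The layout assumed here is my reading of the EMF/EMF+ documentation: the comment payload starts with a 4-byte data size and the 4-byte identifier "EMF+", followed by EMF+ records that each begin with a 12-byte header (2-byte type, 2-byte flags, 4-byte size, 4-byte data size). This agrees with the RecordSize/DataSize pairs in the decoded output of this article, which always differ by 12.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical description of one EMF+ record found inside a GDI comment.
public sealed class EmfPlusRecordInfo
{
    public ushort Type;    // e.g. 0x4001 = EmfPlusHeader, 0x401C = EmfPlusDrawString
    public ushort Flags;
    public uint Size;      // total record size including the 12-byte header
    public uint DataSize;  // size of the record-specific data
    public int Offset;     // start of the record within the comment payload
}

public static class EmfPlusCommentParser
{
    // The four ASCII characters "EMF+" read as a little-endian UInt32.
    public const uint EmfPlusIdentifier = 0x2B464D45;

    // commentData is the payload of an EmfGdiComment record, i.e. everything
    // after the 8-byte EMF record prefix.
    // BitConverter uses the machine byte order; this sketch assumes little-endian.
    public static List<EmfPlusRecordInfo> Parse(byte[] commentData)
    {
        var result = new List<EmfPlusRecordInfo>();
        if (commentData == null || commentData.Length < 8)
            return result;
        if (BitConverter.ToUInt32(commentData, 4) != EmfPlusIdentifier)
            return result; // another comment sub-type, e.g. EmfSpool or EmfPublic

        int offset = 8; // skip data size and comment identifier
        while (offset + 12 <= commentData.Length)
        {
            ushort type = BitConverter.ToUInt16(commentData, offset);
            ushort flags = BitConverter.ToUInt16(commentData, offset + 2);
            uint size = BitConverter.ToUInt32(commentData, offset + 4);
            uint dataSize = BitConverter.ToUInt32(commentData, offset + 8);

            // Stop on records that are too small or run past the payload.
            if (size < 12 || size > commentData.Length - offset)
                break;

            result.Add(new EmfPlusRecordInfo
            {
                Type = type, Flags = flags, Size = size, DataSize = dataSize, Offset = offset
            });
            offset += (int)size;
        }
        return result;
    }
}
```

An empty result means the comment carries no EMF+ records. A result containing type 0x4001 marks the comment that holds the EmfPlusHeader.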
Second finding
If GDI+ plays an EMF/EMF+ file, a present RecordType.EmfPlusHeader record forces GDI+ to ignore the actual EMF drawing commands and to play the EMF+ comment records, no matter whether every actual EMF drawing command has a related EMF+ comment record.
If GDI+ plays an EMF/EMF+ file, an absent RecordType.EmfPlusHeader record forces GDI+ to play the actual EMF drawing commands and to ignore the EMF+ comment records, no matter whether EMF+ comment records are available or not.
A decoded drawing command sequence sample
Here are the first ~60 lines of decoded drawing commands.
Horizontal-Resolution = 72,04874
Vertical-Resolution = 72,24888
Frame-Width = 24514,25
Frame-Height = 12484,16
Bounds-Width = 695pix
Bounds-Height = 355pix
■ 000-EmfHeader: At=0, TotalLength=108Bytes, DataLength=100Bytes
Bounds: Lft=0, Top=0, Rgt=695, Btm=354
Frame: Lft=0, Top=0, Rgt=24479, Btm=12449
Device: Wdt=1024, Hgh=768
Millimeters: Wdt=361, Hgh=270
Version: Major=1, Minor=0
◙ 001-EmfGdiComment: At=108, TotalLength=44Bytes, DataLength=36Bytes
► EmfPlusCommentIdentifier: EMF+
● 00 EmfPlusCommentType: EmfPlusHeader
Size: RecordSize=28, DataSize=16
Resolution: DpiX=96, DpiY=96
Meta-data: Signature=DBC01, GraphicsVersion=GraphicsVersion1_1
◙ 002-EmfGdiComment: At=152, TotalLength=112Bytes, DataLength=104Bytes
► EmfPlusCommentIdentifier: EMF+
● 00 EmfPlusCommentType: EmfPlusTranslateWorldTransform
Size: RecordSize=20, DataSize=8
Transform: Dx=100, Dy=100
● 01 EmfPlusCommentType: EmfPlusScaleWorldTransform
Size: RecordSize=20, DataSize=8
Transform: Sx=0,2834646, Sy=0,2834646
● 02 EmfPlusCommentType: EmfPlusSetClipRect
Size: RecordSize=28, DataSize=16
Clip: X=-100, Y=-100, Wdt=2451, Hgt=1248
● 03 EmfPlusCommentType: EmfPlusFillRects
Size: RecordSize=28, DataSize=16
Color: #[FF]FFFFFF
Count: 1
Rectangle: X=-100, Y=-100, Wdt=2451, Hgt=1248
■ 003-EmfSaveDC: At=264, TotalLength=8Bytes, DataLength=0Bytes
■ 004-[EmfSetIcmMode]: At=272, TotalLength=12Bytes, DataLength=4Bytes
■ 005-EmfModifyWorldTransform: At=284, TotalLength=36Bytes, DataLength=28Bytes
Matrix: M11=0,0625, M12=0, M21=0, M22=0,0625, Dx=0, Dy=0
■ 006-EmfCreateBrushIndirect: At=320, TotalLength=24Bytes, DataLength=16Bytes
IhBrush: 1
LogBrush: Style=BS_SOLID, Color=#[00]FFFFFF, Hatch=HS_NONE_BECAUSE_SOLID
■ 007-EmfSelectObject: At=344, TotalLength=12Bytes, DataLength=4Bytes
Handle: 1 LogBrush: Style=BS_SOLID, Hatch=HS_NONE_BECAUSE_SOLID, Color=#[00]FFFFFF
■ 008-EmfSelectObject: At=356, TotalLength=12Bytes, DataLength=4Bytes
Stock object: NULL_PEN
■ 009-EmfPolygon16: At=368, TotalLength=48Bytes, DataLength=40Bytes
Bounds: Lft=0, Top=0, Rgt=695, Btm=354
Count: 5
Point0: X=0, Y=0
Point1: X=0, Y=5661
Point2: X=11117, Y=5661
Point3: X=11117, Y=0
Point4: X=0, Y=0
...
My special interest is the text presentation.
- Record 223 contains the EMF+ comment that is used by GDI+ to draw the text.
- Records 224 - 236 contain the actual EMF drawing commands that are ignored by GDI+.
◙ 223-EmfGdiComment: At=12012, TotalLength=244Bytes, DataLength=236Bytes
► EmfPlusCommentIdentifier: EMF+
● 00 EmfPlusCommentType: EmfPlusObject
Size: RecordSize=48, DataSize=36
Identity: Continued=False, ObjectID=17
- PlusObjectType: ObjectTypeFont/not serialized
FamilyName: ARIAL
EmSize: 35
UnitType: UnitTypeWorld
FontStyle: Regular
New CommentHandle: 17
● 01 EmfPlusCommentType: EmfPlusObject
Size: RecordSize=72, DataSize=60
Identity: Continued=False, ObjectID=18
- PlusObjectType: ObjectTypeStringFormat/not serialized
TabStopCount: 0
RangeCount: 0
StringAlignment: Center
LineAlign: Near
New CommentHandle: 18
● 02 EmfPlusCommentType: EmfPlusDrawString
Size: RecordSize=108, DataSize=96
Format: Color=#[00]000000, FormatID='18' (StringFormat: StringAlignment=Center, LineAlign=Near)
String: CharsN=33, Text='Produktentstehung unterstüt-¶zen'
Rectangle: X=1389,763, Y=575,8774, Wdt=500,4742, Hgt=89,24519'
■ 224-EmfSetBkMode: At=12256, TotalLength=12Bytes, DataLength=4Bytes
BkMode: TRANSPARENT
■ 225-EmfSetTextColor: At=12268, TotalLength=12Bytes, DataLength=4Bytes
Color: #[00]000000
■ 226-EmfSetTextAlign: At=12280, TotalLength=12Bytes, DataLength=4Bytes
TextAlign: TA_BASELINE
■ 227-EmfDeleteObject: At=12292, TotalLength=12Bytes, DataLength=4Bytes
Object: Handle=2, Type=LogFont: Name=Arial/Height=-9
Remaining object: Handle=1, Type=LogBrush: Style=BS_SOLID, Hatch=HS_NONE_BECAUSE_SOLID, Color=#[00]FFFFFF
■ 228-EmfExtCreateFontIndirect: At=12304/TotalLength=368Bytes/DataLength=360Bytes
Log font: Height=-10, Width=0, Escapement=0,
Orientation=0, Weight=400, Italic=0,
Underline=0, StrikeOut=0,
CharSet=0, OutPrecision=7,
ClipPrecision=0, Quality=0,
PitchAndFamily=0, Face-Name=Arial
■ 229-EmfSelectObject: At=12672, TotalLength=12Bytes, DataLength=4Bytes
Handle: 2 LogFont: Name=Arial/Height=-10
■ 230-EmfExtTextOutW: At=12684, TotalLength=244Bytes, DataLength=236Bytes
Bounds: Lft=429, Top=190, Rgt=556, Btm=202
GraphicsMode: GM_COMPATIBLE
Scale: Ex=31,25, Ey=31,25
Reference: X=429, Y=13107200
ClipBounds: Lft=0, Top=0, Rgt=-1, Btm=-1
Text: 'Produktentstehung unterstüt-'
■ 231-EmfSelectObject: At=12928, TotalLength=12Bytes, DataLength=4Bytes
Stock object: SYSTEM_FONT
■ 232-EmfSetBkMode: At=12940, TotalLength=12Bytes, DataLength=4Bytes
BkMode: TRANSPARENT
■ 233-EmfSetTextColor: At=12952, TotalLength=12Bytes, DataLength=4Bytes
Color: #[00]000000
■ 234-EmfSetTextAlign: At=12964, TotalLength=12Bytes, DataLength=4Bytes
TextAlign: TA_BASELINE
■ 235-EmfSelectObject: At=12976, TotalLength=12Bytes, DataLength=4Bytes
Handle: 2 LogFont: Name=Arial/Height=-10
■ 236-EmfExtTextOutW: At=12988, TotalLength=96Bytes, DataLength=88Bytes
Bounds: Lft=485, Top=202, Rgt=500, Btm=214
GraphicsMode: GM_COMPATIBLE
Scale: Ex=31,25, Ey=31,25
Reference: X=485, Y=13893632
ClipBounds: Lft=0, Top=0, Rgt=-1, Btm=-1
Text: 'zen'
Third finding
The text is drawn with a solid brush of color #[00]000000 (EmfGdiComment/EmfPlusDrawString as well as EmfSetTextColor).
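The actual EMF record of type EmfSetTextColor defines the color as a ColorRef Object, and the online documentation for the ColorRef Object describes a red, a green and a blue byte followed by a reserved byte - there is no alpha channel.
But the EMF+ comment record EmfPlusDrawString defines a BrushID, and the online documentation for BrushID says that, depending on a flag of the record, the value is either the index of a brush object or a color in ARGB format - including an alpha channel.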
Consequently a color of #[00]000000 is interpreted as
- BLACK by EmfSetTextColor (since the alpha channel is ignored) and
- TRANSPARENT BLACK by EmfPlusDrawString (since the alpha channel is evaluated).
To solve my problem I just have to ensure that colors are opaque.
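The fix itself is small. The sketch below is not the code from the download: EmfPlusDrawStringFix is a hypothetical name, and the offsets are my reading of the EMF+ documentation (12-byte record header, then the 4-byte BrushID, with the color stored as blue, green, red, alpha bytes). It sets the alpha byte of an EmfPlusDrawString record to 0xFF, but only if the flag bit 0x8000 indicates that BrushID holds a color rather than a brush object index.

```csharp
using System;

// Hypothetical helper (not part of the sample application).
public static class EmfPlusDrawStringFix
{
    private const ushort EmfPlusDrawString = 0x401C; // EMF+ record type
    private const ushort BrushIsColorFlag = 0x8000;  // set: BrushID is an ARGB color
    private const int HeaderLength = 12;             // type, flags, size, data size

    // data holds one or more EMF+ records; recordStart is the offset of an
    // EMF+ record header within data.
    // Returns true if the record was an EmfPlusDrawString with a color and
    // its alpha byte has been set to fully opaque.
    public static bool TryMakeTextColorOpaque(byte[] data, int recordStart)
    {
        if (data == null)
            throw new ArgumentNullException("data");
        if (recordStart < 0 || recordStart > data.Length - (HeaderLength + 4))
            throw new ArgumentOutOfRangeException("recordStart");

        // Little-endian 16-bit fields, read byte by byte.
        ushort type = (ushort)(data[recordStart] | (data[recordStart + 1] << 8));
        ushort flags = (ushort)(data[recordStart + 2] | (data[recordStart + 3] << 8));

        if (type != EmfPlusDrawString || (flags & BrushIsColorFlag) == 0)
            return false; // other record type, or BrushID is an object index

        // ARGB is stored as B, G, R, A, so alpha is the fourth byte of BrushID.
        data[recordStart + HeaderLength + 3] = 0xFF;
        return true;
    }
}
```

Applied to the color #[00]000000 from record 223, this yields #[FF]000000, which GDI+ draws as opaque black.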
Points of Interest
It takes a lot of patience to work your way to the crux of the matter, but it is possible to investigate all the details needed for a solution.
History
Initial version from 27. March 2019.