Part of my current project requires the ability to edit custom HTML documents. I'm using MSHTML (the core component of Microsoft Internet Explorer) in edit mode via
. Whilst it has its problems, MSHTML has a couple of overwhelming advantages: it's free, and it can be assumed to be present on almost every Windows computer you'll ever encounter. It's also not terribly difficult to work with.
The documents I'm working with contain images and
LABEL controls, all absolutely positioned. Editing consists pretty much of replacing boilerplate text and images and moving them around on the screen. A nice-to-have is snap-to-grid, so everything can be easily lined up, along with a visual representation of the grid.
A search on MSDN using 'snap to grid' located a match for MSHTML in code downloads (URL not included because they change) and a sample file called
which is a self-extracting executable. The sample includes the source code for an MSHTML host which implements exactly the functionality I wanted. The sample is, however, a non-MFC C++ application and the stuff of interest is wrapped up in a bunch of ATL classes.
Because my application is an MFC SDI application using the document/view architecture I decided to reimplement their solution using MFC. I chose not to try and use their ATL classes as presented because of some requirements imposed by the MFC document/view architecture. This didn't, however, stop me nicking some of their code :)
The functionality I wanted to implement falls into two pieces. The first is snap-to-grid. This basically requires the ability to intercept attempts to move or resize an element on the screen, receive the coordinates and modify those coordinates to enforce 'snap' granularity. The second piece is the ability to draw a grid on the MSHTML display surface.
Let's consider each of these pieces in turn.
MSHTML introduced the IHTMLEditHost COM interface in version 5.5 specifically to support snap-to-grid. You won't find an implementation of this interface in the standard libraries that ship with the Platform SDK; it's one you're expected to implement. In addition to the standard 3 methods in IUnknown (I won't mention the standard 3 methods anymore) it has one method, SnapRect(), which MSHTML calls whenever you try to move or resize anything in the document you're currently editing. The parameters are an IHTMLElement interface for the selected element, the new rectangle (screen coordinates) for the element and a parameter specifying which drag handle is being used.
The element interface is useful because you can use it to query HTML attributes on the selected object. For example, you might want to be able to specify that a particular object is locked in place. You can set an attribute on the object specifying this. The
SnapRect method could query the attribute and, if it was set, force the element back to its original location and size. I use this functionality in my application but the code isn't presented here because I don't want to get bogged down in a bunch of support code.
The new rectangle is a pointer to a
RECT containing the new coordinates of the object. If you change the coordinates inside your
SnapRect() method the object moves to match your changes.
The drag handle is used to decide which, if any, coordinates in the
RECT should be altered to force 'snap-to-grid'.
I said above that you are expected to implement
IHTMLEditHost, which raises the question of how MSHTML gets the interface you've implemented. It gets the interface in a two-step process. The first step is to request an
IServiceProvider interface on the host (your application). If it gets that interface it then requests an
IHTMLEditHost interface by calling IServiceProvider::QueryService().
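The two-step lookup described above can be mocked in portable C++. This is only a sketch of the order of the requests: `Host`, `ServiceProvider`, `EditHost`, `SnappingHost`, `PlainHost` and `FindEditHost` are all hypothetical names with no COM machinery behind them, and the service is identified by a plain string rather than a GUID.

```cpp
#include <cassert>
#include <string>

// Hypothetical, COM-free stand-in for IHTMLEditHost.
struct EditHost {
    virtual ~EditHost() = default;
    virtual void SnapRect(long &left, long &top) = 0;
};

// Hypothetical stand-in for IServiceProvider.
struct ServiceProvider {
    virtual ~ServiceProvider() = default;
    // Returns nullptr when the requested service is not offered.
    virtual EditHost *QueryService(const std::string &service) = 0;
};

// Hypothetical stand-in for the application that hosts the editor.
struct Host {
    virtual ~Host() = default;
    // Step 1: the editor asks the host for a service provider (may be null).
    virtual ServiceProvider *GetServiceProvider() = 0;
};

// Step 2: ask the provider for the edit host. Either step may fail, in
// which case the editor simply carries on without snapping.
inline EditHost *FindEditHost(Host &host)
{
    ServiceProvider *sp = host.GetServiceProvider();
    return sp ? sp->QueryService("HTMLEditHost") : nullptr;
}

// A host that offers the service and snaps to 8-pixel boundaries.
struct SnappingHost : Host, ServiceProvider, EditHost {
    ServiceProvider *GetServiceProvider() override { return this; }
    EditHost *QueryService(const std::string &service) override
    {
        if (service == "HTMLEditHost")
            return this;
        return nullptr;
    }
    void SnapRect(long &left, long &top) override
    {
        left = ((left + 4) / 8) * 8;
        top  = ((top + 4) / 8) * 8;
    }
};

// A host that offers nothing; the lookup fails at step 1.
struct PlainHost : Host {
    ServiceProvider *GetServiceProvider() override { return nullptr; }
};
```

With a `SnappingHost` the lookup yields an object whose `SnapRect()` gets called; with a `PlainHost` it yields nothing and no snapping takes place, which mirrors how MSHTML tolerates either request failing.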
Telling MSHTML to use your IHTMLEditHost implementation
Glossing over a lot of detail, the
classes implement a custom OLE control site. Both classes host MSHTML and provide a 'parent' interface that MSHTML can use to query for interfaces of interest. The stock class that handles the custom OLE control site is called CHtmlControlSite. The class only implements the IDocHostUIHandler interface, which MSHTML uses to determine whether it should display scrollbars and suchlike. (See my article here for some discussion of
We can't derive a new class from
CHtmlControlSite for two reasons. The first is that the class definition isn't in a header file, it's in
viewhtml.cpp. More importantly, if we try to derive a class from
CHtmlControlSite the MFC COM Interface Macros will bite us in the bum. Our only recourse is to reimplement the class, which we do as class CHTMLEditControlSite.
In the following code I'm not presenting the entirety of the class definition. We're going to take it piece by piece. The entire class definition is, naturally, included in the download. Our replacement
CHTMLEditControlSite class starts out looking like this.
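class CHTMLEditControlSite : public COleControlSite
{
public:
    CHtmlView *GetView() const;
    BEGIN_INTERFACE_PART(DocHostUIHandler, IDocHostUIHandler)
        STDMETHOD(ShowContextMenu)(DWORD, LPPOINT, LPUNKNOWN, LPDISPATCH);
        STDMETHOD(ShowUI)(DWORD, LPOLEINPLACEACTIVEOBJECT, LPOLECOMMANDTARGET,
                          LPOLEINPLACEFRAME, LPOLEINPLACEUIWINDOW);
        STDMETHOD(ResizeBorder)(LPCRECT, LPOLEINPLACEUIWINDOW, BOOL);
        STDMETHOD(TranslateAccelerator)(LPMSG, const GUID*, DWORD);
        STDMETHOD(GetOptionKeyPath)(OLECHAR **, DWORD);
        STDMETHOD(TranslateUrl)(DWORD, OLECHAR*, OLECHAR **);
        STDMETHOD(FilterDataObject)(LPDATAOBJECT, LPDATAOBJECT*);
        // ... the remaining IDocHostUIHandler methods are not shown here
    END_INTERFACE_PART(DocHostUIHandler)
    BEGIN_INTERFACE_PART(ServiceProvider, IServiceProvider)
        STDMETHOD(QueryService)(REFGUID, REFIID, void **);
    END_INTERFACE_PART(ServiceProvider)
    BEGIN_INTERFACE_PART(HTMLEditHost, IHTMLEditHost)
        STDMETHOD(SnapRect)(IHTMLElement *pIElement, RECT *prcNew, ELEMENT_CORNER eHandle);
    END_INTERFACE_PART(HTMLEditHost)
};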
The IDocHostUIHandler stuff is a literal copy from the MFC implementation of CHtmlControlSite. It pretty much delegates everything to virtual functions on the view class, which is derived directly or indirectly from CHtmlView.
ServiceProvider is our implementation of the
IServiceProvider interface. Recall that MSHTML calls this interface asking for an
IHTMLEditHost interface. It won't mind in the least if an attempt to get an
IServiceProvider interface or an
IHTMLEditHost interface fails but if the attempt to get an
IHTMLEditHost interface succeeds MSHTML will call
IHTMLEditHost::SnapRect() as appropriate.
Our implementation of
ServiceProvider::QueryService() looks like this.
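STDMETHODIMP CHTMLEditControlSite::XServiceProvider::QueryService(REFGUID guidService,
                                                    REFIID riid, void **ppObj)
{
    METHOD_PROLOGUE(CHTMLEditControlSite, ServiceProvider)   // gives us pThis
    HRESULT hr = E_NOINTERFACE;
    *ppObj = NULL;
    if (guidService == SID_SHTMLEditHost && riid == IID_IHTMLEditHost)
    {
        *ppObj = (void *) &pThis->m_xHTMLEditHost;   // our IHTMLEditHost part
        hr = S_OK;
    }
    return hr;
}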
This checks if the service being requested is the MSHTML IHTMLEditHost interface. If so it returns a pointer to our IHTMLEditHost implementation, whose declaration was shown above. The constructor initialises snapping to 8 pixel boundaries. The real guts of the interface is in our implementation of SnapRect(), which looks like this.
STDMETHODIMP CHTMLEditControlSite::XHTMLEditHost::SnapRect(
    IHTMLElement * ,
    RECT * prcNew,
    ELEMENT_CORNER eHandle)
{
    // Control key held down: skip snapping. The high-order bit (0x8000) of
    // the SHORT returned by GetAsyncKeyState() is set while the key is down.
    if (GetAsyncKeyState(VK_CONTROL) & 0x8000)
        return S_OK;
    LONG lWidth = prcNew->right - prcNew->left;
    LONG lHeight = prcNew->bottom - prcNew->top;
    // Snap the top-left corner to the grid, keeping the element's size.
    prcNew->top = ((prcNew->top + (m_iSnap / 2)) / m_iSnap) * m_iSnap;
    prcNew->left = ((prcNew->left + (m_iSnap / 2)) / m_iSnap) * m_iSnap;
    prcNew->bottom = prcNew->top + lHeight;
    prcNew->right = prcNew->left + lWidth;
    return S_OK;
}
This does the appropriate arithmetic to force the prcNew rectangle onto snap boundaries depending on which resize handle was selected. The GetAsyncKeyState(VK_CONTROL) & 0x8000 test checks whether the control key (either one of them) is down. If so it exits immediately, allowing the user to override our snap-to-grid functionality by holding down the control key as they drag the object around on the drawing surface.
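To make the handle-dependent arithmetic concrete, here is a portable sketch. It is an assumption about how such a switch might look, not the code from the download: `Rect`, `Handle`, `SnapValue` and `SnapToGrid` are hypothetical stand-ins for RECT, ELEMENT_CORNER and the body of SnapRect(), and non-negative coordinates are assumed.

```cpp
#include <cassert>

// Hypothetical stand-ins so the sketch compiles anywhere; the real
// SnapRect() receives a RECT and an ELEMENT_CORNER from MSHTML.
struct Rect { long left, top, right, bottom; };
enum class Handle { None, Top, Left, Bottom, Right,
                    TopLeft, TopRight, BottomLeft, BottomRight };

// Round v to the nearest multiple of snap (non-negative v assumed).
inline long SnapValue(long v, long snap)
{
    assert(snap > 0);
    return ((v + snap / 2) / snap) * snap;
}

// Snap only the edges that the given handle actually moves.
inline void SnapToGrid(Rect &rc, Handle h, long snap)
{
    if (h == Handle::None) {                 // the whole element is being moved
        const long width  = rc.right - rc.left;
        const long height = rc.bottom - rc.top;
        rc.left   = SnapValue(rc.left, snap);
        rc.top    = SnapValue(rc.top, snap);
        rc.right  = rc.left + width;         // size is preserved
        rc.bottom = rc.top + height;
        return;
    }
    const bool top    = h == Handle::Top    || h == Handle::TopLeft    || h == Handle::TopRight;
    const bool bottom = h == Handle::Bottom || h == Handle::BottomLeft || h == Handle::BottomRight;
    const bool left   = h == Handle::Left   || h == Handle::TopLeft    || h == Handle::BottomLeft;
    const bool right  = h == Handle::Right  || h == Handle::TopRight   || h == Handle::BottomRight;
    if (top)    rc.top    = SnapValue(rc.top, snap);
    if (bottom) rc.bottom = SnapValue(rc.bottom, snap);
    if (left)   rc.left   = SnapValue(rc.left, snap);
    if (right)  rc.right  = SnapValue(rc.right, snap);
}
```

Moving an element keeps its size and snaps only its top-left corner, whereas dragging a handle snaps just the edges attached to that handle and leaves the opposite edges where they were.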
Drawing the grid
This is a little more difficult. MSHTML doesn't arbitrarily ask us for an interface this time around; instead we have to register an element behavio(u)r with MSHTML at the appropriate time and then request or supply the necessary interfaces. The appropriate time is, of course, when our document has been loaded.
Of course, whilst you can grab a device context handle to any MSHTML window and paint on it, what you really want is to paint the grid before MSHTML renders the rest of its display. Does the end user really want your gridlines drawn on top of their content? Achieving the correct painting order requires a bit of dancing with MSHTML.
Element behaviors were added to Microsoft Internet Explorer version 5. They provide a 'hook' which can be used to modify the way a particular element behaves within an HTML page. The behavior can be many things but the one we're interested in is how the element is rendered. You can specify an element behavior on any element within the page as long as that element can return an IHTMLElement2 interface. We register an element behavior by creating an object that implements the IElementBehaviorFactory interface and passing its address to the IHTMLElement2::addBehavior() function. MSHTML then, as it renders the document, calls the behavior factory passing a bunch of parameters specifying exactly which behavior it wants for this particular element, as specified by that element, and only for elements that have behaviors attached in HTML or those which have had addBehavior() called on them. The element behavior factory then returns an IElementBehavior interface to an object that implements the behavior.
Given that we want to draw a grid on the background of the entire document an obvious starting place is with the document interface itself,
IHTMLDocument2. Unfortunately this doesn't work because the document itself isn't an element, it's an element container. We need to go down one level and get an interface to the body of the document. Even though it's the body it's still an
IHTMLElement2 interface, meaning we could go even deeper and draw a grid on a single element on the page if we wanted to.
Once we've got a pointer to the body element of the document we add our behavior factory to it. Sometime later MSHTML calls our behavior factory requesting an
IElementBehavior interface. We dutifully return one. MSHTML then calls the
IElementBehavior::Init() function on our element behavior object, passing a pointer to an
IElementBehaviorSite interface. Our application then calls
QueryInterface() on the
IElementBehaviorSite requesting an
IHTMLPaintSite interface. Once we get the
IHTMLPaintSite interface we invalidate the rectangle it represents which, since we requested it on the body of our HTML document, means we're invalidating the entire MSHTML display surface. MSHTML obliges by repainting the display surface and, in the process, requests an
IHTMLPainter interface and calls its
Draw() method, which is where we draw the grid. Phew!
Maybe a diagram will help
Arrow endpoints indicate the destination of the interface.
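The order of calls described above can also be mocked in a few lines of portable C++. Everything here is hypothetical: `PaintSite`, `GridBehavior`, `BehaviorFactory` and `RunHandshake` are plain stand-ins that only record who calls whom, and the mock leaves out the QueryInterface() and GetPainterInfo() steps.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical mocks that record the order of calls; these are not the
// real MSHTML interfaces.
using Log = std::vector<std::string>;

struct PaintSite {                        // stands in for IHTMLPaintSite
    Log &log;
    bool dirty = false;
    explicit PaintSite(Log &l) : log(l) {}
    void InvalidateRect() { log.push_back("PaintSite::InvalidateRect"); dirty = true; }
};

struct GridBehavior {                     // stands in for IElementBehavior + IHTMLPainter
    Log &log;
    PaintSite *site = nullptr;
    explicit GridBehavior(Log &l) : log(l) {}
    void Init(PaintSite &s)
    {
        log.push_back("Behavior::Init");
        site = &s;
        site->InvalidateRect();           // ask for a repaint of the whole element
    }
    void Draw() { log.push_back("Painter::Draw"); }
};

struct BehaviorFactory {                  // stands in for IElementBehaviorFactory
    Log &log;
    GridBehavior behavior;
    explicit BehaviorFactory(Log &l) : log(l), behavior(l) {}
    GridBehavior &FindBehavior() { log.push_back("Factory::FindBehavior"); return behavior; }
};

// Plays the part of MSHTML once the factory has been added to the body.
inline Log RunHandshake()
{
    Log log;
    BehaviorFactory factory(log);
    PaintSite site(log);
    GridBehavior &behavior = factory.FindBehavior();   // 1. ask the factory
    behavior.Init(site);                               // 2. hand over the site
    if (site.dirty)
        behavior.Draw();                               // 3. the repaint reaches the painter
    return log;
}
```

Running `RunHandshake()` yields the sequence factory, Init, invalidate, Draw, which is the same order the paragraph above walks through.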
The Grid code
I won't clutter this article with repeated blocks of BEGIN_INTERFACE_PART and END_INTERFACE_PART macros. I'll assume you understand how the MFC COM Interface Macros work and continue with the code of interest. Let's look first at the code that initiates the entire process, the code that installs the grid handler. This is part of the outer class,
CHTMLEditControlSite and it's called by our application.
void CHTMLEditControlSite::InstallGrid(IHTMLDocument2 *pDoc)
{
    HRESULT       hr;
    IHTMLElement  *pBody = NULL;
    IHTMLElement2 *pBody2 = NULL;
    VARIANT       vFactory;
    VARIANT_BOOL  dummy;
    if (pDoc == (IHTMLDocument2 *) NULL)
        return;
    hr = pDoc->get_body(&pBody);
    if (pBody == (IHTMLElement *) NULL)
        return;
    hr = pBody->QueryInterface(IID_IHTMLElement2, (void **) &pBody2);
    pBody->Release();                   // finished with the IHTMLElement
    if (pBody2 == (IHTMLElement2 *) NULL)
        return;
    if (m_gridCookie != 0)              // remove any previously installed grid
    {
        hr = pBody2->removeBehavior(m_gridCookie, &dummy);
        m_gridCookie = 0;
    }
    V_VT(&vFactory) = VT_UNKNOWN;
    V_UNKNOWN(&vFactory) = &m_xHTMLElementBehaviorFactory;
    hr = pBody2->addBehavior(NULL, &vFactory, &m_gridCookie);
    pBody2->Release();
}
This starts out by obtaining an IHTMLElement interface to the body of the document. Once we've got that we get an IHTMLElement2 interface. When we've got our IHTMLElement2 interface we call addBehavior() on it, passing a pointer to our element behavior factory. addBehavior() returns us a cookie which we'll need later to remove the behavior.
Not much of interest happens until MSHTML has requested a bunch of other interfaces from us. Our behavior factory is called and we return a pointer to our
IElementBehavior interface. MSHTML then calls our
IElementBehavior::Init() method which looks like this.
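    // Body of IElementBehavior::Init(IElementBehaviorSite *pBehaviorSite)
    HRESULT hr = pBehaviorSite->QueryInterface(IID_IHTMLPaintSite,
                                               (void **) &m_spPaintSite);
    if (m_spPaintSite != (IHTMLPaintSite *) NULL)
        m_spPaintSite->InvalidateRect(NULL);   // force a repaint of the whole element
    return hr;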
The method receives an IElementBehaviorSite interface pointer. Not by coincidence this represents the body object in the document (it's the body because we used the body interface when we registered the behavior factory). We get a pointer to an IHTMLPaintSite interface through the behavior site interface. Once we've got that we can invalidate the display surface and force repaints whenever we want.
Meantime MSHTML queries us for an
IHTMLPainter interface. One thing it needs to know is our Z-order. Should we be called first so MSHTML can paint stuff over what we've painted, or do we get last crack at the display surface? So MSHTML calls our
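IHTMLPainter::GetPainterInfo() method, which looks like this.
    // Body of IHTMLPainter::GetPainterInfo(HTML_PAINTER_INFO *pInfo)
    if (pInfo == NULL)
        return E_POINTER;
    pInfo->lFlags = HTMLPAINTER_TRANSPARENT;
    pInfo->lZOrder = HTMLPAINT_ZORDER_BELOW_CONTENT;   // paint before the content
    memset(&pInfo->iidDrawObject, 0, sizeof(IID));
    pInfo->rcExpand.left = 0;
    pInfo->rcExpand.right = 0;
    pInfo->rcExpand.top = 0;
    pInfo->rcExpand.bottom = 0;
    return S_OK;
This tells MSHTML to call us first, so that it paints the content over what we've painted.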
Now it's drawing time. This code is trivial. MSHTML calls our
IHTMLPainter::Draw() method giving us the device context we should paint onto. All we do is draw our grid.
    // Body of IHTMLPainter::Draw(RECT rcBounds, RECT rcUpdate, LONG lDrawFlags,
    //                            HDC hdc, LPVOID pvDrawObject)
    if (m_bGrid != FALSE)
    {
        HPEN redPen = (HPEN) CreatePen(PS_DOT, 0, RGB(0xff, 0x99, 0x99));
        HPEN oldPen = (HPEN) SelectObject(hdc, redPen);
        long i;
        for (i = rcBounds.left + m_iGrid; i <= rcBounds.right; i += m_iGrid)
        {                                   // vertical lines
            MoveToEx(hdc, i, rcBounds.top, NULL);
            LineTo(hdc, i, rcBounds.bottom);
        }
        for (i = rcBounds.top + m_iGrid; i <= rcBounds.bottom; i += m_iGrid)
        {                                   // horizontal lines
            MoveToEx(hdc, rcBounds.left, i, NULL);
            LineTo(hdc, rcBounds.right, i);
        }
        SelectObject(hdc, oldPen);          // restore the old pen and free ours
        DeleteObject(redPen);
    }
    return S_OK;
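The loop bounds in the drawing code are easy to check in isolation. The sketch below is a portable illustration only; `GridLinePositions` is a hypothetical helper, not part of the download, and it reproduces the same start and stop conditions without any GDI calls.

```cpp
#include <cassert>
#include <vector>

// Positions of the grid lines between first (exclusive) and last
// (inclusive), using the same bounds as the loops in Draw(): start one
// cell in from the near edge and stop once past the far edge.
inline std::vector<long> GridLinePositions(long first, long last, long grid)
{
    assert(grid > 0);
    std::vector<long> lines;
    for (long p = first + grid; p <= last; p += grid)
        lines.push_back(p);
    return lines;
}
```

Note that the lines are spaced from the near edge of the bounds, not from coordinate zero. The drawn grid therefore coincides with the snap positions only if that edge sits on a snap boundary and the grid spacing matches the snap size.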
Using the code
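In your CHtmlView derived view class header you need to add this function prototype. It's not documented in MSDN but fortunately it's a virtual function.
    virtual BOOL CreateControlSite(COleControlContainer* pContainer,
                                   COleControlSite** ppSite, UINT nID, REFCLSID clsid);
and add a CHTMLEditControlSite * member variable to the data declarations in the header. I call it m_pEditSite. Add this function to your view implementation file.
    // CYourView is a placeholder for the name of your own view class.
    BOOL CYourView::CreateControlSite(COleControlContainer* pContainer,
                                      COleControlSite** ppSite, UINT , REFCLSID )
    {
        ASSERT(ppSite != NULL);
        *ppSite = m_pEditSite = new CHTMLEditControlSite(pContainer);
        return TRUE;
    }
At an appropriate place in your view class (once the document has finished loading) add a call to CHTMLEditControlSite::InstallGrid().
    m_pDoc = (IHTMLDocument2 *) GetHtmlDocument();
    m_pEditSite->InstallGrid(m_pDoc);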
Once you've got the grid installed for this document you can toggle it off and on by calling
FALSE. If you navigate to another document you must call
CHTMLEditControlSite::InstallGrid() again to reinstall the grid handler.
AddRef() and Release() notes
If you look through the source code you may notice that in some places I handle AddRef() and Release() correctly whilst in other places I don't. This isn't laziness or lack of knowledge on my part. The fact is that, unless I've seriously misunderstood COM reference counting, MSHTML doesn't seem to fully follow the reference counting rules. Sometimes it calls
Release() on the interface pointers we gave it, sometimes it doesn't. Our
CHTMLEditControlSite class is derived from
CCmdTarget which implements reference counting on our behalf. As I discussed in my previous article MFC COM Interface Macros, the
CCmdTarget destructor asserts (in debug builds) that the reference counter is less than or equal to 1. I found out by trial and error which interfaces ought to implement correct reference counting and which ones oughtn't. As it happens it doesn't matter that we're not correctly implementing COM reference counting given that our class is embedded as a member of the view class and won't go away until the view class goes away. One could argue that in this instance no interface within our class need implement correct reference counting but I prefer to do it the correct way whenever I can.
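For readers who want the counting rule spelled out, here is a minimal, COM-free sketch. `Counted`, `GoodClient` and `LeakyClient` are hypothetical names; the object is assumed to be embedded in its owner, so it never deletes itself when the count drops.

```cpp
#include <cassert>

// Hypothetical minimal object following the COM counting convention:
// whoever keeps a pointer calls AddRef(), whoever is done calls Release().
class Counted {
public:
    unsigned long AddRef()  { return ++m_refs; }
    unsigned long Release() { assert(m_refs > 0); return --m_refs; }
    unsigned long RefCount() const { return m_refs; }
private:
    unsigned long m_refs = 1;   // the owner's own reference
};

// A well-behaved client balances every AddRef() with a Release().
inline void GoodClient(Counted &obj)
{
    obj.AddRef();
    // ... use the object ...
    obj.Release();
}

// A client that never releases leaves the count one too high.
inline void LeakyClient(Counted &obj)
{
    obj.AddRef();
}
```

After a balanced client the count is back at 1, which is the state a check like the one in the CCmdTarget destructor expects. After an unbalanced client it stays at 2, and that is the kind of mismatch that shows up as an assertion in a debug build.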
Notes on the demo program
The demo uses a hardcoded reference to an HTML page which, in turn, has a hardcoded reference to the image file. You may need to tweak either the HTML page reference or the image reference within the HTML page.
25 April 2004 - Initial version.