Embedding a WebBrowser control in a Windows form has become trivially simple. However, any app that does this will probably want to manipulate the content displayed by that WebBrowser. Perhaps the most basic manipulation is hooking into the DocumentComplete event; before you do anything with the page, you'll need to know that the page (the DOM) is ready for you to begin hacking on it.
DocumentComplete won't fire on all systems! Search the web for "DocumentComplete not firing"; it's a surprisingly widespread problem. The core reason is that the Microsoft.mshtml assembly isn't installed on all machines. Compounding the problem, Visual Studio does install the assembly (as do a few Office apps), so you'll never see the bug on a dev system. Besides not receiving DocumentComplete, without MSHTML you won't be able to do anything useful with the DOM, and you'll get exceptions should you try.
This article is my solution to the problem. I created a class that traps the absence of Microsoft.mshtml and installs it as needed, all without bothering the user who likely wouldn't know what MSHTML is, let alone know about gacutil.
This article requires familiarity with the WebBrowser control and the use of .NET assemblies.
It'll be very useful to have/find a machine that doesn't have MSHTML installed.
Using the code
The attached demo project outlines the technique. The demo implements a simple form with an embedded WebBrowser; upon loading, the browser is navigated to codeproject.com; when DocumentComplete fires, I show a message.
The important stuff is in MshtmlAssemblyHelper.cs, but first let me describe some items that will allow you to hook into that functionality.
Directing the CLR to use the correct version of MSHTML
First, you need to tell your app that you require MSHTML version 7.0.3300.0 or higher, which also covers the "Mshtml not installed" case. This is done via an entry in your app.config file:
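As a rough sketch, such an entry could look like the following. The assemblyIdentity attributes (in particular the public key token) and the exact oldVersion range are assumptions based on the commonly distributed Microsoft.mshtml primary interop assembly; check them against your own copy of the DLL before relying on them.

```xml
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <!-- Identity of the assembly the redirect applies to (assumed values). -->
        <assemblyIdentity name="Microsoft.mshtml"
                          publicKeyToken="b03f5f7f11d50a3a"
                          culture="neutral" />
        <!-- Requests for any version up to 7.0.3300.0 are bound to 7.0.3300.0. -->
        <bindingRedirect oldVersion="0.0.0.0-7.0.3300.0"
                         newVersion="7.0.3300.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
```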
The <runtime> block instructs the CLR to use MSHTML v7.0.3300.0, and if it can't find that version (or higher, or any MSHTML assembly), the CLR should ask my app to locate the assembly DLL. This last bit, assembly resolution, is a central trick of the demo and is discussed further below. The most relevant bits of the XML are the oldVersion and newVersion attributes. I set oldVersion to be anything lower than 7.0.3300.0 (essentially any version). newVersion is my target version: the version of my known-good MSHTML assembly that was installed with my copy of VS2008.
An additional item to note is that I do not have a reference to MSHTML in my project. With the demo, the CLR's desire to load MSHTML is purely driven by the entry in app.config. Of course, any app that wishes to touch the DOM in the WebBrowser would need to add a reference to MSHTML, but I specifically omitted that in the demo to highlight the role of app.config.
System.ResolveEventHandler is a delegate that the CLR will call when it needs your help in finding an assembly. Because of the entries in app.config, if the CLR fails to find a copy of MSHTML version >= 7.0.3300.0 installed in the GAC, it will invoke your ResolveEventHandler(s) to allow you to supply a valid assembly. Should you not supply a valid assembly, the CLR will throw an exception.
Hooking into MshtmlAssemblyHelper
Program.cs is boilerplate with one exception: it calls MshtmlAssemblyHelper.Initialize(). Initialize simply adds my method (MyResolveEventHandler) to the CLR's list of assembly-resolution callbacks.
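A minimal sketch of that kind of registration, assuming the standard AppDomain.AssemblyResolve event is the hook being used. The class name ResolveHookSketch and the Requests counter are hypothetical and exist only to make the behavior observable; the handler here deliberately returns null instead of loading anything.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for the article's MshtmlAssemblyHelper class.
public static class ResolveHookSketch
{
    // Counts how often the CLR asked for help (illustration only).
    public static int Requests;

    // Adds the handler to the current AppDomain's list of callbacks.
    // Call this once, early in Main(), before any code needs the assembly.
    public static void Initialize()
    {
        AppDomain.CurrentDomain.AssemblyResolve += MyResolveEventHandler;
    }

    private static Assembly MyResolveEventHandler(object sender, ResolveEventArgs args)
    {
        Requests++;
        // Returning null tells the CLR this handler cannot supply the assembly.
        return null;
    }
}
```

Because the handler is only consulted after the CLR's normal probing has failed, registering it costs nothing on machines where the assembly is already installed.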
MyResolveEventHandler simply checks that MSHTML is being requested, and if so, obtains the path of my known-good MSHTML assembly, loads it, and returns the assembly to the CLR:
static private Assembly MyResolveEventHandler(object sender, ResolveEventArgs args)
{
    // Ignore requests for any assembly other than MSHTML.
    AssemblyInfo info = new AssemblyInfo(args.Name);
    if (info.Name.ToLower() != MshtmlAssemblyName.ToLower())
        return null;
    // Copy and install the known-good assembly, then hand it to the CLR.
    string assemblyPath = MshtmlAssemblyHelper.Install();
    return Assembly.LoadFrom(assemblyPath);
}
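The handler above uses AssemblyInfo, which appears to be a helper from the demo project, to extract the simple name from args.Name. As an alternative sketch, the framework's System.Reflection.AssemblyName can do the same parsing. The class and method names below (AssemblyNameFilterSketch, IsMshtmlRequest) are hypothetical.

```csharp
using System;
using System.Reflection;

public static class AssemblyNameFilterSketch
{
    // Simple name of the MSHTML primary interop assembly.
    public const string MshtmlAssemblyName = "Microsoft.mshtml";

    // Returns true when the display name passed in ResolveEventArgs.Name
    // refers to MSHTML, regardless of version, culture, or letter case.
    public static bool IsMshtmlRequest(string requestedName)
    {
        string simpleName = new AssemblyName(requestedName).Name;
        return string.Equals(simpleName, MshtmlAssemblyName,
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

A handler could call IsMshtmlRequest(args.Name) and return null when it yields false, which leaves every other assembly to the CLR's normal resolution.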
The "special" stuff is performed by Install(). I previously mentioned "my known-good MSHTML assembly", and I now need to expand on that. Because the most common reason for MyResolveEventHandler() getting called is the absence of any copies of MSHTML on the system, in order to load a copy, you must first install it on the local system. My (low-tech) solution was to place a copy of my MSHTML assembly (search your C drive for Microsoft.mshtml.dll) in a network folder and have MshtmlAssemblyHelper copy the file from that directory to the local system, then install it via gacutil. That final step, installation with gacutil, is simply a convenience. You can load an uninstalled copy of MSHTML and the CLR will not complain. The benefit of installing it is that the next time your app runs, the CLR will find MSHTML in the GAC and thus won't need to call your ResolveEventHandler.
Note that I use a mapped network drive to retrieve the installable DLL. My original app is an in-house tool; thus, using a net-drive-based retrieval mechanism was the quick and obvious choice. If your app were to be deployed outside your net, you'd want to implement some sort of Internet (e.g., HTTP) based DLL retrieval mechanism. I purposely omitted that functionality here as I wanted to focus on ResolveEventHandler rather than the network stuff.
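The copy-then-install idea can be sketched roughly as follows. This is not the demo's Install() implementation: the class name InstallSketch, both method names, and all directory arguments are hypothetical, and the location of gacutil.exe is left to the caller. The only tool-specific detail used is gacutil's /i switch, and installing into the GAC normally requires administrative rights.

```csharp
using System.ComponentModel;
using System.Diagnostics;
using System.IO;

public static class InstallSketch
{
    // Copies fileName from sourceDir (e.g. a mapped network drive) into
    // localDir and returns the full path of the local copy.
    public static string CopyLocally(string sourceDir, string localDir, string fileName)
    {
        Directory.CreateDirectory(localDir);
        string source = Path.Combine(sourceDir, fileName);
        string target = Path.Combine(localDir, fileName);
        File.Copy(source, target, true); // overwrite any stale copy
        return target;
    }

    // Best-effort GAC install. Returns false if gacutil cannot be started
    // or exits with a non-zero code; the caller can still load the local copy.
    public static bool TryInstallInGac(string gacutilPath, string assemblyPath)
    {
        try
        {
            var startInfo = new ProcessStartInfo(gacutilPath, "/i \"" + assemblyPath + "\"")
            {
                UseShellExecute = false,
                CreateNoWindow = true
            };
            using (Process process = Process.Start(startInfo))
            {
                if (process == null) return false;
                process.WaitForExit();
                return process.ExitCode == 0;
            }
        }
        catch (Win32Exception)
        {
            return false; // gacutil not found or not executable
        }
    }
}
```

Keeping the two steps separate mirrors the point made above: the copy is required before the assembly can be loaded, while the GAC install is an optimization that is allowed to fail.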
Testing the code
As I mentioned above, VS2008 installs MSHTML, thus MyResolveEventHandler will never be invoked on a dev system. You can, however, force the CLR to call MyResolveEventHandler on your system by hacking the app.config entry:
<bindingRedirect oldVersion="220.127.116.11-7.9.9999.9" newVersion="18.104.22.168" />
This causes the CLR to consider your valid MSHTML assembly to be downrev, thus forcing MyResolveEventHandler to be invoked. You'll then return MSHTML v7.0.3300.0 and the CLR won't like that (it's downrev), but it does allow you to debug your ResolveEventHandler. A copy of the above test bindingRedirect is in the demo's app.config, commented out.