Microsoft Web Browser Automation using C#

16 Nov 2003
An article on axWebBrowser/MSHTML automation using Visual C#.

Sample Image - mshtml_automation.jpg

Introduction

The Microsoft Web Browser COM control adds browsing, document viewing, and downloading capabilities to your applications. Parsing and rendering of HTML documents in the WebBrowser control is handled by the MSHTML component, an Active Document that implements the Dynamic HTML (DHTML) Object Model and hosts ActiveX controls and script languages. The WebBrowser control merely acts as a container for the MSHTML component and implements navigation and related functions. MSHTML can be automated through IDispatch and IConnectionPointContainer-style automation interfaces, which enable a host to drive MSHTML through the object model.

Note

If you are not using the Visual Studio .NET IDE, use the Windows Forms ActiveX Control Importer (Aximp.exe) to convert the type definitions in a COM type library for an ActiveX control into a Windows Forms control. For instance, to generate the interop DLLs for the ActiveX browser component from the command line, run aximp ..\system32\shdocvw.dll, adjusting the path to match the location of your system32 directory. A form that uses the AxSHDocVw.AxWebBrowser class can then be compiled as follows: csc /r:SHDocVw.dll,AxSHDocVw.dll YourForm.cs.

Using the code

Simple Automation scenario:

Image 2

To automate this task, first add a Microsoft Web Browser object to an empty C# Windows application. In the Visual Studio .NET IDE, open the "Customize Toolbox..." context menu (on the Toolbox) and pick "Microsoft Web Browser" from the COM components list. This adds an "Explorer" control to the "General" section of the Toolbox.

C#
//
// Navigate to Google when the form loads.
//
private void Form1_Load(object sender, System.EventArgs e)
{
    object loc = "http://www.google.com/";
    // Placeholders for the optional Navigate2 arguments.
    object null_obj_str = "";
    System.Object null_obj = 0;
    this.axWebBrowser1.Navigate2(ref loc , ref null_obj, 
          ref null_obj, ref null_obj_str, ref null_obj_str);
}

Next open the solution explorer and add a reference to the Microsoft HTML Object Library (MSHTML) from the COM components list and implement the following code.

C#
// Requires the MSHTML reference; put this with the other using
// directives at the top of the source file.
using mshtml;

// Class-level field that tracks the current step. It keeps the
// DocumentComplete events raised by later navigations from re-running
// a step that has already executed.
private int Task = 1;

private void axWebBrowser1_DocumentComplete(object sender, 
         AxSHDocVw.DWebBrowserEvents2_DocumentCompleteEvent e)
{
    switch (Task)
    {
        case 1:

            // The control exposes the loaded page as an MSHTML document.
            HTMLDocument myDoc = (HTMLDocument) axWebBrowser1.Document;

            // A quick look at the Google HTML source reveals: 
            // <INPUT maxLength="256" size="55" name="q">
            //
            HTMLInputElement otxtSearchBox = 
               (HTMLInputElement) myDoc.all.item("q", 0);

            otxtSearchBox.value = "intel corp";

            // Google HTML source for the I'm Feeling Lucky button:
            // <INPUT type=submit value="I'm Feeling Lucky" name=btnI>
            //
            HTMLInputElement btnSearch = 
               (HTMLInputElement) myDoc.all.item("btnI", 0);
            btnSearch.click();

            Task++;
            break;

        case 2:

            // Continuation of automated tasks...
            break;
    }
}
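
The step counter in the handler above can also be factored into a small helper. The following is only a sketch of that idea, not the article's code: StepSequencer and its members are hypothetical names introduced here, and the sketch assumes a framework version that provides generics and the System.Action delegate (newer than the one this article was written against).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the article or of any library):
// runs one queued step per "document complete" notification.
public class StepSequencer
{
    private readonly List<Action> steps = new List<Action>();
    private int next = 0; // index of the step to run on the next notification

    // Queue a step; steps run in the order they were added.
    public void Add(Action step)
    {
        if (step == null) throw new ArgumentNullException("step");
        steps.Add(step);
    }

    // True once every queued step has run.
    public bool Finished
    {
        get { return next >= steps.Count; }
    }

    // Call once per DocumentComplete event. Runs the next pending step and
    // returns true, or returns false when nothing is left to run.
    public bool OnDocumentComplete()
    {
        if (Finished) return false;

        // Advance before invoking the step, so that a step which triggers
        // another notification cannot run itself a second time.
        Action step = steps[next];
        next++;
        step();
        return true;
    }
}
```

With a helper like this, the form would queue its steps once (for example in Form1_Load) and the DocumentComplete handler would reduce to a single call to OnDocumentComplete(). Unlike the switch statement, the counter is advanced before the step runs rather than after.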

References

MSDN

History

  • Version 1.0 - November 16th 2003 - Original Submission
  • Version 1.1 - November 17th 2003 - Modified axWebBrowser event

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


Written By
Kentdome LLC
United States
Biography in progress ;-)

Comments and Discussions

 
Cool article...
Kentamanos 17-Nov-03 10:18
Corrections
Brian Shifrin 17-Nov-03 4:54
Re: Corrections
Alexander Kent 17-Nov-03 7:24
Re: Corrections
Brian Shifrin 17-Nov-03 14:08
Re: Corrections
Alexander Kent 17-Nov-03 14:50
Re: Corrections
Frank Meffert 18-Nov-03 11:39
Re: Corrections
mjzalewski 19-Nov-03 12:42
Re: Corrections
Brian Shifrin 27-Nov-03 15:22
In the past I wrote a custom moniker that emulated http(s) about 98% of the way. From my experience emulating the real http protocol:

The browser can create one or many monikers. Calls to a custom moniker are not synchronized; in fact, ReadData could be initiated by "free threaded" callers on some versions of IE.... IE as a host is clueless about end-of-data; in fact, IE would ask the moniker 2 - 3 times whether EOD had been reached.

Page download complete is fired after 2 - 3 calls to urlMon return an error HRESULT. A pluggable protocol also reports all headers, including content-length (I do not know if IE uses that value; if it does, that would likely hardwire IE to http-style headers). Next, IE fires the Download_Complete event. Generally speaking, the DOM is not available at that time. After that, the browser may fire Document_Complete. IE6 has/had a bug where the first Document_Complete is not properly fired for the first navigation of an instance. IE uses a large number of threads to perform the download and create the DOM.

Again, my experience with an http proxy app and VC6 showed that you cannot get away with Download_Complete alone; in fact, even if both Download_Complete and Document_Complete have been received, your DOM may still be NULL....


Re: Corrections
rcsrinivas 22-Jan-04 10:50
Re: Corrections
Jasper4C# 11-Nov-04 3:52
