First Posted 4 May 2004

Internet web macros in C#

By JEHAN Sebastien | 4 May 2004 | Article
Write web macro agents with plugin libraries for data processing

Introduction

Sometimes your application needs to retrieve information from the web, or submit information to it, but you do not want to write a full library for that. You would rather focus on your specific need and assume the page is already available as HTML or as an mshtml DOM.

KUMO was built for this. Your macros can call your own objects, defined as plugins, and you can export a web macro as a DLL or an .EXE.

Background

In 1998 Compaq introduced WebL, a language for automating actions on the web (http://research.compaq.com/SRC/WebL/). The project stopped, and a few commercial products and Java frameworks now provide web automation. Unfortunately, nothing serious has appeared for .NET.

Using the code

The code is based on the KUMO web macro methodology: a web macro is written in modified C#. The modification is the ## instruction. A line marked with ## means that the macro must wait until the browser has finished its other work before moving on. A ## instruction also does not need to declare the type of its return value.

The web macro uses three objects: SPBrowser, SPBrowserObject and SPBrowserCollection. SPBrowser represents the current browser, SPBrowserObject wraps an mshtml.IHTMLElement object, and SPBrowserCollection is an array of SPBrowserObject.
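The waiting behaviour of a ## instruction can be pictured with a small sketch in plain C#. It does not use KUMO at all: FakeBrowser, BeginNavigate, WaitUntilIdle and NavigateAndWait are hypothetical names invented for this illustration, and how KUMO really translates ## lines is not documented in this article.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a browser; NOT part of KUMO.
public class FakeBrowser
{
    // Signalled while the browser has no pending work.
    private readonly ManualResetEvent idle = new ManualResetEvent(true);
    private string currentUrl = "";

    public string CurrentUrl { get { return currentUrl; } }

    // Starts a simulated navigation on a background thread and returns at once.
    public void BeginNavigate(string url)
    {
        idle.Reset();
        Thread worker = new Thread(() =>
        {
            Thread.Sleep(50);   // simulated page load
            currentUrl = url;
            idle.Set();         // the browser is idle again
        });
        worker.Start();
    }

    // Blocks the caller until the pending work has finished.
    public void WaitUntilIdle()
    {
        idle.WaitOne();
    }
}

public static class WaitSketch
{
    // Assumed reading of a line such as "## browser.goToURL(url);":
    // issue the call, then wait for the browser before the macro continues.
    public static string NavigateAndWait(FakeBrowser browser, string url)
    {
        browser.BeginNavigate(url);
        browser.WaitUntilIdle();
        return browser.CurrentUrl;
    }
}
```

Without the call to WaitUntilIdle, the next statement could run while the page is still loading, which is the situation the ## marker is meant to avoid.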

If you write your own .NET DLL that implements the KUMOFrwk.Plugin.IPlugin interface and put it in the /Plugins directory under the KUMO installation folder, you can add custom methods to the three objects SPBrowser, SPBrowserObject and SPBrowserCollection. The methods then show up in the KUMO editor, whose AutoComplete feature recognizes plugins.
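As a rough idea of what such a plugin class could look like, here is a minimal sketch. The only member of the real interface shown in this article is doFunction, so the sketch declares its own stand-in interface, IPluginSketch, instead of referencing KUMOFrwk.Plugin.IPlugin. IPluginSketch and TextLengthPlugin are hypothetical names, and the real interface may require additional members.

```csharp
using System;

// Hypothetical stand-in for KUMOFrwk.Plugin.IPlugin.
// Only doFunction is known from the article; the real interface may declare more.
public interface IPluginSketch
{
    object doFunction(params object[] allparameters);
}

// Example plugin: returns the number of characters of its first argument.
public class TextLengthPlugin : IPluginSketch
{
    public object doFunction(params object[] allparameters)
    {
        // Guard against a missing or null first argument.
        if (allparameters == null || allparameters.Length == 0 || allparameters[0] == null)
        {
            return 0;
        }
        return allparameters[0].ToString().Length;
    }
}
```

Because doFunction takes a params object[] and returns object, the caller has to cast the result, for example (int)new TextLengthPlugin().doFunction("abc").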

As a simple example, here is a macro that calls the getEmails() function of the ContactPlugin plugin, which I describe below:

//
// Navigate to Google advertisement page
## browser.goToURL("http://www.google.com/jobs/eng.html"); 

## emails = browser.getEmails(); 
if (emails.Length>0) 
{ 
    MessageBox.Show(emails[0]); 
}

The plugin source code is included in the source code download. The important part is the doFunction function, which KUMO calls. The version shown here scans every element of the current web page and keeps those whose text looks like an email address. There are several ways to make this function faster, but that is not the point of this article.

// Needs: using System.Collections; using System.Text.RegularExpressions;
public object doFunction(params object[] allparameters) 
{ 
    // None of the input parameters is needed here. 
    string[] allEmails; 
    // The pattern is anchored with ^ and $, so an element matches only
    // when its whole text is an email address.
    string strRegex = @"^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}" 
      + @"\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\" + 
      @".)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$"; 
    Regex emailReg = new Regex(strRegex); 
    // doc and localBrowser are presumably members of the plugin class.
    doc = (mshtml.HTMLDocument)localBrowser.Document; 
    mshtml.IHTMLElementCollection allTags = doc.all; 
    System.Collections.Queue aQueue = new Queue(); 
    foreach (mshtml.IHTMLElement anObj in allTags) 
    { 
        string text = anObj.innerText; 
        if (text != null && text != "" && emailReg.IsMatch(text)) 
        { 
            aQueue.Enqueue(text); 
        } 
    } 
    // Copy the queue in one call. A loop bounded by aQueue.Count that calls
    // Dequeue() stops halfway, because Count shrinks on every Dequeue().
    allEmails = new string[aQueue.Count]; 
    aQueue.CopyTo(allEmails, 0); 
    return allEmails; 
}
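Because the pattern in doFunction is anchored, an address embedded in a longer sentence is not found. One possible variation, sketched here on a plain string rather than on mshtml elements, is to use an unanchored pattern with Regex.Matches. The class EmailScan, the method FindEmails and the simplified pattern are assumptions made for this illustration; they are not taken from the ContactPlugin source.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of KUMO or ContactPlugin.
public static class EmailScan
{
    // Simplified, unanchored pattern (an assumption, not the article's regex).
    private static readonly Regex EmailPattern =
        new Regex(@"[a-zA-Z0-9_\-\.]+@(?:[a-zA-Z0-9\-]+\.)+[a-zA-Z]{2,}");

    // Returns every distinct email-like substring of text, in order of appearance.
    public static string[] FindEmails(string text)
    {
        List<string> found = new List<string>();
        if (string.IsNullOrEmpty(text))
        {
            return found.ToArray();
        }
        foreach (Match m in EmailPattern.Matches(text))
        {
            if (!found.Contains(m.Value))
            {
                found.Add(m.Value);
            }
        }
        return found.ToArray();
    }
}
```

Calling EmailScan.FindEmails("Write to jobs@example.com or hr@example.org.") returns the two addresses without the trailing period. In a plugin, the same method could be applied once to the text of the whole document instead of to every element, which also avoids reporting the same address for an element and each of its ancestors.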

Points of Interest

Download KUMO from www.softmorning.net

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

About the Author

JEHAN Sebastien

France

Last Updated 5 May 2004
Article Copyright 2004 by JEHAN Sebastien