HTML filter to ban pop ups of all kinds

11 Oct 2002
Free tool to close pop up windows

Filtering out HTML content

Introduction

HTML Filter is a tool I wrote after getting fed up with pop-up windows of all kinds.

Automatically closing pop-up windows is, of course, a very effective application of HTML filtering. But it is not the only one, which makes the technique all the more interesting.

HTML Filter?

I have spent a lot of time trying to close pop-ups automatically with "external tools". External tools check the open windows at regular intervals against a known dictionary of banned window names. This works fine, as long as you are happy adding new entries every other day, since ad names keep changing all the time.

So I had to find a more internal way of doing it. Having already tried to work with IE a couple of times, I had found many limits and weird things happening with subscribed events, which for whatever reason sometimes don't fire at all. I thought I had to find something more radical, and less coupled to the browser I was using.

I finally came up with a proxy filter: a systray tool which, once configured, relays every HTTP packet back and forth, with the unique opportunity of seeing the HTML content itself.

This opportunity is great. Applying predefined filtering rules makes it possible, for instance, to remove all kinds of nasty JavaScript known for bringing pop-ups on screen, such as those window.open(url, "xyz", ...) calls.

Configuring the tool

Once installed, the tool starts listening on port 8010 by default. If you are already using this port, change it; that's what the dialog box is for. Of course, you must let the browser know that the tool is listening there, so open the Windows Control Panel, then double-click Internet Options. In the Connections tab, edit the proxy settings, click Advanced, type 127.0.0.1 in the "HTTP Proxy address to use - Server" field, and type 8010 in the Port field. Press "Apply". OK, you're done. You can go back and surf the web as you did before, without notable changes (at least on the surface).

If you are using Netscape or Opera, change the proxy settings using a similar procedure. For Netscape, go to Edit / Preferences, then Advanced / Proxy, and edit the HTTP Proxy field.

The filter is activated automatically, which means that the HTML content going through it is filtered and the rules are applied. The source code provided filters window.open statements, replacing them with a faked //ndow.open, and it is up to you to add any other relevant rules in the CHtmlFilterRules class implementation. To disable filtering, just right-click the systray icon and choose the option.

I also wanted the tool not to slow down the surfing experience. This goal is achieved by using plain sockets instead of MFC wrappers such as CAsyncSocket (which in turn messes around a lot with _afxThreadSockState).

Technical details

This tool acts as a proxy server. It basically implements a double-threaded socket line. The code is based on Nish's POP proxy server. How things work is depicted below:

How the html filter works
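To make the proxy role more concrete: a browser configured to use an HTTP proxy sends the full URL in the request line (for example `GET http://www.example.com:8080/index.html HTTP/1.0`), so a proxy has to pull the host and port out of it before it can connect to the real server. The article's source is not shown for this step, so the following is only a minimal sketch under that assumption. `ProxyTarget` and `ParseProxyRequestLine` are hypothetical names, not part of CHttpProxyMT, and the scheme match is deliberately simplistic (lowercase `http://` only).

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper type, not part of the article's source code.
struct ProxyTarget
{
    std::string host;
    int port;
    bool valid;
};

// Extracts host and port from a proxy-style request line such as
// "GET http://www.example.com:8080/index.html HTTP/1.0".
ProxyTarget ParseProxyRequestLine(const std::string& line)
{
    ProxyTarget target;
    target.port = 80;       // default HTTP port when none is given
    target.valid = false;

    const std::string scheme = "http://";
    std::string::size_type start = line.find(scheme);
    if (start == std::string::npos) return target;
    start += scheme.size();

    // The host[:port] part ends at the first '/' or space after the scheme.
    std::string::size_type end = line.find_first_of("/ ", start);
    if (end == std::string::npos) end = line.size();
    if (end == start) return target;    // empty host

    std::string authority = line.substr(start, end - start);
    std::string::size_type colon = authority.find(':');
    if (colon != std::string::npos)
    {
        target.port = std::atoi(authority.substr(colon + 1).c_str());
        authority.erase(colon);
    }
    if (authority.empty() || target.port <= 0) return target;

    target.host = authority;
    target.valid = true;
    return target;
}
```

A request line without an absolute URL (for example `GET /index.html HTTP/1.0`) leaves `valid` set to false, so the caller can reject it or fall back to the Host header.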

The main class is declared as follows:

class CHttpProxyMT
{

    // Members
    //
protected:
    SOCKET m_HttpServerSocket;
    HANDLE m_ServerThread;
    int m_port;
    BOOL bRunning; // Flag that's set if the proxy is running

    // Constructor
    //
public:
    CHttpProxyMT();
    virtual ~CHttpProxyMT();


    // Methods
    //
public :
    // This starts the multi-threaded HTTP proxy server
    BOOL StartProxy(int port);
    BOOL IsRunning();
    void StopProxy();

    int GetProxyPort();
    int GetNBConnections();
    void EnableFiltering(BOOL bEnable=TRUE);

private:
    // The thread that listens for connections
    static DWORD ServerThread(void *arg); // thread callback
    DWORD MServerThread();

    static DWORD ClientThread(DWORD arg); // thread callback
    void StartClientThread(SOCKET sock);

    static void StartDataThread(void *parm); // thread callback
    static DWORD DataThread(void *parm); // thread callback

};

struct socket_pair
{
    socket_pair(SOCKET s1, SOCKET s2, BOOL bIsServerResponse)
    {
        srcsock = s1;
        dstsock = s2;
        m_bIsServerResponse = bIsServerResponse;
        n = 0;
    }

    SOCKET srcsock;
    SOCKET dstsock;
    BOOL m_bIsServerResponse;
    int n;
    char buff[16384+1];
};

What's funny is that once you start working with threads, everything suddenly gets messy. Every variable is potentially under fire from several threads at the same time, which makes it harder to code practically anything. I ended up associating a socket_pair instance with each thread and referring to this object in practically every line of code, to make sure I was more or less thread-safe. But it is painful: what one needs at this point is an easy framework for attaching variables and maps to the running thread. The difficulty arises because under Win32 the thread callback is a static (read: global) function, and is therefore used and reused by every thread.
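The pattern described here, one context object per thread handed to a static callback through an opaque pointer, can be sketched in isolation. This is an illustration only: `ConnectionContext` and `Worker` are hypothetical names loosely modeled on socket_pair, and the callback is written as a plain function so the sketch stays portable. With CreateThread it would be declared `DWORD WINAPI` and receive the same pointer.

```cpp
#include <cstring>

// Hypothetical per-connection context, loosely modeled on socket_pair.
struct ConnectionContext
{
    int  id;            // stands in for the socket handles
    int  bytesSeen;     // per-connection state, never shared
    char buff[64];
};

class Worker
{
public:
    // Static callback with the shape a thread API expects:
    // one opaque pointer in, one status value out.
    static unsigned long ThreadEntry(void* parm)
    {
        ConnectionContext* ctx = static_cast<ConnectionContext*>(parm);
        return Run(*ctx);
    }

private:
    // All state is reached through ctx, so two threads running this
    // function at the same time never touch each other's data.
    static unsigned long Run(ConnectionContext& ctx)
    {
        ctx.bytesSeen += static_cast<int>(std::strlen(ctx.buff));
        return static_cast<unsigned long>(ctx.bytesSeen);
    }
};
```

Because the callback keeps no static or global variables of its own, the only thing that needs care is who owns and eventually frees each context object.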

In the end, I have code like this when it comes to sending server responses back to the client:

DWORD CHttpProxyMT::DataThread(void *parm)
{
    socket_pair* spair = (socket_pair*) parm;

    // recv bytes from the server and send them
    // back to the client, once filtered
    while( (spair->n=recv(spair->srcsock, spair->buff, 16384, 0))>0 )
    {
        spair->buff[spair->n] = 0;

        if (g_bFilteringEnabled && spair->m_bIsServerResponse)
        {
            CHtmlFilterRules filter( spair->buff,spair->n );
            filter.ApplyRules();
        }

        send(spair->dstsock, spair->buff, spair->n, 0);
    }

    ...
}

Which rules you apply depends on what you intend to do. Basically, I wanted to comment out the JavaScript code that opens pop-up windows, but the concept can be used for many other applications. Filtering can block pop-up windows because, in HTML, pop-up windows are opened through the window.open JavaScript command. Commenting out this call knocks it out, which is what we are looking for. Here is the code for it:

CHtmlFilterRules::CHtmlFilterRules(char *buffer, int nLength)
{
    m_cpBuffer = buffer;
    m_nLength = nLength;
}

BOOL CHtmlFilterRules::ApplyRules()
{
    // we are already done!
    if (!m_cpBuffer || !m_nLength) return FALSE;

    // copy the buffer, in order to be able to
    // compare the strings regardless of case
    char *buf = new char[m_nLength+1];
    if (!buf) return FALSE;
    memcpy(buf, m_cpBuffer, m_nLength);

    buf[m_nLength]=0; // force a NUL terminator so the C string routines work
    strlwr(buf); // convert to lowercase (CPU time here)


    char *szPattern = buf;
    char *szFirstByte = buf;
    while ( (szPattern=strstr(szPattern,"window.open"))!=NULL )
    {
        // replace window.open by //ndow.open 
        // so the javascript code doesn't create any annoying popup
        m_cpBuffer[szPattern-szFirstByte+0] = '/';
        m_cpBuffer[szPattern-szFirstByte+1] = '/';

        szPattern++;
    }

    delete [] buf;

    return TRUE;
}

Code listing: (both VC6 and VC7 workspaces provided)

  • HtmlFilterRules.cpp : HTML filter
  • HttpProxyMT.cpp : based on Nish's PopProxyMT multi-threaded POP proxy server
  • htmlfilterdlg.cpp : port configuration, menu commands
  • htmlfilter.cpp : Win app
  • TrayNot.cpp : simple systray implementation

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Addicted to reverse engineering. At work, I am developing business intelligence software in a team of smart people (independent software vendor).
 
Need a fast Excel generation component? Try xlsgen.
 

Last Updated 12 Oct 2002
Article Copyright 2002 by Stephane Rodriguez.