HTML Filter to Ban Pop Ups of All Kinds

11 Oct 2002
Free tool to close pop up windows
This tool acts as a proxy server, and basically implements a double-threaded socket line.

Image 1

Filtering out HTML content

Introduction

HTML Filter is a tool I wrote after being fed up with pop up windows of all kinds.

Automatically closing pop up windows is, of course, a very effective application of HTML filtering. But it is not the only one, which makes the technique all the more interesting.

HTML Filter?

I have spent much time trying to close pop ups automatically with "external tools" such as this one. External tools poll the open windows at regular intervals and check them against a dictionary of banned window names. This works fine, as long as you're happy adding new entries every other day, since ad window names keep changing.

So I had to find a more internal way of doing it. I had already tried to work with IE a couple of times and found many limits and oddities there, notably subscribed events which, for whatever reason, sometimes don't trigger at all. I needed something more radical, and less coupled with the browser I was using.

I finally came up with a proxy filter: a systray tool which, once configured, relays every HTTP packet back and forth and, in doing so, gets the unique opportunity to see the HTML content itself.

This opportunity is great. Applying predefined filtering rules makes it possible, for instance, to remove all kinds of nasty JavaScript known for bringing pop ups on screen, such as those window.open(url, "xyz", ...) calls.

Configuring the Tool

Once installed, the tool starts listening on the default port, 8010. If you are already using this port, change it; that's what the dialog box is for. Of course, you must let the browser know that the tool is listening there, so open the Windows Control Panel and double-click Internet Options. In the Connections tab, edit the proxy settings, click Advanced, type 127.0.0.1 in the HTTP "Proxy address to use" field, and type 8010 in the Port field. Press "Apply". OK, you're done. You can go back and surf the web as you did before, without notable changes (at least on the surface).

If you are using Netscape or Opera, change the proxy settings using a similar procedure. For Netscape, go to Edit / Preferences, then Advanced / Proxy, and edit the HTTP Proxy field.

Filtering is activated automatically, which means the HTML content going through the proxy is filtered according to the rules. The source code provided filters window.open statements, replacing them with a fake //ndow.open; it is up to you to add any other relevant rules in the CHtmlFilterRules class implementation. To disable filtering, right-click the systray icon and choose the corresponding option.

I also wanted the tool not to slow down the surfing experience. This goal is achieved by using plain sockets instead of MFC wrappers such as CAsyncSocket (which in turn drag in the whole _afxThreadSockState machinery).

Technical Details

This tool acts as a proxy server. It basically implements a double-threaded socket line. The code is based on Nish's pop proxy server. How things work is depicted below:

Image 2

How the HTML filter works
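
For context, a browser configured to use an HTTP proxy puts the full URL in the request line (for example, GET http://host/path HTTP/1.0), so a proxy has to extract the host and port before it can open the outgoing connection. The sketch below shows one way that parsing might look. ProxyTarget and ParseProxyRequestLine are hypothetical names, not taken from the downloadable source, and the sketch assumes a lowercase http:// scheme with no user-info part in the URL.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper type, not part of the article's source code.
struct ProxyTarget
{
    std::string host;
    int port;
    bool valid;
};

// Extracts host and port from a proxy-style request line such as
// "GET http://www.example.com:8080/index.html HTTP/1.0".
ProxyTarget ParseProxyRequestLine(const std::string& line)
{
    ProxyTarget target;
    target.port = 80;      // default HTTP port when none is given
    target.valid = false;

    // The URL is the second space-separated token.
    std::string::size_type urlStart = line.find(' ');
    if (urlStart == std::string::npos) return target;
    ++urlStart;

    // Assumption: only the lowercase "http://" scheme is handled.
    const std::string scheme = "http://";
    if (line.compare(urlStart, scheme.size(), scheme) != 0) return target;
    const std::string::size_type hostStart = urlStart + scheme.size();

    // The host[:port] part ends at the first '/' or ' ' after the scheme.
    std::string::size_type hostEnd = line.find_first_of("/ ", hostStart);
    if (hostEnd == std::string::npos) hostEnd = line.size();

    std::string authority = line.substr(hostStart, hostEnd - hostStart);

    const std::string::size_type colon = authority.find(':');
    if (colon != std::string::npos)
    {
        target.port = std::atoi(authority.c_str() + colon + 1);
        authority.erase(colon);
    }
    if (authority.empty() || target.port <= 0) return target;

    target.host = authority;
    target.valid = true;
    return target;
}
```

A caller would check the valid flag before connecting to host and port; a request line without an absolute URL (as sent to an origin server rather than a proxy) is reported as invalid.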

The main class is declared as below:

C++
class CHttpProxyMT
{
    // Members
    //
protected:
    SOCKET m_HttpServerSocket;
    HANDLE m_ServerThread;
    int m_port;
    BOOL bRunning; // Flag that's set if the proxy is running

    // Constructor
    //
public:
    CHttpProxyMT();
    virtual ~CHttpProxyMT();

    // Methods
    //
public :
    // This starts the multi-threaded HTTP proxy server
    BOOL StartProxy(int port);
    BOOL IsRunning();
    void StopProxy();

    int GetProxyPort();
    int GetNBConnections();
    void EnableFiltering(BOOL bEnable=TRUE);

private:
    // The thread that listens for connections
    static DWORD ServerThread(void *arg); // thread callback
    DWORD MServerThread();

    static DWORD ClientThread(DWORD arg); // thread callback
    void StartClientThread(SOCKET sock);

    static void StartDataThread(void *parm); // thread callback
    static DWORD DataThread(void *parm); // thread callback
};

struct socket_pair
{
    socket_pair(SOCKET s1, SOCKET s2, BOOL bIsServerResponse)
    {
        srcsock = s1;
        dstsock = s2;
        m_bIsServerResponse = bIsServerResponse;
        n = 0;
    }

    SOCKET srcsock;
    SOCKET dstsock;
    BOOL m_bIsServerResponse;
    int n;
    char buff[16384+1];
};

What's funny is that as soon as you start working with threads, everything gets messy. Every variable can potentially be accessed by several threads at the same time, which makes it harder to code practically anything. I ended up associating a socket_pair instance with each thread and referring to this object in practically every line of code, to make sure I was reasonably thread-safe. It is not pretty, though: what one needs here is an easy framework to attach variables and maps to the running thread. The difficulty comes from the fact that under Win32 the thread callback is a static (read: global) function, shared and reused by every thread.
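
One common way around the static callback is a small "trampoline": the static function receives a pointer to a per-thread object and immediately forwards to a member function, so all state lives in that object. The ServerThread / MServerThread pair in the class declaration looks like this pattern, but that reading is an assumption. The sketch below is portable and illustrative only: Worker, ThreadEntry and RunCallback are hypothetical names, and RunCallback merely calls the callback synchronously where real code would call a thread-creation API such as CreateThread.

```cpp
// Illustrative names only; none of this is taken from the article's source.
typedef unsigned long (*ThreadCallback)(void* arg);

// Stand-in for a thread-creation API: it calls the callback directly
// (no real thread is started) so the sketch stays portable.
unsigned long RunCallback(ThreadCallback callback, void* arg)
{
    return callback(arg);
}

class Worker
{
public:
    explicit Worker(int id) : m_id(id), m_processed(0) {}

    // Static entry point: it has no 'this' pointer, so the instance
    // arrives through the untyped argument.
    static unsigned long ThreadEntry(void* arg)
    {
        Worker* self = static_cast<Worker*>(arg);
        return self->Run();
    }

    int Processed() const { return m_processed; }

private:
    // Member function: touches only this instance's state, which
    // belongs to a single thread.
    unsigned long Run()
    {
        m_processed += m_id;
        return 0;
    }

    int m_id;
    int m_processed;
};
```

Passing a different Worker to each callback invocation keeps the instances independent, which is the same idea as giving each data thread its own socket_pair.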

In the end, I have code like this when it comes to sending server responses back to the client:

C++
DWORD CHttpProxyMT::DataThread(void *parm)
{
    socket_pair* spair = (socket_pair*) parm;

    // recv bytes from server and send 
    //them back to the client, once filtered
    while( (spair->n=recv(spair->srcsock, spair->buff, 16384, 0))>0 )
    {
        spair->buff[spair->n] = 0;

        if (g_bFilteringEnabled && spair->m_bIsServerResponse)
        {
            CHtmlFilterRules filter( spair->buff,spair->n );
            filter.ApplyRules();
        }

        send(spair->dstsock, spair->buff, spair->n, 0);
    }

    ...
}
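
One limitation worth noting in the loop above: rules are applied to each received chunk separately, so a window.open that happens to be split across two recv calls would slip through. A possible remedy, sketched below under stated assumptions, is to hold back the last few bytes of each chunk until the next one arrives. ChunkedPopupFilter is a hypothetical class, not part of the downloadable source, and it works on std::string rather than on the socket_pair buffer to stay portable.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper class; not part of the article's source code.
class ChunkedPopupFilter
{
public:
    // Feeds one received chunk and returns the bytes that are safe to forward.
    std::string Feed(const std::string& chunk)
    {
        m_pending += chunk;
        Neutralize(m_pending);

        // Hold back the last (pattern length - 1) bytes: they could be the
        // beginning of a pattern that the next chunk completes.
        const std::size_t keep = PatternLength() - 1;
        if (m_pending.size() <= keep) return std::string();

        std::string ready = m_pending.substr(0, m_pending.size() - keep);
        m_pending.erase(0, ready.size());
        return ready;
    }

    // Returns whatever is still held back; call once the connection closes.
    std::string Flush()
    {
        std::string rest;
        rest.swap(m_pending);
        return rest;
    }

private:
    static const char* Pattern() { return "window.open"; }
    static std::size_t PatternLength() { return 11; }

    // Overwrites the first two bytes of every case-insensitive match with "//".
    static void Neutralize(std::string& data)
    {
        const std::size_t len = PatternLength();
        for (std::size_t i = 0; i + len <= data.size(); ++i)
        {
            std::size_t j = 0;
            while (j < len &&
                   std::tolower(static_cast<unsigned char>(data[i + j])) == Pattern()[j])
                ++j;
            if (j == len)
            {
                data[i] = '/';
                data[i + 1] = '/';
            }
        }
    }

    std::string m_pending;
};
```

The price is that up to ten bytes are delayed until the next chunk or the final Flush, so the forwarding loop would have to send the result of Flush when recv reports the end of the stream.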

Which rules to apply depends on what you intend to do. I wanted to comment out the JavaScript code that opens pop up windows, but the concept can be used for many other applications. Filtering works against pop ups because, in HTML, pop up windows are opened through the window.open JavaScript call. Commenting out this call knocks it out, which is what we are looking for. Here is the code for it:

C++
CHtmlFilterRules::CHtmlFilterRules(char *buffer, int nLength)
{
    m_cpBuffer = buffer;
    m_nLength = nLength;
}

BOOL CHtmlFilterRules::ApplyRules()
{
    // we are already done!
    if (!m_cpBuffer || !m_nLength) return FALSE;

    // copy the buffer, in order to be able to 
    //compare the strings regardless of the case
    char *buf = new char[m_nLength+1];
    if (!buf) return FALSE;
    memcpy(buf, m_cpBuffer, m_nLength);

    buf[m_nLength]=0; // NUL-terminate so the C string routines work
    strlwr(buf); // convert to lowercase (CPU time here)

    char *szPattern = buf;
    char *szFirstByte = buf;
    while ( (szPattern=strstr(szPattern,"window.open"))!=NULL )
    {
        // replace window.open by //ndow.open 
        // so the javascript code doesn't create any annoying popup
        m_cpBuffer[szPattern-szFirstByte+0] = '/';
        m_cpBuffer[szPattern-szFirstByte+1] = '/';

        szPattern++;
    }

    delete [] buf;

    return TRUE;
}
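
A detail of the code above that is easy to miss: the replacement has exactly the same length as the text it overwrites, so the size of the response (and any Content-Length header) stays valid. The sketch below keeps that property while showing how additional rules might be added as a table. It is a variation, not the article's implementation: FilterRule, g_rules and ApplyRulesInPlace are hypothetical names, the second rule is purely illustrative, and the search compares bytes case-insensitively in place instead of lowercasing a copy. Unlike the original, it overwrites the whole match with the replacement text.

```cpp
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical rule table; only the first entry comes from the article.
struct FilterRule
{
    const char* pattern;      // lowercase text to look for
    const char* replacement;  // must have the same length as pattern
};

static const FilterRule g_rules[] =
{
    { "window.open",            "//ndow.open" },
    { "window.showmodaldialog", "//ndow.showmodaldialog" }  // illustrative extra rule
};

// Applies every rule in place and returns the number of replacements made.
// Works on raw bytes, so embedded NUL characters do not stop the scan.
int ApplyRulesInPlace(char* buffer, std::size_t length)
{
    if (!buffer || length == 0) return 0;

    int replaced = 0;
    const std::size_t ruleCount = sizeof(g_rules) / sizeof(g_rules[0]);
    for (std::size_t r = 0; r < ruleCount; ++r)
    {
        const std::size_t patLen = std::strlen(g_rules[r].pattern);
        // Skip malformed rules: a different length would change the response size.
        if (std::strlen(g_rules[r].replacement) != patLen) continue;

        for (std::size_t i = 0; i + patLen <= length; ++i)
        {
            std::size_t j = 0;
            while (j < patLen &&
                   std::tolower(static_cast<unsigned char>(buffer[i + j])) == g_rules[r].pattern[j])
                ++j;
            if (j == patLen)
            {
                std::memcpy(buffer + i, g_rules[r].replacement, patLen);
                ++replaced;
                i += patLen - 1;  // continue right after the replaced text
            }
        }
    }
    return replaced;
}
```

Because the length is passed explicitly, the function could be called with the buffer and byte count that DataThread already has at hand, without requiring a terminating NUL.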

Code Listing (Both VC6 and VC7 Workspaces Provided)

  • HtmlFilterRules.cpp: HTML filter
  • HttpProxyMT.cpp: based on Nish's PopProxyMT multi-threaded POP proxy server
  • htmlfilterdlg.cpp: port configuration, menu commands
  • htmlfilter.cpp: Win app
  • TrayNot.cpp: simple systray implementation

History

  • 12th October, 2002: Initial version

License

This article has no explicit license attached to it, but may contain usage terms in the article text or the download files themselves. If in doubt, please contact the author via the discussion board below.


Written By
France
Addicted to reverse engineering. At work, I am developing business intelligence software in a team of smart people (independent software vendor).

Need a fast Excel generation component? Try xlsgen.
