
HTML Filter to Ban Pop Ups of All Kinds

11 Oct 2002, 4 min read
Free tool to close pop up windows
This tool acts as a proxy server, and basically implements a double-threaded socket line.

Figure 1: Filtering out HTML content

Introduction

HTML Filter is a tool I wrote after being fed up with pop up windows of all kinds.

Using HTML filtering to close pop up windows automatically is, of course, quite an effective application. But it's not the only one, which makes the technique all the more interesting.

HTML Filter?

I have spent much time trying to close pop ups automatically with external tools. Such tools check the open windows at regular intervals against a dictionary of banned window names. This works fine, as long as you're happy adding new entries every other day, since ad window names keep changing.

So I had to find a more internal way of doing it. Having already tried to work with IE a couple of times, I knew it had many limits and odd behaviors, notably subscribed events that for some reason sometimes don't fire at all. I needed something more radical, and less coupled to the browser I was using.

I finally came up with a proxy filter: a systray tool which, once configured, relays every HTTP packet in both directions and so gets the unique opportunity to see the HTML content itself.

This opportunity is great. Applying predefined filtering rules makes it possible, for instance, to remove the kinds of JavaScript known for bringing pop ups on screen: those nasty window.open(url, "xyz", ...) calls.

Configuring the Tool

Once installed, the tool starts listening on the default port, 8010. If that port is already in use, change it; that's what the dialog box is for. Of course, you must tell the browser that the proxy is listening there, so open the Windows Control Panel and double-click Internet Options. In the Connections tab, edit the Proxy Settings, click Advanced, type 127.0.0.1 in the HTTP "Proxy address to use" field and 8010 in the Port field, then press "Apply". You're done. You can go back to surfing the web as before, without notable changes (at least on the surface).

If you are using Netscape or Opera, change the proxy settings with a similar procedure. For Netscape, go to Edit / Preferences, then Advanced / Proxy, and edit the HTTP Proxy field.

Filtering is activated automatically, which means that HTML content going through the proxy is filtered according to the rules. The source code provided filters window.open statements by overwriting their first two characters, turning them into //ndow.open; it is up to you to add any other relevant rules in the CHtmlFilterRules class implementation. To disable filtering, right-click the systray icon and choose the option.
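As a rough illustration of what adding further rules could look like, here is a minimal sketch that applies a small table of patterns to a buffer. The names MatchesAt, CommentOutPattern and ApplyAllRules are hypothetical and not part of the downloadable source, and the second pattern, showmodaldialog, is only an example of an extra rule. The search works on the buffer length rather than on a NUL terminator, and compares case-insensitively without copying the buffer.

```cpp
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical helper: compares `len` bytes at `p` against an already
// lowercase `pattern`, ignoring the case of the bytes in `p`.
static bool MatchesAt(const char* p, const char* pattern, std::size_t len)
{
    for (std::size_t i = 0; i < len; ++i)
        if (std::tolower(static_cast<unsigned char>(p[i])) != pattern[i])
            return false;
    return true;
}

// Overwrites the first two bytes of every occurrence of `pattern`
// (which must be lowercase) with "//", in place.
// Returns the number of occurrences that were neutralized.
int CommentOutPattern(char* buffer, int length, const char* pattern)
{
    const std::size_t plen = std::strlen(pattern);
    if (!buffer || length <= 0 || plen < 2)
        return 0;

    const std::size_t n = static_cast<std::size_t>(length);
    int hits = 0;
    for (std::size_t i = 0; i + plen <= n; ++i)
    {
        if (MatchesAt(buffer + i, pattern, plen))
        {
            buffer[i]     = '/';
            buffer[i + 1] = '/';
            ++hits;
        }
    }
    return hits;
}

// Applies every rule of a (hypothetical) rule table to the buffer.
int ApplyAllRules(char* buffer, int length)
{
    // Illustrative patterns only; extend the table as needed.
    static const char* const kPatterns[] = { "window.open", "showmodaldialog" };
    const std::size_t count = sizeof(kPatterns) / sizeof(kPatterns[0]);

    int hits = 0;
    for (std::size_t r = 0; r < count; ++r)
        hits += CommentOutPattern(buffer, length, kPatterns[r]);
    return hits;
}
```

Because every rule only overwrites bytes and never inserts or removes any, the buffer length stays the same, so the proxy can forward it unchanged in size.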

I also wanted the tool not to slow down the surfing experience. This goal is achieved by using plain sockets instead of MFC wrappers such as CAsyncSocket (which in turn drag in the whole _afxThreadSockState mess).

Technical Details

This tool acts as a proxy server. It basically implements a double-threaded socket line. The code is based on Nish's pop proxy server. How things work is depicted below:

Figure 2: How the HTML filter works

The main class is declared as below:

C++
class CHttpProxyMT
{
    // Members
    //
protected:
    SOCKET m_HttpServerSocket;
    HANDLE m_ServerThread;
    int m_port;
    BOOL bRunning; // Flag that's set if the proxy is running

    // Constructor
    //
public:
    CHttpProxyMT();
    virtual ~CHttpProxyMT();

    // Methods
    //
public :
    // This starts the multi-threaded HTTP proxy server
    BOOL StartProxy(int port);
    BOOL IsRunning();
    void StopProxy();

    int GetProxyPort();
    int GetNBConnections();
    void EnableFiltering(BOOL bEnable=TRUE);

private:
    // The thread that listens for connections
    static DWORD ServerThread(void *arg); // thread callback
    DWORD MServerThread();

    static DWORD ClientThread(DWORD arg); // thread callback
    void StartClientThread(SOCKET sock);

    static void StartDataThread(void *parm); // thread callback
    static DWORD DataThread(void *parm); // thread callback
};

struct socket_pair
{
    socket_pair(SOCKET s1, SOCKET s2, BOOL bIsServerResponse)
    {
        srcsock = s1;
        dstsock = s2;
        m_bIsServerResponse = bIsServerResponse;
        n = 0;
    }

    SOCKET srcsock;
    SOCKET dstsock;
    BOOL m_bIsServerResponse;
    int n;
    char buff[16384+1];
};

What's funny is that as soon as you start working with threads, everything gets messy. Every variable is potentially accessed by several threads at the same time, which makes practically anything harder to code. I ended up associating a socket_pair instance with each thread and referring to that object in every line of code, to make sure I was more or less thread-safe. But it's clumsy; what one needs at this point is an easy framework for attaching variables and maps to the running thread. The difficulty comes from the fact that under Win32 the thread callback is a static (read global) function, and so is shared by every thread.
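The idea of giving each thread its own context object can be sketched as follows. This is only an illustration under stated assumptions: ConnectionContext and ProxySketch are hypothetical names, the context stands in for the role socket_pair plays in the article, and the callback is called directly here instead of being handed to a real thread.

```cpp
#include <string>

// Hypothetical per-thread context. All state a thread needs lives here,
// so the static callback itself keeps no shared variables.
struct ConnectionContext
{
    explicit ConnectionContext(int id)
        : connectionId(id), bytesForwarded(0) {}

    int connectionId;      // which connection this context belongs to
    long bytesForwarded;   // state private to that connection
    std::string log;       // ditto
};

class ProxySketch
{
public:
    // Same shape as a thread callback: static, one opaque argument.
    // Everything it touches is reached through the context pointer.
    static unsigned long DataThread(void* parm)
    {
        ConnectionContext* ctx = static_cast<ConnectionContext*>(parm);
        Forward(*ctx, 512);   // pretend two chunks were relayed
        Forward(*ctx, 256);
        return 0;
    }

private:
    static void Forward(ConnectionContext& ctx, long byteCount)
    {
        ctx.bytesForwarded += byteCount;
        ctx.log += "chunk;";
    }
};
```

Under Win32, the address of the context would be passed as the parameter argument of the thread creation call, and the callback would be declared with the DWORD WINAPI (LPVOID) signature. As long as each thread receives a distinct context and only touches that one, the threads do not step on each other's data.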

In the end, I have code like this when it comes to sending server responses back to the client:

C++
DWORD CHttpProxyMT::DataThread(void *parm)
{
    socket_pair* spair = (socket_pair*) parm;

    // recv bytes from the server and send
    // them back to the client, once filtered
    while( (spair->n=recv(spair->srcsock, spair->buff, 16384, 0))>0 )
    {
        spair->buff[spair->n] = 0;

        if (g_bFilteringEnabled && spair->m_bIsServerResponse)
        {
            CHtmlFilterRules filter( spair->buff,spair->n );
            filter.ApplyRules();
        }

        send(spair->dstsock, spair->buff, spair->n, 0);
    }

    ...
}
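One thing to keep in mind with per-chunk filtering is that recv may return the data in arbitrary pieces, so a pattern such as window.open can straddle two chunks and escape a filter that only looks at one chunk at a time. The sketch below shows one possible way to handle this, by holding back the last few bytes of each chunk until the next one arrives. StreamingFilter is a hypothetical class, not part of the original source, and it uses std::string for brevity rather than the fixed buffer of socket_pair.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical streaming filter: neutralizes a pattern even when it is
// split across two consecutive chunks.
class StreamingFilter
{
public:
    // `lowercasePattern` must already be lowercase, e.g. "window.open".
    explicit StreamingFilter(const std::string& lowercasePattern)
        : m_pattern(lowercasePattern) {}

    // Feeds one received chunk and returns the bytes that are safe to
    // send now. The last (pattern length - 1) bytes are held back
    // because they could be the beginning of a match.
    std::string Feed(const char* data, std::size_t length)
    {
        m_pending.append(data, length);
        Neutralize();

        const std::size_t keep = m_pattern.empty() ? 0 : m_pattern.size() - 1;
        if (m_pending.size() <= keep)
            return std::string();

        std::string ready = m_pending.substr(0, m_pending.size() - keep);
        m_pending.erase(0, ready.size());
        return ready;
    }

    // Returns whatever is still held back; call once the source closes.
    std::string Flush()
    {
        std::string rest;
        rest.swap(m_pending);
        return rest;
    }

private:
    // Overwrites the first two bytes of each complete match with "//".
    void Neutralize()
    {
        const std::size_t plen = m_pattern.size();
        if (plen < 2)
            return;
        for (std::size_t i = 0; i + plen <= m_pending.size(); ++i)
        {
            std::size_t j = 0;
            while (j < plen &&
                   std::tolower(static_cast<unsigned char>(m_pending[i + j])) ==
                       static_cast<unsigned char>(m_pattern[j]))
                ++j;
            if (j == plen)
            {
                m_pending[i]     = '/';
                m_pending[i + 1] = '/';
            }
        }
    }

    std::string m_pattern;
    std::string m_pending;
};
```

The price of this approach is a small delay: up to ten bytes per connection are withheld until more data arrives or the connection closes, at which point Flush must be sent as well.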

Which rules to apply depends on what you intend to do. I basically wanted to comment out the JavaScript code for pop up windows, but the concept can be used for many other applications. Filtering works against pop ups because, in HTML, they are opened through the window.open JavaScript command. Overwriting the start of that call with // turns it into a comment and knocks it out, which is what we are looking for. Here is the code for it:

C++
CHtmlFilterRules::CHtmlFilterRules(char *buffer, int nLength)
{
    m_cpBuffer = buffer;
    m_nLength = nLength;
}

BOOL CHtmlFilterRules::ApplyRules()
{
    // nothing to filter
    if (!m_cpBuffer || m_nLength <= 0) return FALSE;

    // work on a lowercase copy of the buffer, so that the
    // search below matches regardless of the case
    char *buf = new char[m_nLength+1];
    if (!buf) return FALSE; // only meaningful where new returns NULL on failure (e.g. VC6)
    memcpy(buf, m_cpBuffer, m_nLength);

    buf[m_nLength]=0; // NUL-terminate so the C string routines work
    _strlwr(buf); // convert to lowercase (CPU time here); strlwr is the older name

    char *szPattern = buf;
    char *szFirstByte = buf;
    while ( (szPattern=strstr(szPattern,"window.open"))!=NULL )
    {
        // replace window.open by //ndow.open 
        // so the javascript code doesn't create any annoying popup
        m_cpBuffer[szPattern-szFirstByte+0] = '/';
        m_cpBuffer[szPattern-szFirstByte+1] = '/';

        szPattern++;
    }

    delete [] buf;

    return TRUE;
}

Code Listing (Both VC6 and VC7 Workspaces Provided)

  • HtmlFilterRules.cpp: HTML filter
  • HttpProxyMT.cpp: based on Nish's PopProxyMT multi-threaded POP proxy server
  • htmlfilterdlg.cpp: port configuration, menu commands
  • htmlfilter.cpp: Win app
  • TrayNot.cpp: simple systray implementation

History

  • 12th October, 2002: Initial version

License

This article has no explicit license attached to it, but may contain usage terms in the article text or the download files themselves. If in doubt, please contact the author via the discussion board below.



Written By
France
Addicted to reverse engineering. At work, I am developing business intelligence software in a team of smart people (independent software vendor).

Need a fast Excel generation component? Try xlsgen.

Comments and Discussions

 
Problem with dialup (Paras, 16-Feb-04 8:19)
Hi, I was looking at your program on dialup. On some sites that use HTTP 1.1 it doesn't work well: images are missing on those sites.

Is there any specific reason for that?