URL/Web Addresses Logger

3 Nov 2006 · 4 min read
This application can be used to track web URLs of the current user and store them in a log file on the root or in any folder.

Sample Image

Introduction

First of all, English is not my native language, but I would like to share what I know about logging web URLs in a hidden way; apologies in advance for any mistakes. The main requirement for this article is the WinPcap 3.1 SDK, which is used to capture network packets. WinPcap can capture all kinds of packets, so with it you can log long links, short links, bare domain names, HTTP links to image formats such as JPG or BMP, and so on. This article, however, focuses on ports 8080 and 1080 (you can change them in the code to suit your requirements).

Background

This application tracks all web URLs of the current user and stores them in a log file at the root of a drive or in any folder. If you want the WinPcap libraries (SDK), download them from the WinPcap website; you might also need the complete Win32 SDK. Four C++ files are at the heart of this project: WinpcapSniffer.h, WinpcapSniffer.cpp, UrlSniffer.h, and UrlSniffer.cpp.

Using the code

The WinpcapSniffer class is the main class; it handles the initialization and shutdown routines of the WinPcap functions and is simply a wrapper class. WinpcapSniffer::InitializeWinpCapSniffer() initializes the WinPcap adapter and the capture filter.

int WinpcapSniffer::InitializeWinpCapSniffer()
{
    InitializeAdapter(2);
    InitializeFilter();

    return 1;
}
The constant “2” passed as the argument to InitializeAdapter(2) means "automatically select the second adapter" from the list of adapters WinPcap reports on screen, e.g. either the LAN card or the modem adapter; these devices are what provide a path to the Internet or the LAN. On most machines, including mine, the LAN adapter is the second entry, which is why I pass 2 so that it is selected automatically when the console starts. InitializeFilter() releases any previously held resources and initializes the filter that captures packets on specific ports. Before calling it, you must set a filter string if you want to monitor specific ports: store the filter expression in the member variable m_strPacket_filter through the setter SetFilterString(string). For example, "tcp port 8080 or tcp port 1080" captures packets only on ports 8080 and 1080. For more information about the WinPcap SDK, visit its website.
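The article does not list the bodies of InitializeAdapter() and InitializeFilter(), but with WinPcap they typically come down to pcap_findalldevs()/pcap_open_live() for the adapter and pcap_compile()/pcap_setfilter() for the filter string. The sketch below only illustrates that pattern; it is not the code from the download. The member names adhandle and m_strPacket_filter follow the article's description, and m_strPacket_filter is assumed to be a std::string.

// Minimal sketch of the two initialization routines on top of the
// WinPcap/libpcap API (an assumption, not the article's exact code).
int WinpcapSniffer::InitializeAdapter(int iAdapterIndex)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    pcap_if_t *alldevs = NULL;

    if(pcap_findalldevs(&alldevs, errbuf) == -1)   // enumerate the adapters
        return 0;

    // walk to the requested adapter (1-based, as in InitializeAdapter(2))
    pcap_if_t *dev = alldevs;
    for(int i = 1; dev != NULL && i < iAdapterIndex; i++)
        dev = dev->next;

    if(dev == NULL)
    {
        pcap_freealldevs(alldevs);
        return 0;
    }

    // open it: 65536-byte snapshots, promiscuous mode, 1000 ms read timeout
    adhandle = pcap_open_live(dev->name, 65536, 1, 1000, errbuf);
    pcap_freealldevs(alldevs);

    return (adhandle != NULL) ? 1 : 0;
}

int WinpcapSniffer::InitializeFilter()
{
    struct bpf_program fcode;

    // compile the expression set earlier through SetFilterString()
    if(pcap_compile(adhandle, &fcode, m_strPacket_filter.c_str(), 1, 0xFFFFFF) < 0)
        return 0;

    int iResult = (pcap_setfilter(adhandle, &fcode) == 0) ? 1 : 0;
    pcap_freecode(&fcode);

    return iResult;
}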

Now that we have the wrapper class, we can use it for any purpose we want. Since our requirement is to capture website URLs, I created another class called UrlSniffer, which inherits from WinpcapSniffer. In the constructor of UrlSniffer, the filter string is set first, so it is ready before the filter itself is initialized.

UrlSniffer::UrlSniffer()
{
    this->SetFilterString("tcp port 8080 or tcp port 1080");

}
Now UrlSniffer makes sense as a class of its own, since it is written only to watch the specific ports that Internet Explorer uses here. You can add other ports to this string as well, just put an "or" between them (an example follows below); for more detail, see the WinPcap documentation on filter expressions.
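As an illustration of the syntax, a filter that also covers the standard HTTP port and a common proxy port could be set as follows; the extra port numbers here are my own example, not something the download uses:

// Hypothetical wider filter -- adjust the ports to your environment.
this->SetFilterString("tcp port 80 or tcp port 8080 or tcp port 3128");

UrlSniffer::InitializeUrlSniffer() then chains the base-class initialization with the capture loop: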
int UrlSniffer::InitializeUrlSniffer()
{
    InitializeWinpCapSniffer();

    pcap_loop(adhandle, 0,PacketHandler, NULL);

    return 1;
}
pcap_loop is the function that registers a user-defined function as the handler for captured packets. It takes four arguments: adhandle is the capture handle declared in the parent class as pcap_t *adhandle;, 0 means "loop forever", PacketHandler is the user-defined static function responsible for receiving each packet and processing it according to the user's requirements, and the final argument is an optional user pointer (NULL here).
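For reference, these are the relevant declarations from pcap.h, shown only to make the argument types explicit; the user pointer, NULL in this project, is handed back to the handler as its param argument:

typedef void (*pcap_handler)(u_char *user, const struct pcap_pkthdr *pkt_header,
                             const u_char *pkt_data);
int pcap_loop(pcap_t *p, int cnt, pcap_handler callback, u_char *user);

The project's handler looks like this: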
/* Callback function invoked by libpcap/WinPcap for every incoming packet */
void PacketHandler(u_char *param, const struct pcap_pkthdr *header, 
                   const u_char *pkt_data)
{
    (void)param;   /* user pointer passed to pcap_loop (NULL in this project) */
    (void)header;  /* capture header (timestamp and lengths), unused here */

    ip_header *ih;

    /* retrieve the position of the IP header: it starts right after
       the 14-byte Ethernet header */
    ih = (ip_header *) (pkt_data + 14);

    bool bFoundUrl = false;

    /* note: if ip_header::tlen is stored in network byte order, it should
       be converted with ntohs() before being used as a length */
    string data = UrlSniffer::FilterNetworkPacket(
                  (char*)pkt_data, (int)ih->tlen, bFoundUrl);

    if(bFoundUrl)
    {
        string Urldata = UrlSniffer::ExtractUrlOnly(data);

        g_urls.insert(Urldata);
        printf("%s\n %d", Urldata.c_str(), (int)g_urls.size());
    }
}

The code is mostly self explanatory: pkt_data is the buffer containing the raw packet data, and ih = (ip_header *) (pkt_data + 14); skips the 14-byte Ethernet header so that ih points at the start of the IP header, whose tlen field gives the total length of the IP packet.
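The ip_header type itself is not listed in the article; it follows the layout used in the WinPcap tutorial samples. The definition below is an assumption based on that tutorial (in particular the tlen field), not a quote from the download:

/* IPv4 header, laid out as in the WinPcap tutorial samples (assumed) */
typedef struct ip_header
{
    u_char  ver_ihl;        // version (4 bits) + header length (4 bits)
    u_char  tos;            // type of service
    u_short tlen;           // total length of the IP packet (network byte order)
    u_short identification; // identification
    u_short flags_fo;       // flags (3 bits) + fragment offset (13 bits)
    u_char  ttl;            // time to live
    u_char  proto;          // protocol (6 means TCP)
    u_short crc;            // header checksum
    u_char  saddr[4];       // source address
    u_char  daddr[4];       // destination address
    u_int   op_pad;         // option + padding
} ip_header;

Because tlen is stored in network byte order, strictly speaking it should be converted with ntohs() before being used as a length.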

bool bFoundUrl=false;
// right now we don't know whether this packet contains a URL or not

string data = UrlSniffer::FilterNetworkPacket(
              (char*)pkt_data,(int)ih->tlen,bFoundUrl);
FilterNetworkPacket extracts the part we actually want. As we know, an HTTP GET request is sent whenever we visit a site, so this function checks whether the current packet contains a GET request; if it does, it keeps only the human-readable ASCII characters, sends the result back to the caller, and sets bFoundUrl to true. Here is the packet-filtering code for GET:
string UrlSniffer::FilterNetworkPacket(const char 
       *r_szDataToFilter, int iLen, bool &r_bFoundUrl)
{
    bool bIsUrlMsg = false;
    string strFiltered;

    for(int iLoop = 1; iLoop < iLen; iLoop++)
    {
        if(!bIsUrlMsg)
        {
            // look for the literal "GET " marker; the bounds check keeps
            // the look-ahead from reading past the end of the buffer
            if(iLoop + 2 < iLen
               && r_szDataToFilter[iLoop-1]=='G'
               && r_szDataToFilter[iLoop]=='E'
               && r_szDataToFilter[iLoop+1]=='T'
               && r_szDataToFilter[iLoop+2]==' ')
            {
                bIsUrlMsg = r_bFoundUrl = true;
            }
        }

        if(bIsUrlMsg)
        {
            // from "GET" onwards, keep only the readable characters
            if(RequiredData(r_szDataToFilter[iLoop-1]))
                strFiltered += r_szDataToFilter[iLoop-1];
        }
    }
    return strFiltered;
}

The bool RequiredData(char) function simply checks whether a character is human-readable. If it finds garbage or a non-printable character, it returns false; otherwise it returns true.

bool RequiredData(char c)
{
    bool flag = false;

    // cast to unsigned char so the <cctype> functions are not handed a
    // negative value (undefined behaviour for bytes above 0x7F)
    unsigned char uc = (unsigned char)c;

    if(isalnum(uc) || ispunct(uc) || isspace(uc))
    {
        flag = true;
    }

    return flag;
}

Inside PacketHandler, the last few statements simply extract and store the URL:

if(bFoundUrl)
{
    string Urldata = UrlSniffer::ExtractUrlOnly(data);

    // Urldata holds whatever address came in on the GET request:
    // a domain, a long link, a file name, a picture, and so on
    g_urls.insert(Urldata);
    printf("%s\n %d", Urldata.c_str(), (int)g_urls.size());
}

g_urls is a container of unique addresses: everything that arrived as a URL in an HTTP GET request ends up in it, whether it is a domain, a long link, a file name, a picture, etc. UrlSniffer::ExtractUrlOnly checks whether the current link ends in .net, .com, or .org; if it does, the address is written to a log file, an HTM file created at the root of the drive. You can log anything, including the full long URLs, but here I only record the simple names, while the long forms all remain available in g_urls. (A hedged sketch of the logging helper and the g_urls declaration appears after the macros below.)

string UrlSniffer::ExtractUrlOnly(const string r_szDataToFilter)
{
    string strUrlData;

    // note: requests sent as HTTP/1.1 will not match this marker
    int iGetEnd = (int)r_szDataToFilter.find("HTTP/1.0");

    if(iGetEnd > -1)
    {
        // drop the leading "GET" and everything from "HTTP/1.0" onwards,
        // leaving only the requested address
        strUrlData = r_szDataToFilter.substr(3);
        iGetEnd = (int)strUrlData.find("HTTP/1.0");
        strUrlData = strUrlData.substr(0, iGetEnd);

        // now strUrlData is in shape to be used however you like

        // I just need .com, .net and .org style sites, so check the tail
        // (guard against addresses shorter than the window we look at)
        if(strUrlData.length() >= 7)
        {
            string strEndpart = strUrlData.substr(strUrlData.length()-7, 5);

            if(strEndpart.find(URL_PROPERADDRESS1) != string::npos ||
               strEndpart.find(URL_PROPERADDRESS2) != string::npos ||
               strEndpart.find(URL_PROPERADDRESS3) != string::npos)
            {
                AddWebAddressLog(strUrlData); // this just maintains the log
            }
        }
    }

    return strUrlData;
}

These are the macros:

#define URL_PROPERADDRESS1 ".com"
#define URL_PROPERADDRESS2 ".net"
#define URL_PROPERADDRESS3 ".org"
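
The download also contains AddWebAddressLog() and the g_urls container, neither of which is listed in the article. The snippet below is only a hedged sketch of what they might look like, assuming g_urls is a std::set<std::string> (which would explain why the stored addresses are unique) and that the log is a plain HTM file at the root of the C: drive; the real implementation in the download may differ:

#include <set>
#include <string>
#include <fstream>

// Assumed declaration: a set keeps the collected addresses unique.
std::set<std::string> g_urls;

// Hypothetical logging helper (in the download it may be a member of
// UrlSniffer): appends one address per line to an HTM log file.
void AddWebAddressLog(const std::string &r_strUrl)
{
    std::ofstream log("C:\\UrlLog.htm", std::ios::app);

    if(!log.is_open())
        return;

    log << r_strUrl << "<br>" << std::endl;
}

Opening the file in append mode on every call keeps the sketch simple; a real logger would more likely keep the stream open for the lifetime of the capture.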

Conclusion

I don’t claim this is the only way to capture packets and build a URL logger, but it is one way to do it, and the example can be adapted to a variety of purposes. If I have not explained things as well as I could, apologies in advance; I hope both beginners and gurus find something useful here.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.



Written By
Web Developer
Pakistan
My name is Farhan Hameed Khan, and I am currently working on security projects in Karachi, Pakistan. I like to share my knowledge and research with programmers around the world. I have also worked with Java and J2ME, and I hope to move on to security projects for Symbian and Microsoft mobile operating systems soon.

I have completed four years of studies in computer science and have been working in C++ since August 2004. I have many goals to achieve, and, God willing, I hope one day to achieve all of them.

My hobbies revolve around creating new things and art; drawing, animation, and concept design have been abilities of mine since childhood.
