Identify User Machine without using Cookies

Anees Haider, 2 Sep 2007, CPOL

Screenshot - IdentifyMachine.jpg

Introduction

Some days ago, I was given an assignment to find an unconventional way to identify a user's machine without using cookies or the IP address. When the user visits the web application for the first time, the user should receive the message "No, you have not connected before and are a new user"; on any later visit, the user should receive "Yes, you have connected before". So I studied the HTTP/1.0 protocol for clues and came up with the idea of identifying the machine based on cache information, as explained below.

Background

The communication between a web browser (HTTP client) and a web server (HTTP server) takes place in the form of requests and responses. A request is sent from the web browser to the web server, and a response is returned from the web server to the web browser. Each request or response is composed of two parts, headers and data. The headers give extra information and settings about the request or response. The data part contains the actual content.

Most browsers cache the data they receive from the web server. When the user revisits content she/he has visited earlier, the web browser asks the web server whether the content has changed since it was last received, by sending a time stamp along with the request for the content. This is called a conditional GET. The web server returns the content only if it has been modified since the time stamp it received from the browser; otherwise it answers with a 304 (Not Modified) status and no body. In this way, browsing is accelerated, as unmodified cached data need not be downloaded again. When delivering content to the browser, the web server may tell the web browser the last modified date of the content. If it does so, web browsers (popular ones such as MSIE and Mozilla Firefox) return the same time stamp to the web server to ask whether the content has been modified since that time.

The last modified date of the content is sent from the web server to the web browser in the Last-Modified header. On the second visit, the web browser requests the content and specifies that time stamp in the If-Modified-Since header.
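As a rough illustration of the two headers, the sketch below formats a time stamp as it would appear in Last-Modified and parses one back from If-Modified-Since. The class and method names (HttpDateSketch, FormatHttpDate, TryParseHttpDate) are hypothetical and not part of the article's program, and the sketch assumes the browser echoes the date in the RFC 1123 form that .NET's "R" format specifier produces.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the article's program.
public static class HttpDateSketch
{
    // Formats a UTC time the way it appears in a Last-Modified header,
    // e.g. "Sun, 02 Sep 2007 10:15:30 GMT".
    public static string FormatHttpDate(DateTime utc)
    {
        return utc.ToString("R", CultureInfo.InvariantCulture);
    }

    // Parses the value of an If-Modified-Since header back into a UTC time.
    // Returns false when the value is not in RFC 1123 form.
    public static bool TryParseHttpDate(string value, out DateTime utc)
    {
        return DateTime.TryParseExact(value, "R", CultureInfo.InvariantCulture,
            DateTimeStyles.AdjustToUniversal | DateTimeStyles.AssumeUniversal,
            out utc);
    }
}
```

For the technique in this article the server never needs to interpret the date; it only has to recognize the exact string it sent earlier, so parsing is optional.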

Using the Code

This is a small program consisting of static functions. The code is explained below briefly. The following system namespaces are used:

using System;
using System.Collections.Generic;
using System.Net.Sockets;
using System.Net;
using System.IO;

The System.Collections.Generic namespace provides the Dictionary class. System.Net and System.Net.Sockets are used to implement the TCP/IP networking features. System.IO provides TextReader and TextWriter.

The main procedure is described below:

public static Dictionary<string, string> headers = null;
static void Main(string[] args)
{
    TcpListener listener = null;
    try
    {
        IPAddress address = Dns.GetHostEntry( Dns.GetHostName() ).AddressList[0];
        Console.WriteLine("Info: server start at IP: " + address + " Port: 80");
        listener = new TcpListener(address, 80);
        listener.Start();
        while (true)
        {
            try
            {
                Socket conn = listener.AcceptSocket();
                Console.WriteLine("*************************************");
                Console.WriteLine("Info: Connection established, Connected to IP: " +
                    ((IPEndPoint)conn.RemoteEndPoint).Address +
                    " Port: " + ((IPEndPoint)conn.RemoteEndPoint).Port);
                Console.WriteLine("*************************************");
                NetworkStream stream = new NetworkStream(conn);
                TextReader reader = new StreamReader(stream);
                TextWriter writer = new StreamWriter(stream);
                if (ParseRequest(reader))
                {
                    if ("GET" == method)
                    {
                        headers = new Dictionary<string, string>();
                        while (ReadNParseHeader(reader)) ;
                        if ("/" == resource)
                        {
                            SendHTMLIdentifyUser(writer);
                        }
                        else
                        {
                            Console.WriteLine("Warning: Invalid resource: \""
                                + resource + "\" requested");
                        }
                    }
                    else
                    {
                        Console.WriteLine
                            ("Warning: Only GET method supported, Closing connection");
                    }
                }
                
                Console.WriteLine("Info: Closing Connection Successfully");
                Console.WriteLine("-------------------------------------");
                writer.Close();
                reader.Close();
                stream.Close();
                conn.Close();
            }
            catch (Exception exception)
            {
                Console.WriteLine("Warning: " + exception.Message);
            }
        }
    }
    catch (Exception exception)
    {
        Console.WriteLine("ERROR: " + exception.Message);
    }
    finally
    {
        if (null != listener)
        {
            Console.WriteLine("Info: Stopping listener");
            listener.Stop();
            listener = null;
        }
    }
    Console.WriteLine("Program Ended, Press ENTER to exit");
    Console.ReadLine();
}

The headers object is a Dictionary with strings for both its keys and values; it is used later in the program.

First, the machine's IP address is obtained and printed on the console along with the port, so that the user knows which endpoint the server is listening on (this avoids confusion when the machine has more than one IP address assigned). A listener is then created on that endpoint and started. For each incoming connection, the remote socket is accepted, and TextReader and TextWriter objects are created over its stream, since the request line and headers of an HTTP message are plain text.

The first line a client (web browser) sends is the request line. It contains the method, the resource identifier (with its location), and the HTTP version. The ParseRequest procedure parses this line and stores the method in method, the resource in resource and the protocol version in httpProtocol, all static string fields.

HTTP/1.0 defines three basic request methods: GET, HEAD and POST. This simple program supports only GET.

While serving the remote socket, the request headers are read and parsed with ReadNParseHeader, which places each header's name and value in the headers dictionary. The "/" resource denotes the default content, which is the only resource supported here. The HTML is then sent to the client by the SendHTMLIdentifyUser procedure.

ParseRequest procedure is described below:

public static string method,
                resourceLoc,
                resource,
                queryString,
                httpProtocol; // HTTP/1.0, always assuming it
private static bool ParseRequest(TextReader reader)
{
    string request = ReadUntilCRLF(reader);
    Console.WriteLine("Info: Request received: \"" + request + "\"");
    string[] tokens = request.Split(new string[] { " " },
        StringSplitOptions.RemoveEmptyEntries);
    if (3 != tokens.Length)
    {
        Console.WriteLine("Warning: Request must split in 3 tokens");
        return false;
    }
    // method
    method = tokens[0].ToUpper();
    // query string
    queryString = "";
    int indexEnd = tokens[1].IndexOf('?');
    if (indexEnd < 0)
    {
        indexEnd = tokens[1].Length;
    }
    else
    {
        queryString = tokens[1].Substring(indexEnd, tokens[1].Length - indexEnd);
    }
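    // resource (look for the last '/' before the query string only,
    // so a '/' inside the query string cannot produce a negative length)
    int indexLastSeperator = (indexEnd > 0)
        ? tokens[1].LastIndexOf('/', indexEnd - 1) : -1;
    if (indexLastSeperator < 0)
    {
        Console.WriteLine("Warning: Request target must contain '/'");
        return false;
    }
    int resLen = indexEnd - indexLastSeperator;
    resource = tokens[1].Substring(indexLastSeperator, resLen);
    // resourceLocation
    if (0 == tokens[1].ToLower().IndexOf("http://")) // absolute URI in request
    {
        int indexSeperator = tokens[1].IndexOf('/', 7); // "http://" is 7 chars
        // path between the host name and the last separator
        resourceLoc = (indexSeperator >= 0 && indexSeperator < indexLastSeperator)
            ? tokens[1].Substring(indexSeperator, indexLastSeperator - indexSeperator)
            : "";
    }
    else
    {
        resourceLoc = tokens[1].Substring(0, indexLastSeperator);
    }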
    // protocol
    httpProtocol = tokens[2].ToUpper();
    Console.WriteLine("Info: Method: " + method);
    Console.WriteLine("Info: Resource Location: " + resourceLoc);
    Console.WriteLine("Info: Resource: " + resource);
    Console.WriteLine("Info: Query String: " + queryString);
    Console.WriteLine("Info: Protocol: " + httpProtocol);
    return true;
}

As mentioned earlier, the first thing the web browser sends to the web server is the request line, which consists of the request method, the resource identifier and the HTTP version. The resource identifier may also carry a resource location and a query string, and the location can be relative or absolute. The ParseRequest function simply separates this information and stores it in the respective static string fields, i.e. method, resourceLoc, resource, queryString and httpProtocol.
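To make the split concrete, here is a simplified sketch that returns the parts instead of storing them in static fields. RequestLineSketch and its methods are hypothetical names, and the sketch assumes a relative request target; it follows the article's convention that the resource keeps its leading '/' and the query string keeps its leading '?'.

```csharp
using System;

// Hypothetical, simplified counterpart of ParseRequest.
public static class RequestLineSketch
{
    // Splits "METHOD target VERSION" into its three tokens.
    // Returns null when the line does not have exactly three tokens.
    public static string[] Split(string requestLine)
    {
        string[] tokens = requestLine.Split(new[] { ' ' },
            StringSplitOptions.RemoveEmptyEntries);
        return tokens.Length == 3 ? tokens : null;
    }

    // Splits a relative target such as "/docs/index.html?lang=en" into
    // location ("/docs"), resource ("/index.html") and query ("?lang=en").
    public static void SplitTarget(string target, out string location,
        out string resource, out string query)
    {
        int q = target.IndexOf('?');
        int end = q < 0 ? target.Length : q;
        query = target.Substring(end);
        // Look for the last '/' before the query string only.
        int slash = end > 0 ? target.LastIndexOf('/', end - 1) : -1;
        if (slash < 0) slash = 0;
        location = target.Substring(0, slash);
        resource = target.Substring(slash, end - slash);
    }
}
```

For the request line "GET / HTTP/1.0", Split yields the tokens "GET", "/" and "HTTP/1.0", and SplitTarget("/") gives an empty location, the resource "/" and an empty query string.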

ReadNParseHeader procedure is described below:

private static bool ReadNParseHeader(TextReader reader)
{
    string header = ReadUntilCRLF(reader);
    if (header.Length > 0)
    {
        Console.WriteLine("Info: Header received: \"" + header + "\"");
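        // split on the first ": " only, so a value containing ": " stays intact
        string[] tokens = header.Split(new string[] { ": " }, 2,
            StringSplitOptions.None);
        if (tokens.Length == 2)
        {
            // the indexer tolerates a repeated header; Add would throw on it
            headers[tokens[0].ToUpper()] = tokens[1];
        }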
        else
        {
            Console.WriteLine("Warning: Cannot Parse header");
        }
        return true; // headers follow
    }
    else
    {
        return false; // end of headers
    }
}

HTTP request and response headers follow a simple line-based format. Each header consists of the header name, followed by a colon ':', usually a space, and then the header's value. Each header is terminated by a carriage return and line feed, i.e. '\r\n'. The end of the header section is marked by an extra carriage return and line feed after the last header. ReadNParseHeader parses each header and stores the header name and value in the dictionary. It returns true if more headers may follow and false at the end of the headers.
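The format described above can be sketched on an in-memory string, without any networking. HeaderBlockSketch.Parse is a hypothetical helper, not part of the article's program; unlike ReadNParseHeader, it splits at the first colon and trims the value, so it does not depend on exactly one space following the colon.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the article's program.
public static class HeaderBlockSketch
{
    // Parses "Name: value\r\n" lines up to the first empty line and
    // returns them keyed by upper-cased header name.
    public static Dictionary<string, string> Parse(string raw)
    {
        var headers = new Dictionary<string, string>();
        string[] lines = raw.Split(new[] { "\r\n" }, StringSplitOptions.None);
        foreach (string line in lines)
        {
            if (line.Length == 0) break;      // blank line ends the headers
            int colon = line.IndexOf(':');
            if (colon <= 0) continue;         // skip lines without a name
            string name = line.Substring(0, colon).Trim().ToUpperInvariant();
            string value = line.Substring(colon + 1).Trim();
            headers[name] = value;            // last occurrence wins
        }
        return headers;
    }
}
```

Splitting at the first colon matters for If-Modified-Since, whose value itself contains colons in the time of day.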

ParseRequest and ReadNParseHeader procedures use ReadUntilCRLF procedure described below:

private static string ReadUntilCRLF(TextReader reader)
{
    System.Text.StringBuilder line = new System.Text.StringBuilder();
    int prev = -1, curr;
    // Read() returns -1 at end of stream; stop there instead of looping forever
    while ((curr = reader.Read()) >= 0)
    {
        if ('\r' == prev && '\n' == curr)
        {
            line.Length--; // remove the trailing '\r' that was already appended
            break;
        }
        line.Append((char)curr);
        prev = curr;
    }
    return line.ToString();
}

This function reads the text stream character by character until it finds a carriage return followed by a line feed. It returns the text read up to, but not including, the terminating '\r\n'.

The function that performs the actual trick of identifying the machine is SendHTMLIdentifyUser, described below:

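// Next id to hand out, and the map from issued Last-Modified time stamps to ids
// (declarations assumed here; the listing in the article does not show them)
private static int machineId = 0;
private static Dictionary<string, int> machineIdentification =
    new Dictionary<string, int>();
private static void SendHTMLIdentifyUser(TextWriter writer)
{
    string html = "<HTML><BODY>Hello! How are you? ";
    string keyIfModifiedSince = "IF-MODIFIED-SINCE";
    // a conditional GET carries If-Modified-Since; its absence means a first visit
    bool userNew = !headers.ContainsKey(keyIfModifiedSince);
    int currentId = -1;
    // "R" always appends "GMT", so format the UTC time rather than local time
    string lastModified = DateTime.UtcNow.ToString("R");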
    writer.Write("HTTP/1.0 200 OK\r\n");
    writer.Write("Content-Type: text/HTML\r\n");
    writer.Write("Last-Modified: " + lastModified + "\r\n");
    string strIden = "";
    if (userNew)
    {
        currentId = machineId++;
        strIden = "No, you have not connected before and are a new user";
    }
    else
    {
        string lastDate = headers[keyIfModifiedSince];
        int indexSep = lastDate.IndexOf(';');
        if (indexSep < 0)
        {
            indexSep = lastDate.Length;
        }
        lastDate = lastDate.Substring(0, indexSep);
        if (machineIdentification.TryGetValue(lastDate, out currentId))
        {
            machineIdentification.Remove(lastDate);
            strIden = "Yes, you have connected before";
        }
        else
        {
            // unknown time stamp (e.g. the server was restarted): treat as new
            currentId = machineId++;
            strIden = "No, you have not connected before and are a new user";
        }
    }
    html += strIden + "</BODY></HTML>";
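    // indexer instead of Add: two responses within the same second share a
    // time stamp, and Add would throw on the duplicate key
    machineIdentification[lastModified] = currentId;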
    writer.Write("Content-Length: " + html.Length + "\r\n");
    writer.Write("\r\n");
    writer.Write(html);
    Console.WriteLine("Info: Machine Id: " + currentId);
}

The trick applied in the above function is as follows. First, the IF-MODIFIED-SINCE header is searched for among the request headers: its presence means that the user has visited before, while its absence means that the user is a first-time visitor. If the user has visited before, the time stamp in the IF-MODIFIED-SINCE request header is used to look up the user profile (in our case, the user ID); if the user is a first-time visitor, or the time stamp is not known to the server, a new profile (in our case, a new user ID) is created. The server then obtains the current time stamp, maps the user profile to it, and sends it to the user in the Last-Modified header along with the requested resource. Thus the user can be identified with the help of the If-Modified-Since request header and the Last-Modified response header.
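The same bookkeeping can be restated without any networking, which makes it easier to follow and to test. The sketch below is only an illustration: MachineRegistrySketch and Identify are hypothetical names, and the time stamps are passed in as plain strings rather than taken from the clock.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, network-free restatement of the identification logic.
public class MachineRegistrySketch
{
    private int nextId = 0;
    private readonly Dictionary<string, int> idsByStamp =
        new Dictionary<string, int>();

    // ifModifiedSince: value of the request header, or null if absent.
    // newStamp: the Last-Modified value that will be sent with this response.
    // Returns the machine id; isNew tells whether it was just created.
    public int Identify(string ifModifiedSince, string newStamp, out bool isNew)
    {
        int id;
        if (ifModifiedSince != null &&
            idsByStamp.TryGetValue(ifModifiedSince, out id))
        {
            idsByStamp.Remove(ifModifiedSince);   // the old stamp is used up
            isNew = false;
        }
        else
        {
            id = nextId++;                        // first visit or unknown stamp
            isNew = true;
        }
        idsByStamp[newStamp] = id;                // next request should echo newStamp
        return id;
    }
}
```

Each response replaces the stamp the browser holds, so the id follows a chain of stamps. Because HTTP dates have one-second resolution, two machines served within the same second would receive the same stamp and could not be told apart by this scheme.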

Points of Interest

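It is worth noting that MSIE also appends the length of the content it last received to the time stamp it sends to the web server, separated by a semicolon; this is why the code above keeps only the part of the header value before ';'.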
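A small sketch of that clean-up step, assuming the header value has the form "<date>; length=<n>" when the extension is present. IfModifiedSinceSketch.DatePart is a hypothetical name, not part of the article's program.

```csharp
// Hypothetical helper, not part of the article's program.
public static class IfModifiedSinceSketch
{
    // Returns the date part of an If-Modified-Since value, dropping any
    // ';'-separated extension such as an appended content length.
    public static string DatePart(string headerValue)
    {
        int sep = headerValue.IndexOf(';');
        string date = sep < 0 ? headerValue : headerValue.Substring(0, sep);
        return date.Trim();
    }
}
```

A value without a semicolon is returned unchanged apart from trimming, so the same call works for browsers that send only the date.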

History

  • 2nd September, 2007: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Anees Haider
Pakistan

Last Updated 3 Sep 2007
Article Copyright 2007 by Anees Haider