TCP Session Reconstruction Tool

Saar Yahalom


21 Sep 2007 · CPOL


A TCP session reconstruction tool for C#.

Introduction

This is a C# utility for reconstructing TCP sessions captured by a sniffer, including incomplete ones. It is based on libnids and on a translated part of WireShark. I could not find an existing solution, so I had to build one myself.

I was looking for a tool that could reconstruct a TCP session from a pcap file. The tools I found were mostly for Linux, or were full-blown GUI tools like WireShark. So, I decided to build my own. It is time the C# community had a TCP session reconstruction tool.

Introduction
Background

What is a Sniffer?
TCP session reconstruction
What is it good for?

First try using libnids
Still not good enough
Second try, hacking WireShark

Capturing the packets
The final design
Making it work together

How to use this tool
Additional information
Thanks
History

Background

What is a Sniffer?

"A sniffer is a piece of software that grabs all of the traffic flowing into and out of a computer attached to a network. They are available for several platforms in both commercial and open-source variations. Some of the simplest packages are actually quite easy to implement in C or Perl; use a command line interface and dump the captured data to the screen. More complex projects use a GUI, graph traffic statistics, track multiple sessions, and offer several configuration options. Sniffers are also the engines for other programs. Intrusion Detection Systems (IDS) use sniffers to match packets against a rule-set designed to flag anything malicious or strange. Network utilization and monitoring programs often use sniffers to gather data necessary for metrics and analysis. Law enforcement agencies that need to monitor emails during investigations likely employ a sniffer designed to capture very specific traffic."

Excerpt from - Sniffers: What They Are and How to Protect Yourself

TCP session reconstruction

A sniffer lets you capture packets on your network, but the packets come in all shapes and sizes, and they often arrive out of order or more than once. TCP reconstruction puts the session data back into its original order.
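To make the idea concrete, here is a minimal C++ sketch of reordering. It is not the code used by the tool; `Segment` and `reorder_segments` are hypothetical names, and offsets are assumed to be counted from the first payload byte of the stream. Segments are sorted by offset, retransmitted bytes are skipped, and reassembly stops at the first gap.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical segment type: a payload plus its position in the stream.
struct Segment {
    std::uint32_t offset;   // offset of the first payload byte (assumed 0-based)
    std::string payload;    // captured bytes
};

// Rebuilds the stream from segments given in arrival order.
std::string reorder_segments(const std::vector<Segment>& arrived) {
    // std::map keeps its keys sorted, which does the reordering for us.
    std::map<std::uint32_t, std::string> by_offset;
    for (const Segment& s : arrived) {
        // A retransmission may repeat an offset; keep the longest payload.
        std::string& slot = by_offset[s.offset];
        if (s.payload.size() > slot.size()) slot = s.payload;
    }

    std::string stream;
    std::uint32_t next = 0;  // next offset we expect to see
    for (const auto& entry : by_offset) {
        if (entry.first > next) break;               // gap: data is missing
        std::size_t skip = next - entry.first;       // bytes we already have
        if (skip < entry.second.size()) {
            stream.append(entry.second, skip, std::string::npos);
            next = entry.first + static_cast<std::uint32_t>(entry.second.size());
        }
    }
    return stream;
}
```

A real reassembler also has to track TCP sequence numbers per direction and cope with truncated captures, which is what the rest of this article deals with.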

What is it good for?

In a sentence, it lets you watch the application layer data. Say you have a capture file obtained from a program such as tcpdump or WireShark, and you want to observe the actual sessions that you captured, for example, a full HTTP session between a server and a client. You can reconstruct the page requests and the HTTP server responses. Yes, this means you can also reconstruct emails and attachments that are sent over your network.

I use it in a security tool I'm working on. You can use this kind of functionality for good or for evil, but it is a powerful capability that was missing.

First try using libnids

libnids is a library designed by Rafal Wojtczuk. It emulates the IP stack of Linux 2.0.x and offers IP defragmentation, TCP stream assembly, and TCP port scan detection. There are some ports of the library to Win32; the most recent one I found is for version 1.19 (the newest Linux version is 1.22). You will also need the WinPcap library in order to build it. My solution was to wrap the library with managed C++ code and provide two managed callbacks, one for the server data and one for the client data.

How to work with libnids:

1. Configure the library's nids_params struct. I needed to work on a capture file, and this is the place to pass the file name. You can set a lot of options here, including a filter, which can be very useful if you have a big data file or decide to run live on the wire.
2. Call nids_init().
3. Register a local unmanaged callback with the library.
4. Set up the two managed callbacks.
5. Call nids_run().

C++

// set the pcap file name in the nids param structure
IntPtr p = Marshal::StringToHGlobalAnsi(filename);
char *szFile = (char*)p.ToPointer();
nids_params.filename = szFile;
.
.
.
// register the managed callbacks
managedLibnids::LibnidsWrapper::m_clientCallback = clientCallback;
managedLibnids::LibnidsWrapper::m_serverCallback = serverCallback;
// init libnids
if (!nids_init ())
{
    return;
}
// register the local callback
nids_register_tcp (tcp_callback);

// start the capture ...
nids_run ();

Simple as that. Of course, now you need to marshal data off to the managed callbacks, but besides that, you are basically done.

This is the interesting part of the local callback function:

void managedLibnids::tcp_callback (struct tcp_stream *a_tcp, 
                                   void ** this_time_not_needed)
{
  if (a_tcp->nids_state == NIDS_JUST_EST)
    {
    // connection described by a_tcp is established
    // here we decide, if we wish to follow this stream
    // sample condition: if (a_tcp->addr.dest!=23) return;
    // in this simple app we follow each stream, so..
      a_tcp->client.collect++; // we want data received by a client
      a_tcp->server.collect++; // and by a server, too
      a_tcp->server.collect_urg++; // we want urgent data received by a
                                   // server
      a_tcp->client.collect_urg++; // if we don't increase this value,
                                   // we won't be notified of urgent data
                                   // arrival
      return;
    }
.
.
.
  if (a_tcp->nids_state == NIDS_DATA)
    {
      // new data has arrived; gotta determine in what direction
      // and if it's urgent or not

      struct half_stream *hlf;

      // So, we have some normal data to take care of.
      if (a_tcp->client.count_new)
      {
          // new data for client
          hlf = &a_tcp->client; // from now on, we will deal with hlf var,
                                // which will point to client side of conn
      }
      else
      {
        hlf = &a_tcp->server; // analogical
      }
      //we send the newly arrived data
      int len = hlf->count_new;
      // Marshal the data
      array<byte>^ data = gcnew array<byte>(len);
      Marshal::Copy((IntPtr)hlf->data,data,0, len);
      // Send the data to the callback
      if (a_tcp->client.count_new)
        LibnidsWrapper::m_clientCallback(data,a_tcp->addr.saddr,
          a_tcp->addr.source,a_tcp->addr.daddr,a_tcp->addr.dest,false);    
      else
         LibnidsWrapper::m_serverCallback(data,a_tcp->addr.saddr,
           a_tcp->addr.source,a_tcp->addr.daddr,a_tcp->addr.dest,false);    
    }
  return ;
}

Still not good enough

I discovered that libnids only reassembles complete or nearly complete sessions. This is usually enough for most people, but I need to check incomplete sessions as well. I usually work with WireShark on specific streams, and it can follow TCP streams even when they are incomplete. This is the ability I wanted to have as well.

Second try, hacking WireShark

Luckily, WireShark is an open source project. I cracked it open, searched for the code that reconstructs the TCP session, and voilà! From here, all I had to do was translate the ANSI C code to pure C#. The code was very intuitive and easy to follow, thanks to the folks at WireShark, who did a good job.

This is the main function that reconstructs the TCP session:

private void reassemble_tcp( ulong sequence, ulong length, byte[] data,
               ulong data_length, bool synflag, long net_src,
               long net_dst, uint srcport, uint dstport)
{
   long srcx, dstx;
   int src_index, j;
   bool first = false;
   ulong newseq;
   tcp_frag tmp_frag;

   src_index = -1;

   /* Now check if the packet is for this connection. */
    srcx = net_src;
    dstx = net_dst;

   /* Check to see if we have seen this source IP and port before.
   (Yes, we have to check both source IP and port; the connection
   might be between two different ports on the same machine.) */
   for( j=0; j<2; j++ ) {
       if (src_addr[j] == srcx && src_port[j] == srcport ) {
           src_index = j;
       }
   }
   /* we didn't find it if src_index == -1 */
   if( src_index < 0 ) {
       /* assign it to a src_index and get going */
       for( j=0; j<2; j++ ) {
           if( src_port[j] == 0 ) {
               src_addr[j] = srcx;
               src_port[j] = srcport;
               src_index = j;
               first = true;
               break;
           }
       }
   }
   if( src_index < 0 ) {
       throw new Exception("ERROR in reassemble_tcp: Too many addresses!");
   }

   if( data_length < length ) {
       incomplete_tcp_stream = true;
   }

   /* now that we have filed away the srcs, lets get the sequence number stuff
   figured out */
   if( first ) {
       /* this is the first time we have seen this src's sequence number */
       seq[src_index] = sequence + length;
       if( synflag ) {
           seq[src_index]++;
       }
       /* write out the packet data */
       write_packet_data( src_index, data );
       return;
   }
   /* if we are here, we have already seen this src, let's
   try and figure out if this packet is in the right place */
   if( sequence < seq[src_index] ) {
       /* this sequence number seems dated, but
       check the end to make sure it has no more
       info than we have already seen */
       newseq = sequence + length;
       if( newseq > seq[src_index] ) {
           ulong new_len;

           /* this one has more than we have seen. let's get the
           payload that we have not seen. */

           new_len = seq[src_index] - sequence;

           if ( data_length <= new_len ) {
               data = null;
               data_length = 0;
               incomplete_tcp_stream = true;
           } else {
               data_length -= new_len;
               byte[] tmpData = new byte[data_length];
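               for( ulong i=0; i<data_length; i++ ) {
                   tmpData[i] = data[i + new_len];
               }
               data = tmpData;
           }
           sequence = seq[src_index];
           length = newseq - seq[src_index];

           /* this will now appear to be right on time :) */
       }
   }
   if( sequence == seq[src_index] ) {
       /* right on time */
       seq[src_index] += length;
       if( synflag ) {
           seq[src_index]++;
       }
       if( data != null ) {
           write_packet_data( src_index, data );
       }
       /* done with the packet, see if it caused a fragment to fit */
       while( check_fragments( src_index ) )
           ;
   }
   else {
       /* out of order packet */
       if( data_length > 0 && sequence > seq[src_index] ) {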
           tmp_frag = new tcp_frag();
           tmp_frag.data = data;
           tmp_frag.seq = sequence;
           tmp_frag.len = length;
           tmp_frag.data_len = data_length;

           if( frags[src_index] != null ) {
               tmp_frag.next = frags[src_index];
           } else {
               tmp_frag.next = null;
           }
           frags[src_index] = tmp_frag;
       }
   }
} /* end reassemble_tcp */
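The listing parks out-of-order packets in a per-direction fragment list (frags). The following C++ sketch shows one way such a list could be drained once the missing data arrives. It is a simplified illustration under my own assumptions, not the tool's code: `Fragment`, `HalfStream` and `drain_fragments` are hypothetical names, and each fragment is assumed to carry its full payload.

```cpp
#include <cstdint>
#include <list>
#include <string>

// Hypothetical fragment record: where the payload starts, and the payload.
struct Fragment {
    std::uint64_t seq;   // sequence number of the first payload byte
    std::string data;    // captured payload
};

// Hypothetical state for one direction of a connection.
struct HalfStream {
    std::uint64_t next_seq = 0;     // next sequence number we expect
    std::list<Fragment> pending;    // out-of-order fragments waiting for a gap to close
    std::string output;             // reassembled bytes so far
};

// Appends every parked fragment that now lines up with next_seq.
// Returns the number of fragments that were flushed.
int drain_fragments(HalfStream& hs) {
    int flushed = 0;
    bool progress = true;
    while (progress) {
        progress = false;
        for (auto it = hs.pending.begin(); it != hs.pending.end(); ++it) {
            if (it->seq == hs.next_seq) {
                hs.output += it->data;
                hs.next_seq += it->data.size();
                hs.pending.erase(it);
                ++flushed;
                progress = true;
                break;  // the iterator is invalid now; rescan from the start
            }
        }
    }
    return flushed;
}
```

The list is rescanned after every hit because flushing one fragment can make another one fit.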

O.K. We can reconstruct a session; now, all we need is to capture the packets and store each session separately.

Capturing the packets

For this, I used SharpPcap, a great library written by Tamir Gal that I found here at CodeProject. For more information on how to work with SharpPcap, see its CodeProject article.

Here is the code I have used to capture TCP packets:

try
{
    //Get an offline file pcap device
    device = SharpPcap.GetPcapOfflineDevice(capFile);
    //Open the device for capturing
    device.PcapOpen();
}
catch (Exception e)
{
    Console.WriteLine(e.Message);
    return;
}

//Register our handler function to the 'packet arrival' event
device.PcapOnPacketArrival +=
    new SharpPcap.PacketArrivalEvent(device_PcapOnPacketArrival);

// add a filter so we get only tcp packets
device.PcapSetFilter("tcp");

//Start capturing an 'INFINITE' number of packets
//This method will return when EOF is reached.
device.PcapCapture(SharpPcap.INFINITE);

//Close the pcap device
device.PcapClose();

The final design

The design I used is a dictionary object that holds pairs of (Connection, TcpRecon) objects. TcpRecon holds a FileStream and the state of the TCP connection. It is responsible for the actual reconstruction of a TCP session and the storing of the session in its FileStream. Connection holds session information such as source IP, destination IP, source port, destination port, etc.
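As an illustration of the lookup, here is a small C++ sketch of a connection-keyed table. It is not the tool's implementation. `ConnectionKey`, `SessionState` and `session_for` are hypothetical stand-ins for Connection, TcpRecon and the dictionary access, and the sketch assumes that both directions of a connection should map to the same entry, which the single output file per session suggests.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical stand-in for the Connection class.
struct ConnectionKey {
    std::pair<std::uint32_t, std::uint16_t> a;  // lower (ip, port) endpoint
    std::pair<std::uint32_t, std::uint16_t> b;  // higher (ip, port) endpoint

    ConnectionKey(std::uint32_t src_ip, std::uint16_t src_port,
                  std::uint32_t dst_ip, std::uint16_t dst_port)
        : a(src_ip, src_port), b(dst_ip, dst_port) {
        // Normalise the order so that client->server and server->client
        // packets produce the same key (an assumption of this sketch).
        if (b < a) std::swap(a, b);
    }

    bool operator<(const ConnectionKey& o) const {
        return std::tie(a, b) < std::tie(o.a, o.b);
    }
};

// Hypothetical stand-in for TcpRecon: just collects bytes.
struct SessionState {
    std::string bytes;
};

using SessionTable = std::map<ConnectionKey, SessionState>;

// Returns the session for a key, creating it the first time the key is seen.
SessionState& session_for(SessionTable& table, const ConnectionKey& key) {
    return table[key];
}
```

In C#, the same effect comes from overriding Equals and GetHashCode on the key class so that the dictionary treats equal connections as one key.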

Making it work together

The SharpPcap library callback searches for a matching connection in the dictionary. Next, the packet is fed to the correct TcpRecon, which in turn reconstructs the session.

// The callback function for the SharpPcap library
private static void device_PcapOnPacketArrival(object sender, Packet packet)
{
    // Ignore anything that is not a TCP packet
    if (!(packet is TCPPacket)) return;
    TCPPacket tcpPacket = (TCPPacket)packet;
    // Creates a key for the dictionary
    Connection c = new Connection(tcpPacket);

    // create a new entry if the key does not exist
    if (!sharpPcapDict.ContainsKey(c))
    {
        string fileName = c.getFileName(path);
        TcpRecon tcpRecon = new TcpRecon(fileName);
        sharpPcapDict.Add(c, tcpRecon);
    }

    // Use the TcpRecon class to reconstruct the session
    sharpPcapDict[c].ReassemblePacket(tcpPacket);
}

How to use this tool

Note that you must have WinPcap installed in order to run TcpRecon!

TcpRecon <capture file name> [-nids]
The optional -nids flag activates the libnids reconstruction 
    instead of the built-in functionality.
The tool will create session files of the form:
    <server_ip>.<server_port>-<client_ip>.<client_port>.data
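For illustration, a name of that form could be assembled as in the C++ sketch below. `dotted` and `session_file_name` are hypothetical helpers, not part of the tool, and the addresses are assumed to be IPv4 values in host byte order.

```cpp
#include <cstdint>
#include <string>

// Formats an IPv4 address (assumed host byte order) as dotted decimal.
std::string dotted(std::uint32_t ip) {
    return std::to_string((ip >> 24) & 0xFF) + "." +
           std::to_string((ip >> 16) & 0xFF) + "." +
           std::to_string((ip >> 8) & 0xFF) + "." +
           std::to_string(ip & 0xFF);
}

// Builds <server_ip>.<server_port>-<client_ip>.<client_port>.data
std::string session_file_name(std::uint32_t server_ip, std::uint16_t server_port,
                              std::uint32_t client_ip, std::uint16_t client_port) {
    return dotted(server_ip) + "." + std::to_string(server_port) + "-" +
           dotted(client_ip) + "." + std::to_string(client_port) + ".data";
}
```

For example, a server at 192.168.0.1 port 80 and a client at 192.168.0.100 port 50000 would give 192.168.0.1.80-192.168.0.100.50000.data.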

Additional information

Libnids project
WireShark project
WinPcap - A library that most sniffers rely on.
Windump - A command line sniffer tool for Windows.
SharpPcap project

Thanks

I would like to thank all the wonderful people behind the projects I've used to build this tool. Thank you for sharing.

History

Version 1 - First Submission
Version 1.01 - Fixed handling of non-TCP packets.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
