
A Simple Taint Checking Solution for C#

16 Mar 2011 | CPOL | 12 min read
We propose a way to secure C# programs by emulating Taint checking

Introduction

In this article, we propose a way to secure C# programs by forcing potentially dangerous data from the outside world to be verified before it is used. Our simple, Ruby-like solution lets a developer "taint" a C# object by encapsulating it in a generic container class. The container refuses access to the target object until an "untaint" method has been invoked on it, i.e., until the object has been deemed safe for use in a vulnerable environment. Defining the conditions under which the object can be cleared is left to the discretion of the developer, and the verification can be repeated if the data represents a threat to more than one part of a C# program.

Data from the Outside World Considered Harmful

Accepting data entry into an application (or simply using data from outside the application) is a dangerous operation, as an attacker can exploit improper handling to compromise or penetrate a system. For example, C and C++ programs are vulnerable to stack-based buffer overflow attacks that take advantage of functions that do not check array bounds: without bounds checks, it is relatively easy to send more data to an array than it expects (and was designed to handle), which can lead to the execution of arbitrary code provided by the attacker.

Applications that use an SQL database for data persistence face a different, yet equally destructive issue: SQL injection attacks. In an SQL injection, the attacker supplies an SQL fragment instead of, or as part of, a data input that the program expects from the user, such as a username. That input would most likely be used as a string to complete a predefined (legitimate) SQL statement that fetches the user from the database. That possibility raises two issues. First, data provided by an untrusted source has to be identified as a potential threat because of its origin. Second, clearing untrusted data has to be done from the perspective of the code that will use it, i.e., only the code that requires the untrusted data can know its own weaknesses and whether it can use the data safely.

An Example of SQL Injection

This article is by no means intended to provide a thorough description of SQL injection. This section only introduces that threat in order to explain how our solution works, and can be skipped by readers already familiar with it. SQL injection is easily explained through a classic example, shown below. It features a typical C# database broker, part of an authentication routine, that uses a username supplied by the user of an application as the selection criterion on a Users table.

C#
public User GetUserByUsername(string typedUsername)
{
    string strSQL = @"SELECT *
                      FROM Users
                      WHERE username = '" + typedUsername + "'";
    //                                 ^                     ^
    //                                 Single quotes around the input string
    ...
}

The SQL statement clearly expects a string that contains a username. If that is the case, the expected user will be selected and the routine will behave as intended. However, if the attacker suspects that an SQL database is being used, she can provide input that manipulates the query into returning rows where none should have been selected. A classic approach consists of adding a condition that is always true, such as:

SQL
' OR '1'='1' -- '

to the initial query, in order to return all rows from the Users table. Depending on how the rest of the method is designed, this can lead to a well-formed User instance being returned. As the code below shows, the string is appended to the query, resulting in a valid SQL statement. The final SQL statement, as the database will interpret it, is also shown.

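C#
public User GetUserByUsername(string typedUsername)
{
    // With typedUsername set to the attacker's input, the concatenation becomes:
    string strSQL = @"SELECT *
                      FROM Users
                      WHERE username = '" + "' OR '1'='1' -- '" + "'";
    //                                 ^                           ^
    //                                 Single quotes around the input string
    ...
}

SQL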
SELECT *
FROM Users
WHERE username = ''
OR '1'='1'
-- '' 
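
To see the substitution concretely, the following minimal sketch reproduces the concatenation performed by the vulnerable broker and prints the resulting statements. The `InjectionDemo` class and its `BuildQuery` method are hypothetical names introduced only for this illustration; no database is involved, and the query is written on a single line for readability.

```csharp
using System;

public static class InjectionDemo
{
    // Builds the statement by plain string concatenation, as the
    // vulnerable broker does.
    public static string BuildQuery(string typedUsername)
    {
        return "SELECT * FROM Users WHERE username = '" + typedUsername + "'";
    }

    public static void Main()
    {
        // A legitimate username yields the intended statement.
        Console.WriteLine(BuildQuery("alice"));
        // prints: SELECT * FROM Users WHERE username = 'alice'

        // The attacker's input closes the string literal and adds a
        // condition that is always true.
        Console.WriteLine(BuildQuery("' OR '1'='1' -- '"));
        // prints: SELECT * FROM Users WHERE username = '' OR '1'='1' -- ''
    }
}
```

The second statement is well-formed SQL, which is precisely why the database has no way of telling that its structure came from user input.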

SQL injection is certainly not limited to collecting information about a database schema or gaining unintended access to a system. One can also use data manipulation language (DML) statements to update or delete rows from a table, or even data definition statements to drop a table. It should also be noted that injection attacks are by no means restricted to SQL; they can occur whenever a string is used as part of a system call.

Marking an Object as a Potential Threat

A crucial aspect of handling input data is knowing whether it can be trusted, based on its origin.

The literature shows different solutions for tracking the safety status of an object. Languages such as Ruby and Perl implement Taint checking, an elegant way of keeping track of the level of trust that can be placed in an object. Taint checking is enforced by marking an object as tainted if it comes from an untrusted source. That status can be transmitted to any other object that touches it. Before it can be used in an unsecured execution environment, the tainted object has to be analyzed to make sure it poses no threat, and then explicitly marked as cleared by the developer.

In Ruby, taint checking is closely associated with the SAFE level a Ruby program is running at, 0 being the most lenient and 4 the most paranoid. Explaining the particularities of each level is beyond the scope of this article, but every level above 0 forces explicit taint checking of externally supplied data [1]. Whether an object is considered tainted can be checked through the tainted? method of the Object super-class. While the object is considered tainted, the Ruby script that holds it is forbidden from performing certain operations, depending on the SAFE level.

Ruby makes it easy to clean a tainted object. Unless the SAFE mode is set to its highest levels, any object can be cleared by invoking the untaint method on it, which takes no parameters. Ruby does not force any preliminary check before that method is invoked; any such check is left to the discretion of the developer.

Perl provides a relatively simple mechanism to enforce Taint checking, called Taint mode. That mode is entered automatically in some circumstances, such as when a Perl program runs with differing real and effective user or group IDs (a setuid or setgid script, for instance) [2]. Taint mode can also be entered explicitly by providing the -T argument on the command line when starting the Perl interpreter. Once Taint mode is entered, the interpreter stays in that mode for the remainder of the script (ibid.). When Taint mode is on, using tainted data in a way that could be dangerous triggers a fatal "Insecure dependency" error. A dangerous operation would be, for example, to write to a file whose name is held in a tainted variable or, even worse, to execute the content of the variable as a system call.

Unlike Ruby, Perl does not provide an explicit "untaint" method. Untainting is performed by matching the tainted data against a regular expression: the substrings captured by the resulting groups are considered untainted.

Our Solution

Having recently been tasked with enforcing the safety of an authentication routine in a C# program, we were disappointed to discover that no such mechanism seems to exist for .NET, and for C# in particular.

We thus propose to emulate part of the Taint checking solution in C# by using a generic Tainted container class that encapsulates a target object. That class provides methods to check the status of the target (whether it is tainted or not), to untaint it, and to taint it again. The Tainted class is shown below:

C#
public class Tainted<T>
{
    private bool _tainted;
    private T _target;

    public delegate bool IsCleanUntaintTreatmentMethod(T taintedObject);

    public Tainted(T target)
    {
        _tainted = true;
        _target = target;
    }

    public bool IsTainted
    {
        get { return _tainted; }
    }

    public bool IsClean
    {
        get { return !this.IsTainted; }
    }

    public T Target
    {
        get
        {
            if(this.IsTainted)
            {
                throw new TaintException();
            }
            
            return _target;
        }
    }

    public void Taint()
    {
        _tainted = true;
    }

    public void Untaint(IsCleanUntaintTreatmentMethod treatmentMethod)
    {
        _tainted = !treatmentMethod(_target);
    }
}

The basic idea behind this class is that whenever data is obtained from outside the program, the object holding that data (hereafter called the target) has to be encapsulated in an instance of the Tainted class. From that moment, the target is considered tainted. Access to it is only allowed through the public Target getter property. If the target has been untainted, the getter returns it. Otherwise, the target is still considered unsafe to use: to keep the code that needs the target from being exposed to the threat it represents, the getter raises a TaintException and does not return the target.

That condition is a crucial element of this solution: before it can be freely accessed, the target has to be cleared. This can seem paradoxical, as the target has to be accessed in order to be analyzed. The idea is thus to hand the tainted target only to a method designed to verify it. That method (hereafter called the untainter) is provided as a parameter to the Untaint method. The untainter's signature has to match that of the IsCleanUntaintTreatmentMethod delegate: it receives the target as a parameter and must return true if the target is safe (and must be declared untainted), or false otherwise.
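
The life cycle described above can be sketched as a small console program. This is only an illustration under stated assumptions: the article does not show TaintException, so a minimal definition is supplied here; the Tainted class is repeated in condensed form so that the sketch compiles on its own; and `TaintedDemo`, `IsAlphanumeric`, and `TargetIsBlocked` are hypothetical names introduced for the example.

```csharp
using System;

// Assumption: minimal stand-in for the TaintException used by Tainted<T>.
public class TaintException : Exception
{
    public TaintException() : base("The target is still tainted.") { }
}

// Condensed copy of the Tainted<T> class shown above.
public class Tainted<T>
{
    private bool _tainted = true;
    private readonly T _target;

    public delegate bool IsCleanUntaintTreatmentMethod(T taintedObject);

    public Tainted(T target) { _target = target; }

    public bool IsTainted { get { return _tainted; } }

    public T Target
    {
        get
        {
            if (_tainted) { throw new TaintException(); }
            return _target;
        }
    }

    public void Taint() { _tainted = true; }

    public void Untaint(IsCleanUntaintTreatmentMethod treatmentMethod)
    {
        _tainted = !treatmentMethod(_target);
    }
}

public static class TaintedDemo
{
    // Hypothetical untainter: accepts only non-empty strings made of
    // letters and digits.
    public static bool IsAlphanumeric(string s)
    {
        if (string.IsNullOrEmpty(s)) { return false; }
        foreach (char c in s)
        {
            if (!char.IsLetterOrDigit(c)) { return false; }
        }
        return true;
    }

    // Returns true when reading Target raises a TaintException.
    public static bool TargetIsBlocked(Tainted<string> wrapped)
    {
        try { string unused = wrapped.Target; return false; }
        catch (TaintException) { return true; }
    }

    public static void Main()
    {
        var username = new Tainted<string>("alice42");
        Console.WriteLine(TargetIsBlocked(username)); // True: tainted on creation

        username.Untaint(IsAlphanumeric);
        Console.WriteLine(username.Target);           // alice42

        username.Taint();                             // taint again for another consumer
        Console.WriteLine(TargetIsBlocked(username)); // True

        var hostile = new Tainted<string>("' OR '1'='1' -- '");
        hostile.Untaint(IsAlphanumeric);
        Console.WriteLine(hostile.IsTainted);         // True: the untainter rejected it
    }
}
```

Note that a failed Untaint call does not raise anything by itself; the caller either checks IsTainted afterwards or lets the Target getter raise the exception.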

The untainters then have to be developed. Our StringUntainter class contains two of them. The first method, IsFreeOfSQLInjectionUntainter, receives the target string and returns true if it does not contain any of the strings that are generally used in SQL injection attacks. We found those SQL keywords and characters on a website [3]. The second method simply returns true, and is used when there is no need to actually verify a target string, such as when the data is hashed before being used. The code is shown below:

C#
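using System.Linq;

public static class StringUntainter
{
    // Substrings commonly found in SQL injection attempts [3].
    private static readonly string[] TabBadStrings = new string[]
    { "select", "drop", ";", "--", "insert", "delete", "xp_", "%", "&",
      "'", "(", ")", "/", "\\", ":", "<", ">", "=", "[", "]", "?",
      "`", "|" };

    // Returns true if the target contains none of the blacklisted substrings.
    public static bool IsFreeOfSQLInjectionUntainter(string target)
    {
        // Invariant lowercasing keeps the comparison independent of the current culture.
        string taintedStringLower = target.ToLowerInvariant();

        return !TabBadStrings.Any( s => taintedStringLower.Contains(s) );
    }

    // Accepts any target; meant for data that is transformed (e.g., hashed) before use.
    public static bool NOPUntainter(string target)
    {
        return true;
    }
}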
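
Because any method with the right signature can serve as an untainter, a check in the spirit of Perl's regular-expression untainting can be plugged in as well. The sketch below is one possible allow-list untainter, not part of the original solution: the `WhitelistUntainter` class, the `IsWellFormedUsernameUntainter` method, and the username rules (3 to 32 ASCII letters, digits, dots, underscores, or hyphens) are all assumptions made for the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitelistUntainter
{
    // Assumption: a username is 3 to 32 characters drawn from ASCII letters,
    // digits, '.', '_' and '-'. Adjust to the application's real rules.
    private static readonly Regex UsernamePattern =
        new Regex(@"\A[A-Za-z0-9._-]{3,32}\z", RegexOptions.CultureInvariant);

    // Same shape as IsCleanUntaintTreatmentMethod for T = string:
    // receives the target and returns true only if it is deemed safe.
    public static bool IsWellFormedUsernameUntainter(string target)
    {
        return target != null && UsernamePattern.IsMatch(target);
    }

    public static void Main()
    {
        Console.WriteLine(IsWellFormedUsernameUntainter("alice_42"));          // True
        Console.WriteLine(IsWellFormedUsernameUntainter("' OR '1'='1' -- '")); // False
        Console.WriteLine(IsWellFormedUsernameUntainter("ab"));                // False: too short
    }
}
```

Describing what is allowed, rather than listing what is forbidden, means an input has to look like a username to pass, instead of merely avoiding a fixed set of known-bad substrings.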

We can now put everything together in an example. The SignIn method shown below receives two tainted strings: a username and a password. We provide the username to the User database broker, which in turn passes the IsFreeOfSQLInjectionUntainter method of the StringUntainter class as a parameter to the Untaint method of the taintedUsername parameter. The broker then checks that the object is no longer flagged as tainted; if it still is, the broker raises an SQLInjectionException, which the SignIn method knows how to handle.

Once the username is untainted, its value can be used in an SQL statement to fetch a User from the database. In our example, if the username has been found and a user fetched, we then provide the tainted password to the HashPasswordForSignIn method. Since that method applies a hashing algorithm to the tainted string, it is not susceptible to SQL injection and the string needs no further analysis. We thus use the NOPUntainter of the StringUntainter class to untaint the password, and then hash it.

C#
public class Authentication
{
    public bool SignIn(Tainted<string> taintedUsername, Tainted<string> taintedPassword)
    {
        bool authenticationSuceeded = false;

        try
        {
            User existingUser = 
		UserBroker.getInstance().GetUserByUsername(taintedUsername);
            
            if(existingUser != null)
            {
                if(existingUser.HashedPassword.Equals
                    (this.HashPasswordForSignIn(taintedPassword)))
                {
                    authenticationSuceeded = true;
                }
                else
                {
                    ...
                }
            }
            else
            {
              ...
            }
        }
        catch (SQLInjectionException e)
        {
            ...
        }
        return authenticationSuceeded;
    }

    public string HashPasswordForSignIn(Tainted<string> taintedPassword)
    {
        // Since the password string will be hashed, it poses no threat of SQL Injection.
        // We just use a "No-check" untainter and then hash the target.
        taintedPassword.Untaint( new Tainted<string>.IsCleanUntaintTreatmentMethod
        ( StringUntainter.NOPUntainter ) );
        return MyHasher.Hash(taintedPassword.Target);
    }
}

C#
public class UserBroker
{
    ...

    public User GetUserByUsername(Tainted<string> taintedUsername)
    {
        taintedUsername.Untaint( new Tainted<string>.IsCleanUntaintTreatmentMethod
		( StringUntainter.IsFreeOfSQLInjectionUntainter ) );

        if(taintedUsername.IsTainted)
        {
            throw new SQLInjectionException();
        }

        return this.GetUserByUsername(taintedUsername.Target);
    }

    private User GetUserByUsername(string username)
    {
        ...
    }
}

Points of Interest

In this article, we presented a simple solution for using Taint checking in C#. Our solution decouples the state of the object from the implementation of the cleansing algorithms, which leads to a generic Tainted class. That class can be used with any object, which makes it very reusable: it is not limited to primitives, and applies equally to complex objects (such as File objects). The delegate approach provides type-safety to the implementation of the untainting algorithm, and is an interesting example of the Strategy design pattern.

The untainting can then be done from the point of view of the code carrying the risk; in our case, the database broker "knows" what could represent a threat to itself and makes sure that the tainted username does not. Although we did not illustrate it, the string, once untainted, could have been tainted again by invoking the Taint method, should that object be used by some other sensitive routine later. It should also be noted that keeping the untainting close to the code that uses the sensitive data increases cohesion.

We could have selected a more robust approach to untainting, closer to Perl's: remove the ability to mark an object as untainted, and convert the Target property into a method that takes a delegate and returns the target object only if the delegate returns true. While that approach would have reduced the risk of handing an already untainted object to a sensitive segment of code or module, it would have decreased performance by executing the cleansing algorithm every time the target is accessed (if the code making the access is implemented naively). Moreover, it would not have provided any more protection than our solution if a NOP cleanser were used.
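
A minimal sketch of that alternative design, under stated assumptions, could look as follows. The `StrictTainted` class, its `GetTarget` method, and the `StrictTaintedDemo` helper are hypothetical names introduced here; the standard `Func<T, bool>` delegate stands in for a custom delegate type, and TaintException is given a minimal definition so the block compiles on its own. The counter makes the performance cost visible: the check runs once per access.

```csharp
using System;

// Assumption: minimal stand-in for the article's TaintException.
public class TaintException : Exception { }

// Hypothetical variant of the container: no Untaint method and no stored
// "clean" state, so the target can never be marked safe once and for all.
public class StrictTainted<T>
{
    private readonly T _target;

    public StrictTainted(T target) { _target = target; }

    // Runs the check on every call and releases the target only if it passes.
    public T GetTarget(Func<T, bool> isClean)
    {
        if (isClean == null) { throw new ArgumentNullException("isClean"); }
        if (!isClean(_target)) { throw new TaintException(); }
        return _target;
    }
}

public static class StrictTaintedDemo
{
    // Accesses the target the given number of times and reports how often
    // the check was executed.
    public static int CountChecks(int accesses)
    {
        int checks = 0;
        var wrapped = new StrictTainted<string>("alice42");
        Func<string, bool> isNotEmpty =
            s => { checks++; return !string.IsNullOrEmpty(s); };

        for (int i = 0; i < accesses; i++)
        {
            wrapped.GetTarget(isNotEmpty);
        }
        return checks;
    }

    public static void Main()
    {
        Console.WriteLine(CountChecks(3)); // 3: one check per access
    }
}
```

A caller that stores the returned value in a local variable pays for the check only once, which is why the cost mostly affects naively written access code.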

Our solution, while useful, is certainly not as secure as it would be if Taint checking were integrated natively into .NET. The developer has to know the origin of the data and manually encapsulate it in an instance of the Tainted class, instead of that operation being enforced by the language itself.

Our authentication example is also deliberately simple, to make it easier to understand. It is inherently insecure, as the username and password are sent over the network in cleartext. A more modern approach would hash or encrypt that information so that it can be transmitted safely over a public or unsecured network.

An opposite approach to Taint checking is called Trademarking [4, p.18]. Trademarking consists of explicitly whitelisting data by keeping a list of objects that have been trademarked (deemed safe) by an ApplyTrademark method [5]. Sensitive code that wishes to use a trademarked object has to make sure that the object has indeed been trademarked, through a VerifyTrademark method that returns true if the object is safe to use (ibid.). A disadvantage of that approach is that it makes it more difficult to ensure that an object that has been deemed safe is indeed safe to use in a particular context, which can be problematic if the data is trademarked in one module and used in another. Imagine data that was initially analyzed and considered safe in a module for use in an SQL database context. That data is then provided to a module that will use it as part of a system call. In that situation, even if the object has been trademarked in the first context, nothing guarantees that it will be safe in the second one. In our approach, however, the object, after having been untainted for use by the first module, can be explicitly tainted again before being provided to the second module, which eliminates the risk of false negatives.

History

  • 2011/03/14 First version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer
Canada
Paul Lessard has received an MSc in computer science and a BASc in computer science and software engineering. He is currently employed as a software developer and junior architect.
