Integer Arrays Using T-SQL

Chris_Martinez

4.30/5 (30 votes)

May 9, 2006

6 min read

140677

974

How to pass an array of integers to SQL Server

Preface

I've been using CodeProject as a research source for a number of years now. I figured it was about time that I gave back to the development community with a contribution of my own. I haven't seen anyone else take this approach to passing arrays before, so I hope my first article is not in vain.

Introduction

Anyone that has worked with SQL Server for a while undoubtedly has come across a scenario where she/he needed a way to pass an array as a parameter to a stored procedure. One of the most common of these scenarios involves a stored procedure that must be executed passing numerous primary keys, which are typically integer-based. Currently, it is not possible in T-SQL to pass a traditional array from an external programming environment as a parameter to a stored procedure or to a user-defined function. One approach to overcome this is to build a SQL statement on the client-side with all of the literal values supplied. Unfortunately, this type of approach is often not scalable and prone to injection attacks. I’ve also seen other approaches in T-SQL, all of which involved creating a delimited string and splitting it up on the server side. Although that does work, there is a far more reusable and efficient way to perform this task using a little T-SQL magic provided that the parameters are integer-based.

Background

An array is nothing more than a block of contiguous allocated memory. In the case of an array of strings, it is actually an array of arrays, where each string is an array of characters. Unfortunately there is no way to represent that type of data structure in T-SQL without using the dynamic SQL or delimited string approach. However, what about arrays with fixed-size memory elements such as integers? As it turns out, it is possible to represent this type of data in raw binary form using any of the following data types: BINARY, VARBINARY, and IMAGE. Since the IMAGE data type has a 2GB maximum, and the largest segment extracted from this type of variable in this case is less than 8000 bytes, it makes the most appropriate choice.

Now that you know that an array could be represented in T-SQL, you’re probably wondering how you could access it or make use of it. As it turns out, the SUBSTRING function is capable of working with binary data. Since slicing up a string is really nothing more than a little pointer manipulation, this makes perfect sense. We are able to use the DATALENGTH function to determine the actual length of data passed and extract the parts of the binary into appropriately sized chunks where the pieces can be used in their integer form (TINYINT, SMALLINT, INT, and BIGINT respectively) without performing any data type casting. Of course, the size of the integer ultimately determines how many elements the array can actually have, but even a BIGINT array could have somewhere in the neighborhood of 256,000 elements (2GB with 8 bytes per element), which should be more than enough.

A single stored procedure parameter can be easily separated into multiple pieces, but how can we use it like an array? The answer is a table-valued, user-defined function. Essentially, the binary parameter is passed to the function, divided into its respective pieces, and inserted into a memory table. Admittedly, this is very similar to the conventional delimited string approach except that it is much faster because no character searching or type casting takes place during the separation of values. The table that is returned contains two columns, Index and Value, which function just like an array. You may find that there is no need for an Index column and you could remove it to further improve performance. I assume that there may be a case in which an element may need to be accessed by its ordinal position just like in a regular array. Without the Index column, there would not be a way to do this unless you use a cursor.

To simplify the solution, I’ve created a single, resuable user-defined function called udfIntegerArray, which accepts the array in binary form and the size of the array elements, in bytes (which must be 1, 2, 4, or 8 respectively).

Using the Code

The following demonstrates how to use the function in a stored procedure:

CREATE PROCEDURE dbo.uspProductsSearch
(
 @ProductIDs IMAGE
)
AS
-- find all products in Northwind that match the specified list of identifiers
SELECT
  *
FROM
  Products
WHERE
  ProductID IN ( SELECT Value FROM dbo.udfIntegerArray( @ProductIDs, 4 ) );

You can simulate the parameter value using a literal binary value. The following demonstrates how to represent 1, 3, and 15 in literal binary form, where Products.ProductID is a 32-bit integer:

-- it is not possible to declare a variable of type IMAGE, but you can pass
-- a VARBINARY variable that will implicitly cast to the IMAGE data type
DECLARE @ProductIDs VARBINARY(8000);
SET @ProductIDs = 0x00000001000000030000000F;

Points of Interest

Interestingly, I discovered that T-SQL supports equality of all intrinsic integer data types against their binary equivalents, which from a low-level perspective is completely palatable. This finding is important because that means only one user-defined function is needed to support all of the integer data types and explicit type casting is not required.

Client-Side Implementation

I've demonstrated how an array can be implemented on the database side, but how can an array be assigned to a parameter in binary form for a stored procedure? Surprisingly it isn’t very difficult to do in .NET, but to simplify the task I’ve created the BinaryIntegerArray and BinaryIntegerArrayFactory classes. The BinaryIntegerArray class is a generic class that is a List<T> of integers, where T is System.Int16, System.UInt16, System.Int32, System.UInt32, System.Int64, or System.UInt64. Since a type constraint cannot be applied in a way that would restrict the usable types to this list, the constructor is marked as internal and the BinaryIntegerArrayFactory provides methods that will return an appropriate implementation of BinaryIntegerArray.

The BinaryIntegerArray class has all of the functionality of List<T>, but adds the ToBinary() and ToHexadecimal() methods that return the contents of the list in the respective binary or hexadecimal format.

Keep in mind that although the BinaryIntegerArray class supports unsigned integers, SQL Server does not. Using any of these implementations to supply an array to SQL Server can result in exceptions or unintended results.

Putting It All Together

The following is a snippet from the example code that illustrates how to create an array of integers and assign the binary equivalent value to the parameter of a stored procedure:

// create a binary array of 32-bit integers
string cs = "Server=localhost;Database=Northwind;Integrated Security=SSPI";
BinaryIntegerArray<int> ids = BinaryIntegerArrayFactory.CreateInt32();

// add integers
ids.Add( 1 );
ids.Add( 3 );
ids.Add( 15 );

// create a connection
using ( SqlConnection connection = new SqlConnection( cs ) )
{
    // create a command
    using ( SqlCommand command = new SqlCommand( "dbo.uspProductsSearch", connection ) )
    {
        // set command type
        command.CommandType = CommandType.StoredProcedure;

        // add parameters
        SqlParameter param = command.Parameters.Add( "@ProductIDs", SqlDbType.Binary );
        param.Value = ids.ToBinary();

        // open connection and execute command
        connection.Open();
        SqlDataReader reader = command.ExecuteReader( CommandBehavior.CloseConnection );

        while ( reader.Read() )
        {
            // TODO: do something with the results
        }

        // close the reader
        reader.Close();
    }
}

Conclusion

Unfortunately, traditional arrays in T-SQL will be a long time coming, if ever. Hopefully, I have clearly demonstrated that although traditional methodologies cannot be used, it is possible to pass an array in binary form if the elements are in fixed-memory segments.

Remarks

Although it is extra work, I do feel it is important that multi-language examples are key to reaching a broader audience. Therefore, the source code and examples provided have both C# and VB.NET implementations.

Revisions

05/30/2006
Replaced implementation of VARBINARY(8000) with IMAGE. Only 8000 bytes can be extracted from the IMAGE data type at a time without using the TEXTPTR functions. Since none of the elements are remotely near this size, the IMAGE data type provides a more flexible implementation. The corresponding source SQL scripts were updated.