Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles
(untagged)

MD5 Hash SQL Server Extended Stored Procedure

0.00/5 (No votes)
7 Mar 2005 1  
An extended stored procedure for SQL Server that implements an optimized MD5 hash algorithm. Very small DLL (barely 7 KB).

Introduction

This is an extended stored procedure for Microsoft SQL Server 2000 that implements an optimized MD5 hash algorithm. It is intended to work much like the MySQL MD5() function. The algorithm was taken from here. I only added the md5_string() function. The DLL should work for older versions of SQL Server too, although I have not tested that. The source was compiled and tested on Microsoft Visual C++ 6.0 and .NET 2003.

Installation

  1. Extract or build the DLL file xp_md5.dll and place it in C:\Program Files\Microsoft SQL Server\MSSQL\Binn (or wherever appropriate). A precompiled DLL is in the Release directory of the source distribution.
  2. Create an Extended Stored Procedure called xp_md5 in the "master" database. Right-click "Extended Stored Procedures" under the master database in the Server Manager and click "New Extended Stored Procedure...". Enter xp_md5 for the "Name" and for the "Path", enter the full path to xp_md5.dll.

    Note: If you want to add it manually:

    USE master;
    EXEC sp_addextendedproc 'xp_md5', 'xp_md5.dll'
  3. Create a user-defined function for each database in which you plan to use the MD5 procedure. Right-click "User Defined Functions" under the appropriate database(s) and click "New User Defined Function...". Enter the following:
    CREATE FUNCTION [dbo].[fn_md5] (@data TEXT) 
    RETURNS CHAR(32) AS
    BEGIN
      DECLARE @hash CHAR(32)
      EXEC master.dbo.xp_md5 @data, -1, @hash OUTPUT
      RETURN @hash
    END
  4. (Optional) Create other user-defined functions to let you specify the data length (for substrings, BINARY and other fixed-width types). In this particular function, we take an IMAGE for input and an optional LENGTH. A negative LENGTH value causes the DLL to try to compute the length of the input automatically (this is the default):
    CREATE FUNCTION [dbo].[fn_md5x] (@data IMAGE, @len INT = -1) 
    RETURNS CHAR(32) AS
    BEGIN
      DECLARE @hash CHAR(32)
      EXEC master.dbo.xp_md5 @data, @len, @hash OUTPUT
      RETURN @hash
    END

Usage:

FN_MD5

The User-Defined Functions can be used as follows:

-- Sample SQL statement:

-- fn_md5() tries to automatically calculate the length of the input string

SELECT dbo.fn_md5('Hello world!');

-- fn_md5x() takes an optional length arg for substrings, fixed-width types, etc.

SELECT dbo.fn_md5x('Hello world!', 12);

Output for both statements:

86fb269d190d2c85f6e0468ceca42a20

XP_MD5: EXEC xp_md5 <@data> [@length = -1] [@hash OUTPUT]

To use the Extended Stored Procedure directly:

-- Sample command:

EXEC master.dbo.xp_md5 'Hello world!'

Output:

86fb269d190d2c85f6e0468ceca42a20

Or if you want the result saved to a variable instead:

DECLARE @hash CHAR(32)
EXEC master.dbo.xp_md5 'Hello world!', -1, @hash OUTPUT
PRINT @hash

Output:

86fb269d190d2c85f6e0468ceca42a20

Examples

-- These are just examples for use with the given functions above.

-- Feel free to create your own functions that take the input type you want.

-- Use CAST() with caution, as it may truncate and have other unintended consequences.


-- For TEXT, NTEXT, VARCHAR:

SELECT dbo.fn_md5(data) FROM table;

-- For VARBINARY:

SELECT dbo.fn_md5(CAST(data AS VARCHAR(8000))) FROM table;

-- For IMAGE:

SELECT dbo.fn_md5x(data, DEFAULT) FROM table;

-- For CHAR: (we can use LEN() on this fixed-width type to trim the padding)

SELECT dbo.fn_md5x(data, LEN(data)) FROM table;

-- For NCHAR: (we can use LEN() on this fixed-width type to trim the padding)

SELECT dbo.fn_md5x(CAST(data AS CHAR(4000)), LEN(data)) FROM table;

-- For BINARY:

SELECT dbo.fn_md5x(data, 12) FROM table;

-- FOR SQL_VARIANT:

SELECT dbo.fn_md5x(CAST(data AS VARCHAR(8000)), DATALENGTH(data)) FROM table;

Miscellaneous

For purposes of speed, I did not include any real input data verification (e.g., type checking, data length checking, etc.). I opted to exclude that in order to maximize speed, as I originally wrote this for use in an application that inserts millions of rows at a time. I also know that I'm always calling the procedure properly. If you want to make it more robust - like if you do not know what kind of data will be passed to the function - then I highly recommend you add those safeguards.

One last thing, I added the linker option /OPT:NOWIN98 to minimize the binary size. This may cause a performance hit on non-NT systems (e.g., Win95, Win98, etc.). If you're using the DLL on such a system, I would recompile it without that linker option.

Cheers.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here