MD5 Hash SQL Server Extended Stored Procedure






4.93/5 (89 votes)
Jan 27, 2004
2 min read

912183

11377
An extended stored procedure for SQL Server that implements an optimized MD5 hash algorithm. Very small DLL (barely 7 KB).
Introduction
This is an extended stored procedure for Microsoft SQL Server 2000 that implements an optimized MD5 hash algorithm. It is intended to work much like the MySQL MD5()
function. The algorithm was taken from here. I only added the md5_string()
function. The DLL should work for older versions of SQL Server too, although I have not tested that. The source was compiled and tested on Microsoft Visual C++ 6.0 and .NET 2003.
Installation
- Extract or build the DLL file xp_md5.dll and place it in C:\Program Files\Microsoft SQL Server\MSSQL\Binn (or wherever appropriate). A precompiled DLL is in the Release directory of the source distribution.
- Create an Extended Stored Procedure called
xp_md5
in the "master" database. Right-click "Extended Stored Procedures" under the master database in the Server Manager and click "New Extended Stored Procedure...". Enter xp_md5 for the "Name" and for the "Path", enter the full path to xp_md5.dll.Note: If you want to add it manually:
USE master; EXEC sp_addextendedproc 'xp_md5', 'xp_md5.dll'
- Create a user-defined function for each database in which you plan to use the MD5 procedure. Right-click "User Defined Functions" under the appropriate database(s) and click "New User Defined Function...". Enter the following:
CREATE FUNCTION [dbo].[fn_md5] (@data TEXT) RETURNS CHAR(32) AS BEGIN DECLARE @hash CHAR(32) EXEC master.dbo.xp_md5 @data, -1, @hash OUTPUT RETURN @hash END
- (Optional) Create other user-defined functions to let you specify the data length (for substrings,
BINARY
and other fixed-width types). In this particular function, we take anIMAGE
for input and an optionalLENGTH
. A negativeLENGTH
value causes the DLL to try to compute the length of the input automatically (this is the default):CREATE FUNCTION [dbo].[fn_md5x] (@data IMAGE, @len INT = -1) RETURNS CHAR(32) AS BEGIN DECLARE @hash CHAR(32) EXEC master.dbo.xp_md5 @data, @len, @hash OUTPUT RETURN @hash END
Usage:
FN_MD5
The User-Defined Functions can be used as follows:
-- Sample SQL statement:
-- fn_md5() tries to automatically calculate the length of the input string
SELECT dbo.fn_md5('Hello world!');
-- fn_md5x() takes an optional length arg for substrings, fixed-width types, etc.
SELECT dbo.fn_md5x('Hello world!', 12);
Output for both statements:
86fb269d190d2c85f6e0468ceca42a20
XP_MD5: EXEC xp_md5 <@data> [@length = -1] [@hash OUTPUT]
To use the Extended Stored Procedure directly:
-- Sample command:
EXEC master.dbo.xp_md5 'Hello world!'
Output:
86fb269d190d2c85f6e0468ceca42a20
Or if you want the result saved to a variable instead:
DECLARE @hash CHAR(32)
EXEC master.dbo.xp_md5 'Hello world!', -1, @hash OUTPUT
PRINT @hash
Output:
86fb269d190d2c85f6e0468ceca42a20
Examples
-- These are just examples for use with the given functions above.
-- Feel free to create your own functions that take the input type you want.
-- Use CAST() with caution, as it may truncate and have other unintended consequences.
-- For TEXT, NTEXT, VARCHAR:
SELECT dbo.fn_md5(data) FROM table;
-- For VARBINARY:
SELECT dbo.fn_md5(CAST(data AS VARCHAR(8000))) FROM table;
-- For IMAGE:
SELECT dbo.fn_md5x(data, DEFAULT) FROM table;
-- For CHAR: (we can use LEN() on this fixed-width type to trim the padding)
SELECT dbo.fn_md5x(data, LEN(data)) FROM table;
-- For NCHAR: (we can use LEN() on this fixed-width type to trim the padding)
SELECT dbo.fn_md5x(CAST(data AS CHAR(4000)), LEN(data)) FROM table;
-- For BINARY:
SELECT dbo.fn_md5x(data, 12) FROM table;
-- FOR SQL_VARIANT:
SELECT dbo.fn_md5x(CAST(data AS VARCHAR(8000)), DATALENGTH(data)) FROM table;
Miscellaneous
For purposes of speed, I did not include any real input data verification (e.g., type checking, data length checking, etc.). I opted to exclude that in order to maximize speed, as I originally wrote this for use in an application that inserts millions of rows at a time. I also know that I'm always calling the procedure properly. If you want to make it more robust - like if you do not know what kind of data will be passed to the function - then I highly recommend you add those safeguards.
One last thing, I added the linker option /OPT:NOWIN98 to minimize the binary size. This may cause a performance hit on non-NT systems (e.g., Win95, Win98, etc.). If you're using the DLL on such a system, I would recompile it without that linker option.
Cheers.