Recently, I created a web application to track jobs for Civic Screens. The application called for a "Job Number" that wasn't actually a number. The Job Number was a number with the possibility of being prefixed with a letter. For example, the following were valid job numbers: J123, W34, 1, 5000 etc. After the application was delivered, they asked for an enhancement to be able to search by a range of job numbers, e.g., all job numbers between J1 - J500. Seeing as the application was already delivered, with the
JobNumber column set as an
nvarchar(20), I wanted to come up with a quick way of implementing this. Therefore, splitting the job number up into an alpha and a numeric column was out of the question.
After doing some quick Google searches to see what, if any, solutions have been implemented previously, I came across a CodeProject article written by Trent Tobler. It was a part solution to my problem. He had already implemented a method for sorting, I just needed to incorporate it into SQL Server as efficiently as possible. Admittedly, the final solution is not the best and not the most efficient way of implementing this; at a guess, I suggested the best way would be to separate the job number into an alpha and a numeric column. Due to time constraints and wanting to make minimal amount of changes as possible, I decided to go with Alpha Numeric Sort and use SQL Server's CLR feature.
I will skip the part on how the code comes up with a sortable string, because in the most part, the code has not been altered from Trent Tobler's article. The only changes made were the extension methods being changed to simple static methods. I have also made the fields read only, to make it easier to integrate into SQL Server. SQL Server does not permit static fields that are not read only when the assembly is added as a safe assembly. The last change was to remove the method overloading, as once again, SQL Server did not like the method to be used as a SQL function to be overloaded.
Setting up SQL Server 2008
Setting up SQL Server to use the compiled assembly is a relatively simple task, and Microsoft gives plenty of guidance on how to do this. I basically followed the instructions and links on 'Creating CLR Functions' to work out what to do. I used the instructions to configure SQL Server 2008 Express edition, but it should work on any version of SQL Server greater than 2005, Express or full. To summarize the article and the links, the basics are:
- Configure SQL Server 2008 to enable CLR integration.
sp_configure 'show advanced options', 1;
sp_configure 'clr enabled', 1;
- Create the SQL Server assembly.
Create ASSEMBLY AlphaNumeric
WITH PERMISSION_SET = SAFE;
- Create the SQL Server function.
CREATE FUNCTION [dbo].GetAlphaNumericOrderToken (@str nvarchar(40))
AS EXTERNAL NAME [AlphaNumeric].[AlphaNumericSort].[GetAlphaNumericOrderToken];
SQL Server CLR Integration
I'll finish the article with some handy things to know about CLR integration with SQL Server.
The first thing to note is that SQL Server does not link to a .NET assembly when the assembly is created in SQL Server. Instead, it takes some sort of copy of the assembly. Therefore, any changes made to the .NET assembly will not be reflected in the database until the assembly is added to the database again. This can be done by:
Alter ASSEMBLY AlphaNumeric
WITH PERMISSION_SET = SAFE;
There are some limitations to this though. The signatures of the referenced methods can not change. If they do, and this is where it can be painful, all objects (functions, Stored Procedures, types, etc.) created that use the .NET assembly must be dropped. The assembly must then be dropped, and finally the assembly recreated, then the database objects recreated. I did read somewhere that SQL Compare can automate this for you, but I have not used the product to know for sure that it does, and whether it is an effective way of accomplishing this. The minimal use I was using of the CLR feature in SQL Server did not warrant investigating this too much as my pain was more like a pin prick.
Improving the Performance
A major concern I had was the performance implication of searching and sorting by Job Number, as in actual fact I wanted to search and sort by the Job Number sort token. After a little research, I found that I could make a persisted computed column that could also have an index created on that column. The computed part obviously means that the column is calculated using the value of another column. The persisted part means that the value will be persisted in the data storage, rather than being calculated (virtual) on each access. For an index to be created on a computed column, the column must also be persisted. One caveat of this is that the SQL function must be deterministic. That is, for the same input and database state, the same output will always be returned. In first understanding deterministic functions, I found it easier to think of an example that was not deterministic. E.g., the
NOW() SQL function. The return value of the
NOW() function is not deterministic because it will not return the same result every time it is called. Being that the same sort token will be returned for the same Job Number, the
GetAlphaNumericOrderToken is deterministic. CLR SQL functions are required to have the
[SqlFunction(IsDeterministic = true)] attribute applied to the function in order for SQL Server to treat the function as deterministic.
ALTER TABLE dbo.Job ADD
JobNumberSortToken AS ([dbo].[GetAlphaNumericOrderToken]([JobNumber])) PERSISTED
CREATE NONCLUSTERED INDEX IX_JobNumber ON dbo.Job
) WITH( STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]