|
<context>SQL Server 2008 R2</context>
Someone added a function to my database! Well OK, it's a team effort so that's fine, but it's incorrect and not well-written.
What I think the purpose is is to take a column name in PascalCase and make it look more presentable by adding a SPACE before each capital (uppercase) letter. That's what I understand from the name and it's a reasonable thing to do and it does that.
Buuut... what it really does is add a SPACE before each non-lowercase letter (e.g. digits, symbols, whitespace).
Aside from producing unexpected output it's also hard to read and difficult to maintain -- I opted to rewrite it rather than try to fix it. Here's the original:
CREATE FUNCTION [dbo].[SpaceBeforeCap]
(
@str nvarchar(max)
)
returns nvarchar(max)
as
begin
declare @i int, @j int
declare @returnval nvarchar(max)
set @returnval = ''
select @i = 1, @j = len(@str)
declare @w nvarchar(max)
while @i <= @j
begin
if substring(@str,@i,1) = UPPER(substring(@str,@i,1)) collate Latin1_General_CS_AS
begin
if @w is not null
set @returnval = @returnval + ' ' + @w
set @w = substring(@str,@i,1)
end
else
set @w = @w + substring(@str,@i,1)
set @i = @i + 1
end
if @w is not null
set @returnval = @returnval + ' ' + @w
return ltrim(@returnval)
end
And here's mine (I added the fn wart because that's the standard I inherited ):
CREATE FUNCTION [dbo].[fnSpaceBeforeCap]
( @str NVARCHAR(MAX)
)
RETURNS NVARCHAR(MAX)
AS
BEGIN
DECLARE @offset INTEGER
SET @offset = PATINDEX ( '%[^ ][A-Z]%' , @str COLLATE Latin1_General_BIN )
WHILE ( @offset > 0 )
BEGIN
SET @str = STUFF ( @str , @offset + 1 , 0 , ' ' )
SET @offset = PATINDEX ( '%[^ ][A-Z]%' , @str COLLATE Latin1_General_BIN )
END
RETURN @str
END
It uses PATINDEX to find a non-SPACE followed by an uppercase letter then STUFFs a SPACE between them.
I'd like to say that mine is more efficient, but I don't feel like doing any testing. What I will point out is that the original uses SUBSTRING (twice!) to test each character and SUBSTRING to form the new value, whereas mine has that kind of thing hidden in black boxes. What I don't like about mine is that PATINDEX doesn't allow a parameter to tell it where to start so it always starts over from the beginning -- see also Schlemiel the painter's Algorithm[^].
|
|
|
|
|
it's ok to insert a spece on the begining of the string? it seems that both are doing that...
I'm brazilian and english (well, human languages in general) aren't my best skill, so, sorry by my english. (if you want we can speak in C# or VB.Net =p)
|
|
|
|
|
No, the original inserts one there then uses LTRIM to remove it. Which means it will also remove any pre-existing SPACEs, which may not be as intended.
Mine will neither add nor remove a leading SPACE.
|
|
|
|
|
now i get why your's doesn't insert a space there, i'm not that good with sql =p
I'm brazilian and english (well, human languages in general) aren't my best skill, so, sorry by my english. (if you want we can speak in C# or VB.Net =p)
|
|
|
|
|
PIEBALDconsult wrote: PATINDEX doesn't allow a parameter to tell it where to start so it always starts over from the beginning
Not only that, but the repeated use of stuff creates the same kind of inefficiency as you'd see in C# with the repeated use of "+" to concatenate strings. There is no StringBuilder in SQL, but there are workarounds. If you really want performance, this is what I'd do (describing rather than writing out, because I'm lazy)...
Create a while loop to visit each character in the string. Perform PATINDEX only on single characters (or two-character sequences, if you like). Output the strings to a local table variable then combine them using the FOR XML PATH technique (I haven't tried it myself, but it seems like it'd avoid repeat concatenations).
|
|
|
|
|
AspDotNetDev wrote: stuff creates the same kind of inefficiency as you'd see in C# with the repeated
use of "+" to concatenate strings
It is to be hoped that the STUFF function isn't implemented in SQL.
My main concern is correctitude; after that, readability and maintainability; efficiency isn't a primary concern -- the source values will generally be short (column names) with few (if any) STUFFs.
AspDotNetDev wrote: visit each character in the string
You mean SUBSTRING on each character? You think that's better?
AspDotNetDev wrote: a local table variable
I'd rather keep instantiating/destructing strings than tables.
Were I really concerned I'd write a CLR function instead.
|
|
|
|
|
PIEBALDconsult wrote: You mean SUBSTRING on each character? You think that's better?
PIEBALDconsult wrote: the source values will generally be short (column names) with few (if any) STUFFs
It would be if you were dealing with lots of string operations, but since you won't be then you are right to KISS.
|
|
|
|