I have written T-SQL code to update XML records in a potentially very large table (let's say millions of rows) with each row having an XML column. The code changes the values of all nodes in the XML that match a user-defined XPath expression. So for example, assume the XML is:
<myrec>
<amt>1</amt>
<main>
<amt bonus="11">2</amt>
<supp>
<amtort>referrer</amtort>
<amt>3</amt>
<rows>
<row>
<amt bonuses="999">4</amt>
</row>
<row>
<amt>5</amt>
</row>
<row>
<someotherlevel>
<amt bonus="22">6</amt>
<keep>optional</keep>
</someotherlevel>
</row>
</rows>
</supp>
</main>
</myrec>
And assume the user-defined XPath is //amt, then the code (as an example) would add 100 to each data element of <amt>. The XML would then become:
<myrec>
<amt>101</amt>
<main>
<amt bonus="11">102</amt>
<supp>
<amtort>referrer</amtort>
<amt>103</amt>
<rows>
<row>
<amt bonuses="999">104</amt>
</row>
<row>
<amt>105</amt>
</row>
<row>
<someotherlevel>
<amt bonus="22">106</amt>
<keep>optional</keep>
</someotherlevel>
</row>
</rows>
</supp>
</main>
</myrec>
This would get done for all <amt> nodes in the XML as per the XPath //amt.
Note that the code to works for any XML document and any (user entered) XPath. So this is general purpose code.
My code works, but I have a feeling there is a better way to do this since I am not getting the performance I was hoping for (currently takes about 266 seconds to update 1 million rows (single XML node updated per row) on an Azure B2s instance (2 VCPU's, 4GB RAM, SSD).
The code I've written uses iteration and the .modify() function on and XML type variable. I've also resorted to using dynamic SQL so I can build arbitrary XQuery strings (as mentioned, this is general purpose code for any XML / XPath).
Can anyone suggest a better / more performant approach? I'm new to XML in MSSQL, so still feeling my way through it.
My code is below:
What I have tried:
CREATE PROCEDURE dbo.ModifyXML
@var_xpath VARCHAR(max),
@var_rec XML,
@new_xml XML OUTPUT
AS
SET NOCOUNT ON
DECLARE @sql_matches NVARCHAR(max);
DECLARE @sql_modify NVARCHAR(max);
DECLARE @sql_get_new NVARCHAR(max);
DECLARE @new_value INTEGER;
DECLARE @i INTEGER;
SET @sql_modify = N'';
SET @new_xml = @var_rec
SET @i = 1;
SET @sql_get_new = N'SET @new_value = dbo.my_transform(@xml.value(N''(' + @var_xpath + '/text())[1]'', ''INTEGER''));';
EXECUTE sp_executesql @sql_get_new, N'@xml XML, @new_value INTEGER OUTPUT', @xml = @var_rec, @new_value = @new_value OUTPUT;
WHILE @new_value is not NULL
BEGIN
SET @sql_modify = @sql_modify + N'SET @new_xml.modify(''replace value of (' + @var_xpath + '/text())[' + CAST(@i AS NVARCHAR(max)) +'] with "' + CAST(@new_value AS NVARCHAR(max)) + '"'');';
SET @i = @i + 1;
SET @sql_get_new = N'SET @new_value = dbo.my_transform(@xml.value(N''(' + @var_xpath + '/text())[' + CAST(@i AS NVARCHAR(max)) + ']'', ''INTEGER''));';
EXECUTE sp_executesql @sql_get_new, N'@xml XML, @new_value INTEGER OUTPUT', @xml = @var_rec, @new_value = @new_value OUTPUT;
END
IF @sql_modify <> '' EXECUTE sp_executesql @sql_modify, N'@new_xml XML OUTPUT', @new_xml = @new_xml OUTPUT;
GO