Those of us who work with production (live) databases know the pain of firing an UPDATE query at a huge table. A single large UPDATE is problematic because it leads to the following:
- Lock escalation: the entire table will be locked by your application, so other users and applications will not be able to run DML (or DDL) against this table.
- TempDB can grow huge.
- An update lock on the table stops other users from making changes to the table.
I have written the snippet below to help fight the issues above.
-- Step 1 : Declare the variables
use DBNAME
Declare @counter int
Declare @RowsEffected int
Declare @RowsCnt int
Declare @CodeId int
Declare @OldCode int -- the existing code value to be replaced; assign it before the loop
Declare @Err int
SELECT @COUNTER = 1
SELECT @RowsEffected = 0
/*
Step 2 : Get the new Code value that will replace the existing one. In my case I am reading it from a table, but it can also be hard-coded.
*/
SELECT @CodeID = CodeID FROM CODE WHERE XXXX ='YYYY'
/*
Step 3 : Start the while loop. If we have 100,000 records and each pass updates 5,000 of them, the loop makes 100,000 / 5,000 = 20 updating passes (plus a final pass that finds nothing left to update).
*/
WHILE ( @COUNTER > 0)
BEGIN
    SET ROWCOUNT 5000
    -- Note : SET ROWCOUNT 5000 limits each UPDATE to at most 5000 rows
    /* UPDATING TABLE ("Table" is a placeholder for the real table name) */
    UPDATE Table
    SET CodeID = @CodeID
    WHERE CodeID = @OldCode
    SELECT @RowsCnt = @@ROWCOUNT , @Err = @@error
    IF @Err <> 0
    BEGIN
        Print 'Problem updating the records'
        BREAK
    END
    IF @RowsCnt = 0
        SELECT @COUNTER = 0
    ELSE
        /* Add this batch to the running total */
        SELECT @RowsEffected = @RowsEffected + @RowsCnt
    PRINT 'The total number of rows affected : ' + convert(varchar(12), @RowsEffected)
    /* Pause the loop for 10 seconds between batches, giving other sessions time to work on the table */
    WAITFOR DELAY '00:00:10'
END
--Step 4 : Check whether all the records were updated.
IF EXISTS ( SELECT CodeID
            FROM Table WITH (NOLOCK)
            WHERE CodeID = @OldCode
          )
BEGIN
    PRINT ('Not all the records were updated , there is some problem , contact the devs')
END
ELSE
BEGIN
    PRINT ('All the records are updated SUCCESSFULLY !!!!')
END
/* ------Set rowcount to default ----*/
SET ROWCOUNT 0
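SET ROWCOUNT still limits UPDATE statements, but Microsoft's documentation marks its effect on INSERT, UPDATE and DELETE as deprecated and recommends the TOP syntax instead. Here is a minimal sketch of the same loop written with UPDATE TOP, assuming SQL Server 2005 or later. The table name dbo.BigTable, the code values 1 and 2, and the batch size are hypothetical placeholders, not part of the original snippet.

```sql
-- Sketch only: dbo.BigTable, the code values and the batch size are placeholders.
DECLARE @BatchSize int, @OldCode int, @NewCode int, @Rows int, @Total int;
SET @BatchSize = 5000;
SET @OldCode   = 1;    -- hypothetical: the code being replaced
SET @NewCode   = 2;    -- hypothetical: the replacement (must differ from @OldCode,
                       -- otherwise the loop never runs out of matching rows)
SET @Rows      = 1;
SET @Total     = 0;

WHILE (@Rows > 0)
BEGIN
    -- Outside an explicit transaction, each UPDATE is its own autocommit
    -- transaction, so its locks are held only for the duration of one batch.
    UPDATE TOP (@BatchSize) dbo.BigTable
    SET    CodeID = @NewCode
    WHERE  CodeID = @OldCode;

    SET @Rows  = @@ROWCOUNT;      -- 0 once no matching rows remain
    SET @Total = @Total + @Rows;
END

PRINT 'Total rows updated: ' + CONVERT(varchar(12), @Total);
```

The batch size is a tuning knob rather than a fixed rule. SQL Server considers escalating to a table lock once a single statement holds roughly 5,000 locks on one table, so batches somewhat below that figure are a common starting point.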
Clumsy Implementation | ncarey | 11:26 1 Feb '06
For starters, the WAITFOR statement is completely unnecessary. Each statement in a batch executes synchronously. Under SQL Server's default "autocommit" mode (SET IMPLICIT_TRANSACTIONS OFF), each statement is atomic. Once the UPDATE statement has completed, the transaction has been committed and all locks have been released.
A simpler implementation is something like the stored procedure below. The [potential] drawbacks to this technique are twofold:
1. An additional UPDATE is performed that will update no rows. That is, if your batch size is 1000 and you have 5000 rows to update, 6 UPDATE statements will be executed: 5 will affect 1000 rows each and the 6th will affect zero (0) rows. Depending on the size of your table, the semantics of your UPDATE and its execution plan, and other work going on in SQL Server, the additional work entailed by "doing it in chunks" might actually result in worse performance than doing it in a single large UPDATE. Consult your DBA (who probably has a better handle on system load and lock contention than you do) to see if this sort of optimization is necessary.
2. This sort of optimization breaks the ACID (Atomicity, Consistency, Isolation and Durability) guarantees of a relational database, atomicity in particular. You may intend to alter 1000 rows of a table. If you do it in a single transaction, the change will either occur or not occur. If you break it up into multiple transactions by chunking it like this, the state of the database may change mid-stream: other users may alter, delete or add data before you get to it, and so on. Whether or not this is an issue depends, of course, on a lot of different (and external) factors that can't be addressed in a general way: the exact semantics of your application and your data model, for starters.
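For contrast with point 2, here is a minimal sketch of the all-or-nothing form, in which the whole change commits or rolls back as one unit. The values assigned to @old_value and @new_value are hypothetical; my_really_large_table and value_column are the same placeholder names used in the stored procedure further down.

```sql
-- Sketch only: the values below are hypothetical placeholders.
declare @old_value int , @new_value int
set @old_value = 1
set @new_value = 2

set xact_abort on      -- any run-time error rolls back the whole transaction

begin transaction

  update my_really_large_table
  set    value_column = @new_value
  where  value_column = @old_value

commit transaction     -- every matching row changes, or none does
```

The price of this guarantee is that the locks taken by the UPDATE are held until the commit, which is exactly the contention the batched approach tries to avoid.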
Anyway, here's a cleaner implementation. Note that the stored procedure could also, for instance, retrieve the 'batch size' from a configuration table (rather than receiving it as a parameter). This would allow the DBAs to tune the batch size as needed without requiring code changes or stored procedure changes.
------------------------------------------------------------------------
-- dbo.sp_MassiveUpdate
------------------------------------------------------------------------
--
-- A stored procedure to do a big update against a table in small chunks
-- so as to minimize lock contention and transaction length.
--
------------------------------------------------------------------------
create procedure dbo.sp_MassiveUpdate

  @old_value  int ,        -- old value
  @new_value  int ,        -- new value
  @batch_size int = null   -- optional. If omitted or specified as 0 or NULL,
                           -- the update will be done as a single, large transaction.

as

  --
  -- local variables
  --
  declare
    @msg            varchar(4000) ,
    @rows_processed int

  --
  -- standard SET options
  --
  set nocount on                  -- eliminates extraneous network round-trips.
  set ansi_nulls on               -- expect standard ANSI null behaviour
  set concat_null_yields_null on  -- expect standard ANSI null behaviour
  set xact_abort on               -- rollback and bail on error
  set implicit_transactions off   -- ensure we're running under the default (autocommit) mode

  --
  -- put optional argument into canonical form
  --
  set @batch_size = case coalesce(@batch_size,0)
                    when 0 then 0
                    else @batch_size
                    end

  --
  -- if small batches were asked for and we're already in a transaction, hurl
  --
  if ( @batch_size <> 0 and @@TRANCOUNT > 0 )
  begin
    set @msg = 'ERROR: '
             + 'Can''t Update in Batches. '
             + 'This stored procedure is running within an uncommitted transaction.'
    RAISERROR( @msg , 16 , 1 )
    return -1
  end

  --
  -- do the massive update in batches, if requested. If @batch_size is 0,
  -- then the first batch will be the only batch ( SET ROWCOUNT 0 says that
  -- there are no limits on the number of rows to process ).
  --
  set @rows_processed = -1
  while ( @rows_processed <> 0 )
  begin

    set rowcount @batch_size          -- set the batch size

    update my_really_large_table      -- perform an update
    set    value_column = @new_value
    where  value_column = @old_value

    set @rows_processed = @@ROWCOUNT  -- get the number of rows affected

    set rowcount 0                    -- restore the default batch size

  end

  --
  -- return to the caller with the return code set.
  --
  return 0

GO
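The configuration-table idea mentioned before the procedure might look something like the sketch below. The table dbo.app_config, its columns and the setting name 'massive_update_batch_size' are all hypothetical names invented for illustration. The lookup would replace the @batch_size parameter inside the procedure.

```sql
-- Sketch only: dbo.app_config, its columns and the setting name are hypothetical.
create table dbo.app_config
(
  config_name  varchar(64) not null primary key ,
  config_value int         not null
)
go

insert into dbo.app_config ( config_name , config_value )
values ( 'massive_update_batch_size' , 1000 )
go

-- Inside the procedure, in place of the @batch_size parameter:
declare @batch_size int

select @batch_size = config_value
from   dbo.app_config
where  config_name = 'massive_update_batch_size'

-- If no row is found, @batch_size stays NULL; treat that as "no batching",
-- matching the way the procedure already handles a NULL batch size.
set @batch_size = coalesce( @batch_size , 0 )
```

With this in place, a DBA could change the batch size with a single UPDATE against the configuration row, and the next call to the procedure would pick it up.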
Suggestion: Before & After | Drew Noakes | 23:14 30 Jan '06
It would be interesting to see the unsuitable, unoptimized version of the code, before walking through the optimized version.
I was talking with a friend about this recently -- he's a DBA responsible for some very large bank databases, and he couldn't believe so many developers weren't aware of how to update large tables. Your article is certainly very useful. I encourage you to work on it, taking the suggestions people have posted here.
Drew Noakes drewnoakes.com
Errors | nikumi | 7:01 30 Jan '06
Neither @OldCode nor @OldCodeid is declared or defined in the code.
More Details | John Kendrick | 8:11 24 Jan '06
I think if you could reformat the code and elaborate on the two bullet items (table locking and tempdb), this article would be useful.
As it is, it sparked my interest to go look up SET ROWCOUNT in the BOL and figure out if I should use the technique in my own SQL.
I really have to add to this comment: it would be really useful if you explained why this works and what it does.