Click here to Skip to main content
6,629,377 members and growing! (22,769 online)
Email Password   helpLost your password?
License: The Code Project Open License (CPOL)

SQL. Fast GUID List Generator.

By Anton Burtsev

The article shows how one can use MS SQL Server to generate millions of guids in seconds.
WCF
Version:4 (See All)
Posted:22 Jun 2009
Updated:27 Jun 2009
Views:3,545
Bookmarked:12 times
Technical Blog
Announcements
Loading...
 
Search    
Advanced Search
Add to IE Search
printPrint   add Share
      Discuss Discuss   Broken Article?Report  
4 votes for this technical blog.
Popularity: 3.01 Rating: 5.00 out of 5

1

2

3

4
4 votes, 100.0%
5
A Technical Blog article. View entire blog here.

Introduction

If you want your database to be fast with huge amount of data, you should test it with large test database. To create large test database you may need many guids for row ids. If you use MS SQL Server for this purpose (as I do) you may want to boost performance of your generator. With this article one can create millions of guids in seconds.

Obvious Solution

If you want to create a list of guids and put them into a SQL table you may write the following:

--METHOD 1

create table ids(id uniqueidentifier);

declare @guidCount int = 10000;
declare @n int = 0;
while @n < @guidCount
begin
   insert ids values(NEWID());
   set @n = @n + 1;
end;

This code creates only 10 000 guids. But it takes about 1000 ms. Why it is so slow? There are three reasons.

Reason #1. Number of Transactions

Each time INSERT is executed, MS SQL Server starts and commits a transaction. This is not necessary in the case of our task. So we can collect our guids in memory and then insert all guids in single INSERT statement or we can manually start and commit a transaction.
First look how we can collect guids into the memory:

--METHOD 2.1

create table ids(id uniqueidentifier);

declare @ids table(id uniqueidentifier default NEWID());
declare @guidCount int = 100000;
declare @n int = 0;
while @n < @guidCount
begin
   insert @ids default values;
   set @n = @n + 1;
end;

insert ids
select * from @ids;

This code creates 100 000 guids and it takes about 2000 ms. 5 times faster than in first code block where transaction is committed for each guid. Now look how manual transaction affects the performance:

--METHOD 2.2

create table ids(id uniqueidentifier);

begin transaction;

declare @guidCount int = 100000;
declare @n int = 0;
while @n < @guidCount
begin
   insert ids values(NEWID());
   set @n = @n + 1;
end;

commit transaction;

This code also creates 100 000 guids and it takes about 1300 ms. So it 1,5 times faster than generation in memory and 7 times faster than the first simple generator.
Great result, but what if you need even more guids? Than we have to review the next reason for slowness of all code block mentioned above.

Reason #2. Number of INSERTs

Each time INSERT is executed there are many wasteful things to be done: compile statement, build (or get from cache) query plan, check parameters, finalize allocated objects, etc. So we need to minimize total count of INSERT executions. We really need only a fast way to generate a rowset which size is equal to number of guids we need to create. If we get one, we just select a new guid for each row and save them into a table. But how we can get a rowset without INSERTs? We can use table cross to get the rowset with required size. If we CROSS JOIN some table (say "counter") onto itself we get squared number of rows. So If we generate a 1000-row-couter we get 1 000 000 rows in result rowset. Now apply this approach to the guid generator:

--METHOD 3

create table ids(id uniqueidentifier);

declare @ids table(id int identity)
declare @guidCount int = 1000000;
declare @n int = 0;
while @n < SQRT(@guidCount)
begin
   insert @ids default values;
   set @n = @n + 1;
end;

insert ids
select NEWID() from @ids i1, @ids i2;

This code creates 1 000 000 guids and it takes only about 3000 ms This is 30 times faster than first method and 5-7 times faster than both second ones.

Great result again! But there is a way to get 1 000 000 guids more efficient.

Reason #3. Number of Locks

When INSERT statement is being executed MS SQL Server is locking new data row by row or page by page thinking that the table might be accessed by concurrent query. If you are generating test database there is no concurrent queries. So we may direct MS SQL Server to lock whole table rather than page by page locking. Table hint WITH(TABLOCK) locks whole table at the beginning of execution.

--METHOD 3 + WITH(TABLOCK)

create table ids(id uniqueidentifier);

declare @ids table(id int identity)
declare @guidCount int = 1000000;
declare @n int = 0;
while @n < SQRT(@guidCount)
begin
   insert @ids default values;
   set @n = @n + 1;
end

insert ids with(tablock)
select NEWID() from @ids i1, @ids i2;

This code generates 1 000 000 guids in 500 ms! That's really great result. So one can see that this is very important to obtain a table lock before heavy INSERT execution. If you want to create even more guids - WITH TABLOCK is the only solution. In my test environment (default MS SQL 2008 settings) I can’t generate more than 10 millions guids without table lock. I get an error:
Msg 1204, Level 19, State 4, Line 10
The instance of the SQL Server Database Engine cannot obtain a LOCK resource at this time. Rerun your statement when there are fewer active users. Ask the database administrator to check the lock and memory configuration for this instance, or to check for long-running transactions.

Summary

Method No GUIDs count Time Elapsed, ms
1 10 000 1000
2.1 100 000 2000
2.2 100 000 1300
3 1 000 000 3000
3 + tablock 1 000 000 500

My Test Environment

Test environment: MS SQL 2008 (default settings) on Core 2 Duo 3GHz processor and WD Raptor hard disk.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Anton Burtsev


Member

Occupation: Web Developer
Location: Russian Federation Russian Federation

Other popular Database articles:

Article Top
You must Sign In to use this message board.
FAQ FAQ 
 
Noise Tolerance  Layout  Per page   
 Msgs 1 to 5 of 5 (Total in Forum: 5) (Refresh)FirstPrevNext
GeneralTry method 4 PinmemberTrent Tobler10:23 28 Jul '09  
GeneralEfficient mass inserts Pinmembersupercat99:51 29 Jun '09  
GeneralRe: Efficient mass inserts PinassociateAnton Burtsev3:51 20 Jul '09  
GeneralRe: Efficient mass inserts Pinmembersupercat95:39 20 Jul '09  
GeneralRe: Efficient mass inserts PinassociateAnton Burtsev21:59 20 Jul '09  

General General    News News    Question Question    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

PermaLink | Privacy | Terms of Use
Last Updated: 27 Jun 2009
Editor:
Copyright 2009 by Anton Burtsev
Everything else Copyright © CodeProject, 1999-2009
Web17 | Advertise on the Code Project