Stored procedures and multiple result sets in T-SQL
In T-SQL it’s just impossible to access multiple result sets if stored procedure returns them.
In the life of every software engineer, there should be at least (I’d love to say “at most" actually) one project where bussiness logic is implemented on database side. My current project is one of that kind. We have all logic designed as stored procedures, then there’s a middleware where we use LINQ2SQL for database API bindings, then there are iPhone application and a web-site.
Well, it works. But it’s a hell.
First of all, T-SQL itself is a very primitive low-level language. There are just no features to make components reusable. So, as long as your procedures are just couple of select
statements, everything’s fine. But the more execution flows you have to implement, the more unreadable your procedure becomes. I do firmly believe that being extremely pedantic can make things better, but it doesn’t solve the problem completely. It’s just fine to make mistakes when you write a program. In most cases you have an easy way to cover your code with automated tests, at least it’s pretty straighforward in high-level languages. With T-SQL, you just can’t.
In our project, I had to take testing tasks as one of my concerns. I finally came out with idea that the easiest approach technically is just to test the middleware+database together. Test suite is just another client like iPhone application or web-site. This approach proved its adequacy, but the problem is, when Teamcity says things like Test "CanTransferLotsOfMoneys" failed
, database guys have no idea what it means.
So, for a long time I’ve been trying to find out a solution to write tests for stored procedure in pure T-SQL. In this case, I could have gotten rid of being the only person to intepret test failure. The major point was, in T-SQL it’s just impossible to access multiple result sets if stored procedure returns them. In our project, there are 2 or may be 3 procedures, which only return 1 result set. That’s the issue.
Always return only one result set
Sure this will never work. Simple example is, I need to get data to render a blog post page. I’ll need:
- Current user’s details
- Post details
- Comments and their details
There are at least 3 result sets.
Use temporary tables
This will probably work, but what do you do if some procedures call themselves recursively?
Table variables
Table variables is a great idea:
declare @Users table( UserId int, UserName nvarchar(256) ) exec DoSomething @Users
But unfortunately stored procedures can’t return them.
I finally discovered that T-SQL has features to work with XML. The solution consists of few simple ideas.
- There’s a type
xml
in T-SQL. This type allows storing XML documents and accessing their nodes. - Objects of this type can be either converted to
nvarchar()
or constructed from it. - Objects of this type can be passed to stored procedures as output parameters.
- You can easily generate XML from
select
queries.
Let’s say we have 2 tables:
create table Blog( BlogId int identity(1,1) not null, BlogName nvarchar(256) ) create table Post( PostId int identity(1,1) not null, BlogId int not null, -- yes, this should be FK BlogName nvarchar(256) )
The task is:
- Create a testable stored procedure
GetBlogWithDetails @blogId
, that will returnBlogId
,BlogName
and all its posts. - Definition of “testable" is: I can access all the data returned and based on this data make some conclusions.
Normally, this procedure would look like this: (no error handling - sorry)
create procedure GetBlogWithDetails(@BlogId int) as begin select BlogId, BlogName from Blog where BlogId = @BlogId select PostId, BlogId, PostName from Post where BlogId = @BlogId end
You can’t access its second result set. The solution is to render same data as XML and return this XML as output parameter.
create procedure GetBlogWithDetailsXml( @blogId int, @xml xml output) as begin -- first, create 2 table variables for blogs and for posts declare @blogRows table(BlogId int, BlogName nvarchar(256)) declare @postRows table(PostId int, BlogId int, PostName nvarchar(256)) -- (#1) this is where you implement your useful logic insert into @blogRows select BlogId, BlogName from Blog where BlogId = @blogId insert into @postRows select PostId, BlogId, PostName from Post where BlogId = @blogId -- here you render your 2 tables containing useful data as XML declare @blogRowsXml xml = ( select BlogId, BlogName from @blogRows as Blog for xml auto) declare @postRowsXml xml = ( select PostId, BlogId, PostName from @postRows as Post for xml auto) -- here you build a single XML with all the data required set @xml = cast(@blogRowsXml as nvarchar(max)) + cast(@postRowsXml as nvarchar(max)) -- (#2) here you return the data as "raw" result sets. select * from @blogRows select * from @postRows end
As you see, “useful" code is only at #1 and #2. The rest is bunch of stuff for XML. You can now play with GetBlogWithDetailsXml
like this:
For my test data,
declare @xml xml exec GetBlogWithDetailsXml 1, @xml output select @xml
returns XML like this:
<Blog BlogId="1" BlogName="Blog #1" /> <Post PostId="1" BlogId="1" PostName="Post #1 in Blog #1" /> <Post PostId="2" BlogId="1" PostName="Post #2 in Blog #1" />
It’s now pretty easy to extract all the data returned:
select Col.value('@BlogId', 'int') as BlogId, Col.value('@BlogName', 'nvarchar(256)') as BlogName from @xml.nodes('/Blog') as Data(Col) select Col.value('@PostId', 'int') as PostId, Col.value('@BlogId', 'int') as BlogId, Col.value('@PostName', 'nvarchar(256)') as PostName from @xml.nodes('/Post') as Data(Col)
with my test data, it returns:
BlogId | BlogName |
---|---|
1 | Blog #1 |
PostId | BlogId | PostName |
---|---|---|
1 | 1 | Post #1 in Blog #1 |
2 | 1 | Post #2 in Blog #1 |
I’m not really sure if returning “raw" result sets even makes sense with this approach. From the viewpoint of LINQ2SQL, yes, you won’t be able to utilize its mapping feature, but deserializing object from XML is a primitive task, so it shouldn’t be an issue.