Introduction
This article describes how to get more control over the XML output when using the FOR XML AUTO
statement. For instance, when adding XML tags. This is instead of using the more difficult to understand FOR XML EXPLICIT
statement. If you are about to deserialize an XML output for use in an application, you will perhaps find this information useful.
Background
In a FOR XML
clause, you normally use one of these modes:
When you want full control over the produced XML, you use FOR XML EXPLICIT
, but it's rather difficult to understand, read, and maintain the complex SELECT
statement at the end. FOR XML AUTO
produces the most readable SELECT
statement, but has the downside that you have minimal control over the produced XML. But with some tricks, and sometimes by using the PATH
option in addition, you can do more than you will expect. The RAW
option is rarely used and therefore not discussed. The PATH
option allows you to mix attributes and elements easier. For now, we use FOR XML AUTO
.
This example
For this example, we use two simple tables in a single database with a 1:N relationship. One table (SalesOrder) contains the orders along with customer information, and the other table (Items) contains the items. An order can have multiple items, and an item always belongs to one single order.
Code
I'd always like to start easy. So:
SELECT * FROM salesorder
produces::
ordernumber customername customerstreet
----------- ------------ --------------
1 parker first av
2 lesley sec av
If we want the resultset to be XML, we add the FOR XML AUTO
statement:
SELECT * FROM salesorder FOR XML AUTO
which produces::
<salesorder ordernumber="1" customername="parker" customerstreet="first av"/>
<salesorder ordernumber="2" customername="lesley" customerstreet="sec av"/>
But now, you have the fields as attributes. Most of the time, you will want them as elements. To do this, you add the ELEMENTS
parameter:
SELECT * FROM salesorder FOR XML AUTO, ELEMENTS
Which produces::
<salesorder>
<ordernumber>1</ordernumber>
<customername>parker</customername>>
<customerstreet>first av</customerstreet>
</salesorder>
<salesorder>
<ordernumber>2</ordernumber>
<customername>lesley</customername>
<customerstreet>sec av</customerstreet>
</salesorder>
If you want to change the 'salesorder
' tag, you can use:
SELECT * FROM salesorder AS niceorder FOR XML AUTO, ELEMENTS
Which produces:
<niceorder>
<ordernumber>1</ordernumber>
<customername>parker</customername>>
<customerstreet>first av</customerstreet>
</niceorder>
<niceorder>
<ordernumber>2</ordernumber>
<customername>lesley</customername>
<customerstreet>sec av</customerstreet>
</niceorder>
And of course, this trick also works for column names:
SELECT ordernumber AS order_no FROM salesorder WHERE ordernumber = 1 FOR XML AUTO, ELEMENTS
Which produces:
<salesorder>
<order_no>1</order_no >
</salesorder>
What if you want to add additional tags or add nodes? For instance, add 'customer' tags around customer information? This proves to be difficult with FOR XML AUTO
. A possible solution is a SELF JOIN
(join to the same table), but I found something easier. After much fiddling and tinkering, we use a subquery and slightly abuse the FOR XML PATH
command, like this:
SELECT
ordernumber,
(SELECT customername ,
customerstreet FOR XML PATH(''),
TYPE, ELEMENTS)
as customer
FROM
salesorder
FOR XML AUTO, ELEMENTS
Which produces:
<salesorder>
<ordernumber>1</ordernumber>
<customer>
<customername>parker</customername>
<customerstreet>first av</customerstreet>
</customer>
</salesorder>
<salesorder>
<ordernumber>2</ordernumber>
<customer>
<customername>lesley</customername>
<customerstreet>sec av</customerstreet>
</customer>
</salesorder>
Notice the use of the additional 'TYPE
' parameter. This will ensure the result of the subquery is returned as an XML type (as part of the the whole XML type result), rather than the NVARCHAR(MAX)
type. This neat little trick also works if you want to add surrounding tags to your complete result:
SELECT(
SELECT
customername
FROM
salesorder
FOR XML AUTO, TYPE, ELEMENTS
) AS orderrequest FOR XML PATH(''), TYPE, ELEMENTS
Which produces:
<orderrequest>
<salesorder>
<customername>parker</customername>
</salesorder>
<salesorder>
<customername>lesley</customername>
</salesorder>
</orderrequest>
Why don't we use the standard FOR XML AUTO
in the subquery? Try it, it will generate an error. You can only use FOR XML AUTO
in a subquery when the subquery is a query on an actual table (which the above obviously isn't).
The subquery construction is the way to go if you want full control over the produced XML output. Let's say we want, per order, the customer name and all the items belonging to the order. For this, you use correlated subqueries like this:
SELECT
customername ,
(SELECT * FROM item WHERE item.ordernumber =
salesorder.ordernumber FOR XML AUTO, TYPE, ELEMENTS)
FROM
salesorder
FOR XML AUTO, ELEMENTS
Which produces:
<salesorder>
<customername>parker</customername>
<item>
<itemnumber>10</itemnumber>
<description>pen</description>
<ordernumber>1</ordernumber>
</item>
<item>
<itemnumber>11</itemnumber>
<description>paper</description>
<ordernumber>1</ordernumber>
</item>
</salesorder>
When using correlated subqueries, you can use the regular FOR XML AUTO, TYPE, ELEMENTS
statement. If you want surrounding tags around the 'items', just add AS
after the subquery, like this:
SELECT
customername ,
(SELECT * FROM item WHERE item.ordernumber =
salesorder.ordernumber FOR XML AUTO, TYPE, ELEMENTS)
AS orderitems
FROM
salesorder
FOR XML AUTO, ELEMENTS
Which produces:
<salesorder>
<customername>parker</customername>
<orderitems>
<item>
<itemnumber>10</itemnumber>
<description>pen</description>
<ordernumber>1</ordernumber>
</item>
<item>
<itemnumber>11</itemnumber>
<description>paper</description>
<ordernumber>1</ordernumber>
</item>
</orderitems>
</salesorder>
Why don't we just simply join the item table to the order table? This will sometimes lead to unwanted and unpredictable behaviour when it comes to the layout of the produced XML. For instance:
SELECT
item.description,
salesorder.customername
FROM
salesorder
INNER JOIN item ON item.ordernumber = salesorder.ordernumber
FOR XML AUTO, ELEMENTS
Will produce garbage like this:
<item>
<description>pen</description>
<salesorder>
<customername>parker</customername>
</salesorder>
</item>
<item>
<description>paper</description>
<salesorder>
<customername>parker</customername>
</salesorder>
</item>
Also, you will have a hard time adding surrounding tags when using joins. So, for most controls, when using FOR XML AUTO
, use (correlated) subqueries.