Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / database / SQL-Server

Controlling the XML output when using the FOR XML AUTO statement in SQL Server

4.91/5 (18 votes)
26 Jan 2010CPOL3 min read 117.9K  
Describes how to get more control over the XML output when using the FOR XML AUTO statement. For instance, when adding XML tags.

Introduction

This article describes how to get more control over the XML output when using the FOR XML AUTO statement. For instance, when adding XML tags. This is instead of using the more difficult to understand FOR XML EXPLICIT statement. If you are about to deserialize an XML output for use in an application, you will perhaps find this information useful.

Background

In a FOR XML clause, you normally use one of these modes:

  • RAW
  • AUTO
  • EXPLICIT
  • PATH

When you want full control over the produced XML, you use FOR XML EXPLICIT, but it's rather difficult to understand, read, and maintain the complex SELECT statement at the end. FOR XML AUTO produces the most readable SELECT statement, but has the downside that you have minimal control over the produced XML. But with some tricks, and sometimes by using the PATH option in addition, you can do more than you will expect. The RAW option is rarely used and therefore not discussed. The PATH option allows you to mix attributes and elements easier. For now, we use FOR XML AUTO.

This example

For this example, we use two simple tables in a single database with a 1:N relationship. One table (SalesOrder) contains the orders along with customer information, and the other table (Items) contains the items. An order can have multiple items, and an item always belongs to one single order.

Image 1

Code

I'd always like to start easy. So:

SQL
SELECT * FROM salesorder

produces::

ordernumber customername customerstreet
----------- ------------ --------------
1           parker       first av  
2           lesley       sec av

If we want the resultset to be XML, we add the FOR XML AUTO statement:

SQL
SELECT * FROM salesorder FOR XML AUTO

which produces::

XML
<salesorder ordernumber="1" customername="parker" customerstreet="first av"/>
<salesorder ordernumber="2" customername="lesley" customerstreet="sec av"/>

But now, you have the fields as attributes. Most of the time, you will want them as elements. To do this, you add the ELEMENTS parameter:

SQL
SELECT * FROM salesorder FOR XML AUTO, ELEMENTS

Which produces::

XML
<salesorder>
    <ordernumber>1</ordernumber>
    <customername>parker</customername>>
    <customerstreet>first av</customerstreet>
</salesorder>
<salesorder>
    <ordernumber>2</ordernumber>
    <customername>lesley</customername>
    <customerstreet>sec av</customerstreet>
</salesorder>

If you want to change the 'salesorder' tag, you can use:

SQL
SELECT * FROM salesorder AS niceorder FOR XML AUTO, ELEMENTS

Which produces:

XML
<niceorder>
    <ordernumber>1</ordernumber>
    <customername>parker</customername>>
    <customerstreet>first av</customerstreet>
</niceorder>
<niceorder>
    <ordernumber>2</ordernumber>
    <customername>lesley</customername>
    <customerstreet>sec av</customerstreet>
</niceorder>

And of course, this trick also works for column names:

SQL
SELECT ordernumber AS order_no FROM salesorder WHERE ordernumber = 1 FOR XML AUTO, ELEMENTS

Which produces:

XML
<salesorder>
    <order_no>1</order_no >
</salesorder>

What if you want to add additional tags or add nodes? For instance, add 'customer' tags around customer information? This proves to be difficult with FOR XML AUTO. A possible solution is a SELF JOIN (join to the same table), but I found something easier. After much fiddling and tinkering, we use a subquery and slightly abuse the FOR XML PATH command, like this:

SQL
SELECT 
    ordernumber,
    (SELECT    customername , 
            customerstreet FOR XML PATH(''), 
            TYPE, ELEMENTS) 
            as customer
FROM 
    salesorder 
FOR XML AUTO, ELEMENTS

Which produces:

XML
<salesorder>
    <ordernumber>1</ordernumber>
    <customer>
        <customername>parker</customername>
        <customerstreet>first av</customerstreet>
    </customer>
</salesorder>
<salesorder>
    <ordernumber>2</ordernumber>
    <customer>
        <customername>lesley</customername>
        <customerstreet>sec av</customerstreet>
    </customer>
</salesorder>

Notice the use of the additional 'TYPE' parameter. This will ensure the result of the subquery is returned as an XML type (as part of the the whole XML type result), rather than the NVARCHAR(MAX) type. This neat little trick also works if you want to add surrounding tags to your complete result:

SQL
SELECT(
    SELECT 
        customername
    FROM 
        salesorder 
    FOR XML AUTO, TYPE, ELEMENTS
) AS orderrequest FOR XML PATH(''), TYPE, ELEMENTS

Which produces:

XML
<orderrequest>
    <salesorder>
        <customername>parker</customername>
    </salesorder>
    <salesorder>
        <customername>lesley</customername>
    </salesorder>
</orderrequest>

Why don't we use the standard FOR XML AUTO in the subquery? Try it, it will generate an error. You can only use FOR XML AUTO in a subquery when the subquery is a query on an actual table (which the above obviously isn't).

The subquery construction is the way to go if you want full control over the produced XML output. Let's say we want, per order, the customer name and all the items belonging to the order. For this, you use correlated subqueries like this:

SQL
SELECT
    customername ,
    (SELECT    * FROM item WHERE item.ordernumber = 
       salesorder.ordernumber FOR XML AUTO, TYPE, ELEMENTS)
FROM
    salesorder
FOR XML AUTO, ELEMENTS

Which produces:

XML
<salesorder>
    <customername>parker</customername>
    <item>
        <itemnumber>10</itemnumber>
        <description>pen</description>
        <ordernumber>1</ordernumber>
    </item>
    <item>
        <itemnumber>11</itemnumber>
        <description>paper</description>
        <ordernumber>1</ordernumber>
    </item>
</salesorder>

When using correlated subqueries, you can use the regular FOR XML AUTO, TYPE, ELEMENTS statement. If you want surrounding tags around the 'items', just add AS after the subquery, like this:

SQL
SELECT
    customername ,
    (SELECT    * FROM item WHERE item.ordernumber = 
      salesorder.ordernumber FOR XML AUTO, TYPE, ELEMENTS) 
      AS orderitems
FROM
    salesorder
FOR XML AUTO, ELEMENTS

Which produces:

XML
<salesorder>
    <customername>parker</customername>
    <orderitems>
        <item>
            <itemnumber>10</itemnumber>
            <description>pen</description>
            <ordernumber>1</ordernumber>
        </item>
        <item>
            <itemnumber>11</itemnumber>
            <description>paper</description>
            <ordernumber>1</ordernumber>
        </item>
    </orderitems>
</salesorder>

Why don't we just simply join the item table to the order table? This will sometimes lead to unwanted and unpredictable behaviour when it comes to the layout of the produced XML. For instance:

SQL
SELECT
    item.description,
    salesorder.customername
FROM
    salesorder
INNER JOIN item ON item.ordernumber = salesorder.ordernumber
FOR XML AUTO, ELEMENTS

Will produce garbage like this:

XML
<item>
    <description>pen</description>
    <salesorder>
        <customername>parker</customername>
    </salesorder>
</item>
<item>
    <description>paper</description>
    <salesorder>
        <customername>parker</customername>
    </salesorder>
</item>

Also, you will have a hard time adding surrounding tags when using joins. So, for most controls, when using FOR XML AUTO, use (correlated) subqueries.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)