Click here to Skip to main content
14,271,606 members

Comparing rows in SQL Server and Azure SQL Database

Rate this:
0.00 (No votes)
Please Sign up or sign in to vote.
0.00 (No votes)
16 Jun 2016CPOL
In this article is shown how to easily compare two rows in SQL Server using new JSON functionalities.

Introduction

In this article, I will show you how to easily compare two rows in Azure SQL Database and SQL Server 2016. The goal is to create query that will take ids of two records and show what columns are different and what are different values. You probably have some database tools that compare schema and content of tables, but if you need to show differences in some application, then it might be tricky to add this logic.

Background

Imagine this problem - we have an online shop where we are selling products. Users are coming to our web site and want to compare product (like price, speed, etc.) something like:

Image 1
 
This is very common requirement, but sometime it might be hard to pivot table rows vertically and compare them using SQL queries without some complex pivots. In this article we will see one approach how to compare cells of two rows.

Comparing products using TSQL

The easiest way to compare two products would be to select two rows from a table and show them: 

SELECT * FROM Production.Product
WHERE ProductID IN (3,4)


However, this might be hard to compare visually because Product table might have many columns:

ProductID Name ProductNumber MakeFlag FinishedGoodsFlag Color SafetyStockLevel ReorderPoint StandardCost ListPrice Size SizeUnitMeasureCode WeightUnitMeasureCode Weight DaysToManufacture ProductLine Class Style ProductSubcategoryID ProductModelID SellStartDate SellEndDate DiscontinuedDate ModifiedDate

3

BB Ball Bearing

BE-2349

1

0

NULL

800

600

0.00

0.00

NULL

NULL

NULL

NULL

1

NULL

NULL

NULL

NULL

NULL

2008-04-30 00:00:00.000

NULL

NULL

2014-02-08 10:01:36.827

4

Headset Ball Bearings

BE-2908

0

0

Black

800

600

0.00

0.00

NULL

NULL

NULL

NULL

0

NULL

NULL

NULL

NULL

NULL

2008-04-30 00:00:00.000

NULL

NULL

2014-02-08 10:01:36.827

Would it be easier to compare them if we can return something like vertically organized table that has only differences:
 

Field Product 1 Product 2

ProductID

3

4

Name

BB Ball Bearing

Headset Ball Bearings

ProductNumber

BE-2349

BE-2908

MakeFlag

True

False

DaysToManufacture

1

0

Ok, let’s see the solution. I will create one table-value function that accepts two ids and returns a table with three columns – columnname from Product table, value from the first product row, and value from the second product row:

DROP FUNCTION IF EXISTS
Production.CompareProducts
GO
CREATE FUNCTION
Production.CompareProducts (@id1 int, @id2 int)
returns table
as
return (
    select v1.[key] as Field, v1.value as [Product 1], v2.value as [Product 2]
    from
        openjson(
            (select * from Production.Product where ProductID = @id1
             for json path, include_null_values, without_array_wrapper)) v1
    inner loop join
        openjson(
            (select * from Production.Product where ProductID = @id2
             for json path, include_null_values, without_array_wrapper)) v2
        on v1.[key] = v2.[key]
    where v1.value <> v2.value
    or v1.value is null and v2.value is not null
    or v1.value is not null and v2.value is null
)

This code uses new JSON functions that are added in Azure SQL Database and SQL Server 2016. Details are explained below.
Note: You would need to have compatibility level 130 on the database because OPENJSON function cannot work in lower compatibility levels.
If you create this function in AdventureWorks database on the latest SQL Server 2016 or Azure SQL Database, you would be able to compare products using the following code:

select * from Production.CompareProducts(3,4)

This function works with Production.Product table, but you can easily rewrite it to work with any other table. Below is “templetized” version of the function where you just need to put name of the table <<TABLE>> and primary key column <<KEY>>:

DROP FUNCTION IF EXISTS
Compare<<TABLE>>
GO
CREATE FUNCTION
Compare<<TABLE>> (@id1 int, @id2 int)
returns table
as
return (
    select v1.[key] as Field, v1.value as [<<TABLE>> 1], v2.value as [<<TABLE>> 2]
    from
        openjson(
            (select * from <<TABLE>> where <<KEY>> = @id1
                      for json path, include_null_values, without_array_wrapper)) v1
    inner loop join
        openjson(
            (select * from <<TABLE>> where <<KEY>> = @id2
                      for json path, include_null_values, without_array_wrapper)) v2
        on v1.[key] = v2.[key]
    where v1.value <> v2.value
    or v1.value is null and v2.value is not null
    or v1.value is not null and v2.value is null
)

 

If this function can help you, you can use it without understanding underlying JSON functions. If you want to understand how it works then you can continue reading the story about JSON functions in Azure SQL Database.

JSON in Azure SQL Database/SQL Server 2016

Azure SQL Database/SQL Server 2016 provide new functions and operators that enable you to format SQL results as JSON text, and to parse JSON into table format. Two main functionalities are:

  1. FOR JSON clause can be added after any SELECT statement and it will return results of SQL query as JSON text.
  2. OPENJSON table value function that parses JSON text and returns table with key:value pairs.

In the following sections we will shortly see how these features work. If you want more details, I can recommend my previous article: Friday the 13th - JSON is coming to SQL Server.

Formatting table data as JSON

FOR JSON clause is used to output results of SQL query as JSON text. As an example, if we read Product table and you want to get JSON test as a results, you can write the following query:

select top 5 ProductID, Name, Color
from Production.Product
FOR JSON PATH

As a result, instead of the set of table rows, you would get something like the following JSON:

[
 {"ProductID":1,"Name":"Adjustable Race"},
 {"ProductID":2,"Name":"Bearing Ball"},
 {"ProductID":3,"Name":"BB Ball Bearing"},
 {"ProductID":4,"Name":"Headset Ball Bearings","Color":"Black"},
 {"ProductID":316,"Name":"Blade"}
]

FOR JSON will transform result set to a JSON array, each row will be generated as JSON object in the array, and each cell:value will be generated as JSON key:value pair in the object.
FOR  JSON might be used to return results of SQL queries to some REST services that returns JSON - see example in Building REST services with ASP.NET Core Web API and Azure SQL Database.

Including null values

By default, SQL Database will suppress all NULL values from JSON output. In the previous example, you might notice that only ProductId 4 has Color key because other product have null values in that cell. If you want to have all key:values, you can include INCLUDE_NULL_VALUES option and FOR JSON will generate “column name”:null if underlying cell has NULL value, e.g.:

select top 5 ProductID, Name, Color
from Production.Product
FOR JSON PATH, INCLUDE_NULL_VALUES

Result of this query might look like:

[
 {"ProductID":1,"Name":"Adjustable Race","Color":null},
 {"ProductID":2,"Name":"Bearing Ball","Color":null},
 {"ProductID":3,"Name":"BB Ball Bearing","Color":null},
 {"ProductID":4,"Name":"Headset Ball Bearings","Color":"Black"},
 {"ProductID":316,"Name":"Blade","Color":null}
]

As you might notice, every JSON object has "Color":null property. I need to use this option in the Compare function because I need to compare nulls with other values.

Removing array wrapper

FOR JSON will generate JSON array as result. However, if you need to return a single row as JSON object, you probably don’t want to get an array with a single element. In order to remove array brackets around the returned JSON, you can specify WITHOUT_ARRAY_WRAPPER option:

select ProductID, Name, Color from Product
where ProductId = 1
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER

Result of this query might look like:

{"ProductID":1,"Name":"Adjustable Race"}

You can also combine INCLUDE_NULL_VALUES and WITHOUT_ARRAY_WRAPPER options in the same query. If you add both options, result might be:

{"ProductID":1,"Name":"Adjustable Race","Color":null}

FOR JSON enables you to transform your query results as JSON text and customize format. now we will see reverse operaiton - OPENJSON.

Parsing JSON

In Azure SQL Database and SQL Server 2016 you can use new OPENJSON function that parses JSON text and transforms it to table format. You can read rows from the result of OPENJSON function using standard SQL language.

set @json = '{"ProductID":4,"Name":"Headset Ball Bearings","ProductNumber":"BE-2908","Color":"Black"}'

select [key], value
from OPENJSON(@json)

OPENJSON will return all key: value pairs from the input JSON text in as a table with columns [key] and value: 

key value
ProductID 4
Name Headset Ball Bearings
ProductNumber BE-2908
Color

Black

This is standard table result that can be filtered or transformed using any standard T-SQL query.

How to use JSON functions to compare products?

Now we will see step-by-step explanation how to use these function to solve the comparison problem. FOR JSON clause enables you to transform single row to set of key:value pairs:

SELECT *
FROM Production.Product
WHERE ProductID = 3
FOR JSON PATH, INCLUDE_NULL_VALUES, WITHOUT_ARRAY_WRAPPER

I’m using INCLUDE_NULL_VALUES and WITHOUT_ARRAY_WRAPPER options because I need all values (including nulls) and I need single JSON object.
We are getting set of key:value pairs where each key is a column-name and each value is cell from that row. 

{"ProductID":3,"Name":"BB Ball Bearing","ProductNumber":"BE-2349","MakeFlag":true,"FinishedGoodsFlag":false,"Color":null,"SafetyStockLevel":800,"ReorderPoint":600,"StandardCost":0.0000,"ListPrice":0.0000,"Size":null,"SizeUnitMeasureCode":null,"WeightUnitMeasureCode":null,"Weight":null,"DaysToManufacture":1,"ProductLine":null,"Class":null,"Style":null,"ProductSubcategoryID":null,"ProductModelID":null,"SellStartDate":"2008-04-30T00:00:00","SellEndDate":null,"DiscontinuedDate":null,"rowguid":"9C21AED2-5BFA-4F18-BCB8-F11638DC2E4E","ModifiedDate":"2014-02-08T10:01:36.827"}

Now what will happen if we send this JSON text to OPENJSON?

SELECT [key], value
FROM OPENJSON ( (SELECT *
                 FROM Production.Product
                 WHERE ProductID = 3
                 FOR JSON PATH, INCLUDE_NULL_VALUES, WITHOUT_ARRAY_WRAPPER) ) 

OPENJSON will parse JSON text generated by FOR JSON clause and return a table with (key,value) structure:

key value
ProductID 4
Name Headset Ball Bearings
ProductNumber BE-2908
Color

Black

... ...

If you compare this result with the original SQL statement that select data from a single row, would will see that we got transposed table (column names and values are organized vertically). If we use this code twice on two different rows, we will get two vertical, transposed tables.
Now, we can join these two-column tables by [key] column and match key:value pairs from these two tables by [key] column:

SELECT p1.[key], p1.value, p2.value
FROM OPENJSON ( (SELECT *
                        FROM Production.Product
                        WHERE ProductID = 3
                        FOR JSON PATH, INCLUDE_NULL_VALUES, WITHOUT_ARRAY_WRAPPER) ) p1
INNER LOOP JOIN
    OPENJSON ( (SELECT *
                        FROM Production.Product
                        WHERE ProductID = 4
                        FOR JSON PATH, INCLUDE_NULL_VALUES, WITHOUT_ARRAY_WRAPPER) ) p2
ON p1.[key] = p2.[key]

This query will join all rows from first OPENJSON with rows from second OPENJSON by column names ([key] column) and return something like:

key

value

value

ProductID

3

4

Name

BB Ball Bearing

Headset Ball Bearings

ProductNumber

BE-2349

BE-2908

MakeFlag

true

false

FinishedGoodsFlag

false

false

Color

NULL

Black

SafetyStockLevel

800

800

ReorderPoint

600

600

StandardCost

0.0000

0.0000

ListPrice

0.0000

0.0000

Now we have all matching rows properly matched so we just need to add where clause that removes values that are same.
Here is another trick – we cannot just say p1.value <> p2.value  to return differences! This condition will not work if one of the values is NULL. When SQL Server compared NULL an dvalue it does nto return false. Instead it returns UNKNOWN value and this row will not be shown in the result. This is the reason why we need more complex condition that will check are the values different or some of the value is null:

where v1.value <> v2.value
or v1.value is null and v2.value is not null
or v1.value is not null and v2.value is null

Finally we just need to generalize this query, replace constants 3 and 4 with parameters @id1 and @id2, wrap it as table value function and we have a function that compares products.

Conclusion

JSON funtion in Azure SQL Database and new SQL server 2016 enables you to easily transform relational data to JSON and vice versa. As one side efect, you can use them to easily serialize and deserialize results of the queries.

This was one unusual usage of JSON functionalities in Azure SQL Database, but I hope that it might help you.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

Jovan Popovic(MSFT)
Program Manager Microsoft
Serbia Serbia
Graduated from Faculty of Electrical Engineering, Department of Computer Techniques and Informatics, University of Belgrade, Serbia.
Currently working in Microsoft as Program Manager on SQL Server product.
Member of JQuery community - created few popular plugins (four popular JQuery DataTables add-ins and loadJSON template engine).
Interests: Web and databases, Software engineering process(estimation and standardization), mobile and business intelligence platforms.

Comments and Discussions

 
-- There are no messages in this forum --
Article
Posted 16 Jun 2016

Stats

6.3K views
2 bookmarked