Parse Transact SQL to Check Syntax






4.37/5 (13 votes)
How to implement a parse function within a C# WPF application
Introduction
The problem that I am going to address in this article is how to pass some Transact-SQL text to a specified SQL Server instance, and ask it to parse the code, returning any syntax errors. This would be useful in a case where your application allows the user to enter some T-SQL text to execute, or where T-SQL gets executed dynamically from script files, or whatever. The possibilities are endless, just bear in mind the security implications that this might have if this article inspires you to implement such a design.
Say, for example, you have a system which enables one user (with special access privileges, of course) to write T-SQL code and store it on the system in the form of scripts (in the database or in files). Then, another user of the system would come in and choose one of these scripts based on the name and description provided by the programmer, and then click a button to execute it. Obviously, you need some mechanism to check the validity of the code before allowing it to be stored on the system. This is where my solution would hopefully be useful.
Background
I did some research and decided to include some background on the inner workings of SQL Server, or any other DBMS for that matter. So what really happens when your applications execute queries on the database? Is there a specific process that the DBMS follows to return the requested data, or to update or delete a subset of data? What happens under the hood of your preferred DBMS is quite complicated, and I will only explain or mention key processes on a high level.
SQL Server is split into multiple components, and most of these components are grouped to form the Relational Engine and the Storage Engine. The Relational Engine is responsible for receiving, checking and compiling the code, and for managing the execution process, while the Storage Engine is responsible for retrieving, inserting, updating or deleting the underlying data in the database files. The component that I want to touch base with is the Query Processor, which is part of the Relational Engine.
As the name suggests, it is the Query Processor's job to prepare submitted SQL statements before it can be executed by the server. The Query Processor will go through three processes before it can provide an Execution Plan. This execution plan is the most optimal route chosen by the DBMS for servicing the query. The three processes mentioned include:
- Parse and Normalize
- Compile
- Optimize
The parser checks for syntax errors including correct spelling of keywords. The normalizer performs binding, which involves checking if the specified tables and columns exist, gathering meta data about the specified tables and columns, and performing some syntax optimizations. Programmers frequently use the term Compilation to refer to the compilation and optimization process. True compilation only affects special T-SQL statements such as variable declarations and assignments, loops, conditional processing, etc. These statements provide functionality to SQL code, but they do not form part of DML statements such as SELECT
, INSERT
, UPDATE
or DELETE
. On the other hand, only these DML statements need to be optimized. Optimization is, by far, the most complex process of the Query Processor. It employs an array of algorithms to first gather a sample of suitable execution plans, and then filters through them until the best candidate is chosen.
After the optimal execution plan is determined and returned by the Query Processor, it is stored in a cache. SQL Server will automatically determine how long to keep this execution plan within the cache as it might get reused often. When an application executes a query, SQL Server checks if an execution plan exists in the cache for the query. SQL Server generates a cache key based on the query text, and searches for the same key in the cache. Queries need to be recompiled and reoptimized when metadata changes such as column definitions or indexes, but not for changes in parameters, system memory, data in the data cache, etc.
Finally, the Query Processor communicates the execution plan to the Storage Engine and the query is executed.
Using the Code
I have created and included a simple code editor application, but keep in mind that the main purpose of this article is to provide you with a parse function, and only code snippets and notes revolving around this point will be covered here. There are lots of very useful articles out there for building WPF applications. I will assume that you have some experience with Visual Studio and C#. I have included the example app which was written in Visual C# Express 2010 as a WPF Application. Knowledge of WPF is not required as I will explain the relevant C# code in detail.
Essentially, I want my application to have an Execute button and a Parse button (like in MS SQL Server Management Studio). Pressing the Execute button, SQL Server will go through the whole process as described above to first prepare the statements and then determine the execution plan before it will be executed. For the parse button, naturally it should only parse the query. I am going to create a class that will encapsulate all my ADO.NET objects, and provide methods, Execute
and Parse for wiring functionality to the buttons. I would also provide methods for connecting and disconnecting to the SQL Server instance with a specified connection string. The class is called SqlHandler
.
This class encapsulates and hides the following objects:
SqlConnection conn
SqlCommand cmd
SqlDataAdapter adapter
List<SqlError> errors
You need to include using System.Data.SqlClient;
and using System.Data;
to the using
directives list at the top of the code file, as I am sure you know. The conn
object is used for connecting to the database. The ConnectionString
property directly gets and sets the conn.ConnectionString
property. This allows you to get or set the Connection string from outside the class. The cmd
object is used to execute commands, and adapter is used to obtain query results from the database. The object errors
is a generic list of type SqlError
. This list will be used to capture and return errors generated while executing or parsing T-SQL code.
Property FireInfoMessageEventOnUserErrors and Event InfoMessage
Most of you reading this article would already be familiar with these ADO.NET classes. Most of the time, I am developing applications with ADO.NET; I only use a few selected properties and methods. The SqlConnection
class contains a FireInfoMessageEventOnUserErrors
property and an InfoMessage
event that are less well known and less often used (in my opinion at least). I had to discover them myself by digging through the objects as I could not find a relevant article explaining how to accomplish what I wanted. Eventually, through trial and error, I got a working solution.
FireInfoMessageEventOnUserErrors
is a boolean property. When set to false
(default), the InfoMessage
event will not be fired when an error occurs, and an Exception will be raised by the ADO.NET API. When set to true
, an Exception will not be thrown, but the InfoMessage
event will be fired. For my code to work, I had to enable this event to catch all the messages through the SqlInfoMessageEventArgs
event argument object. The following code snippet shows how to set this property and event in the constructor:
conn.FireInfoMessageEventOnUserErrors = true;
conn.InfoMessage += new SqlInfoMessageEventHandler(conn_InfoMessage);
conn_InfoMessage
is the name of the event handler method that will be called when the event fires. It is important to note that although this looks like an asynchronous operation, it is in fact synchronous. This means that when the T-SQL query is executed by passing it to cmd.ExecuteNonQuery
or to adapter.Fill
, the event will be fired before continuing execution. This allows us to suck up all the messages into the errors
list before returning from the Execute
and Parse
methods of our class where ExecuteNonQuery
and Fill
is called. The snippet below describes how the messages are caught in the event handler.
private void conn_InfoMessage(object sender, SqlInfoMessageEventArgs e)
{
//ensure that all errors are caught
SqlError[] errorsFound = new SqlError[e.Errors.Count];
e.Errors.CopyTo(errorsFound, 0);
errors.AddRange(errorsFound);
}
It is important to mention that the event will be fired for every error that the T-SQL script might contain. For instance, if your script contains two errors, the conn_InfoMessage
event handler will be called twice! I only discovered this while testing my application where I tried to parse a script containing multiple errors. The initial result was that my Parse
method always returned only one error, while SSMS reported the correct amount of errors for the same script. Only when I inserted a message box in the event handler, I discovered how it works. The reason why this was misleading is because the second argument of our event handler, the e
object, which is of type SqlInfoMessageEventArgs
has an Errors
property. This property is of type SqlErrorCollection
, which to me implied that it contains multiple SqlError
objects. Naturally, I assumed that this collection will contain all the errors at once. After a few code modifications, I got the desired result. What happens now is that every time the event is fired, an SqlError
array is created and the e.Errors
collection of SqlError
objects will be copied to this array. Even though this collection contained exactly one item every time I have tested my code, I make sure that all the SqlError
objects are captured just to be safe. This whole array is then copied to the errors
list, which is a private
field within my class definition. This list is used to aggregate all the errors before returning it to the client code. Another point worth mentioning is that the errors
list has to be cleared every time Parse
or Execute
is called.
Execute Method
The first parameter of this method, sqlText
contains the T-SQL code to be executed. The second parameter is an SqlError
array object. Take notice of the out
keyword. This means that the parameter is an out
parameter, and we have to set its value somewhere in the method. This allows the method to return both a DataTable
object (through the normal return type and return
statement), and an array containing our SqlError
objects. The client code will be responsible for checking the length of the array to determine if any errors were generated.
public DataTable Execute(string sqlText, out SqlError[] errorsArray)
{
if (!IsConnected)
throw new InvalidOperationException
("Can not execute Sql query while the connection is closed!");
errors.Clear();
cmd.CommandText = sqlText;
DataTable tbl = new DataTable();
adapter.Fill(tbl);
errorsArray = errors.ToArray();
return tbl;
}
First, we need to tests whether the connection is open or not using the IsConnected
property, and throw an exception if it is not. Next, the errors
list is cleared to prevent reporting errors previously encountered. The query is then executed using adapter.Fill(tbl)
where tbl
is a reference to a new DataTable
object. This table will be filled with data if the T-SQL code returns any data. As mentioned earlier, the InfoMessage
event will be raised synchronously, so the next line after calling Fill
will only be executed after all errors were raised through the event. All errors (if any) are copied to a new array of SqlError
objects. This array is assigned to the out
parameter errorsArray
, allowing the client of our class to check if any errors were encountered. Remember that no exceptions will be thrown when you set FireInfoMessageEventOnUserErrors
to true
.
Parse Method
This method accepts one parameter, sqlText
which contains the T-SQL code to be parsed. It returns an array containing SqlError
objects. The client code should test the length of this array to determine if any errors were generated.
public SqlError[] Parse(string sqlText)
{
if (!IsConnected)
throw new InvalidOperationException
("Can not parse Sql query while the connection is closed!");
errors.Clear();
cmd.CommandText = "SET PARSEONLY ON";
cmd.ExecuteNonQuery();
cmd.CommandText = sqlText;
cmd.ExecuteNonQuery(); //conn_InfoMessage is invoked for every error,
//e.g. 2 times for 2 errors
cmd.CommandText = "SET PARSEONLY OFF";
cmd.ExecuteNonQuery();
return errors.ToArray();
}
Again, we throw an exception if the connection is not open, and we clear the errors list. SQL Server has an option "PARSEONLY
" that we will use to prevent further processing of our T-SQL code beyond the parse phase. Before our sqlText
string is executed, the PARSEONLY
option is set to ON
. Afterwards, it is set back to OFF
. There is a potential pitfall here: what if the client code is a console-type application, and the user executed the command SET PARSEONLY ON
to explicitly prevent further execution beyond the parse phase. When the client code then calls the Parse
method, PARSEONLY
will be set back to OFF
before the method returns, without the user's knowledge. Workarounds for this problem will not be explored further in this article, because the implementation will differ as per requirements of the project.
Additional Notes on the Class
The ConnectionString
property of our SqlHandler
class "forwards" the ConnectionString
property on the SqlConnection
object that it encapsulates. In the constructor, the ConnectionString
is initialized to a "template" connection string. You have to manually insert the Data Source
and Initial Catalog
values in the string. The Connect
method accepts a string
argument containing a connection string. This connection string will replace the existing connection string on the SqlConnection
object.
Sample Application
My sample project contains the SqlHandler
class, and a small test application. The application provides some basic text editor functionality such as opening files, saving files, cut, copy and paste. Furthermore, it implements the SqlHandler
object's methods to enable connecting and disconnecting from a SQL Server instance, and executing and parsing SQL code. The layout of the main window was designed to be familiar looking, with the menu and toolbar at the top, the text area in the middle, and an error grid and status bar at the bottom. When you build and run the application, a Connection dialog window will pop up. On this window, you have to enter a valid connection string to connect to a SQL Server instance. Keep in mind that this application is not multi-threaded. As a result, entering a bad connection string will cause the interface to "hang" while the connection times out and eventually returns with an error message.
I have created a region in the SqlHandler
class for housing custom RoutedUICommand
objects for binding my own commands to the user interface. I put them in their own separate region because they have nothing to do with the rest of the class. These command objects are all static
, and the class also defines a static
constructor for initializing them. These commands could also have been placed in a separate class.
Type your T-SQL text in the text area in the middle of the window. To parse the code, press the Parse button, or press the F6 key on the keyboard. To execute the code, press the Execute button, or press F5 on the keyboard. Both the Parse
and Execute
functions will report errors in the errors grid at the bottom of the application. The errors grid is nested within an expander which will pop up automatically when errors are generated. When you execute a query that returns a result set, a result viewer window will appear. Parsing and executing will be disabled when the application is not connected to a SQL Server instance, as defined by the command bindings.
When you parse a query that references invalid database objects such as tables or columns that does not exist, no errors will be returned. Remember from the Background section that Parsing does not include Binding.
Compliments to the author of the icons set which can be downloaded here for free.
Points of Interest
Visual Studio has some nifty little tools that can make your life easier. One of them is the tool that inserts appropriate code snippets where it is expected by pressing the Tab key. This is useful, for example, when you are registering the InfoMessage
event. Type the following line of code: conn.InfoMessage +=
. You should see a little pop up box...
Press Tab once and it will complete the line for you based on the required delegate for the event. Press Tab again and it will generate the event handler method for you. The event handler will already be set up to contain the correct arguments, all you have to do is add your code.
References
- SQL Server Pro, 23/10/1999, Inside SQL Server: Parse, Compile, and Optimize [online] Available at: http://www.sqlmag.com/article/tsql3/inside-sql-server-parse-compile-and-optimize [Accessed on 20th June, 2012]
History
- 25th June, 2012: Initial version
- 13th January, 2022: Article updated