Databases came to the world in 70. Since that, they became bigger in size, with more complicated logic and still keep growing. So, there are many tools, SQL parsers, which tokenize by-object script.
But common parsers (ANTLR, e.g.) may face with script parsing errors, each SQL dialect has its own specials, and time issues on huge number of DB objects.
So, I want to show much simpler, a RegEx+some trick way.
- Get one-file SQL script of whole db objects.
- Split SQL script onto text blocks by using regex that match DDL statements
- Search some text string among these text blocks
1. Take All SQL Objects Script
Each RDBMS has an option to generate SQL drop/create scripts. So, first we get single SQL script of all database objects. Or we can take northwind.sql as an example.
2. Get all DDL Statements with Regular Expression
I use this expression:
FastColoredTextBox gives the great tools to work with regular expressions, so we use:
var result = range.GetRangesByLines(regexStr, RegexOptions.IgnoreCase);
First of all, I built some
useful regex and collect them in one
public static class RegexValues
public static string SqlCmdObjects = @"\[\$\(([^=<>\[\]\s\']+)\)\].\[[^=<>\s\']+\]";
public static string SqlCmdObjectsShort = @"\$\(([^=<>\[\]\s\']+)\)";
public static string DdlObjects = @"\b(create)\s+(procedure|proc|table|trigger|
public static string DdlObjectsPrepared = @"\b(create)\s+(procedure|proc|table|
public static string DdlObjects_ = @"\b(create)\s+(procedure|proc|table|
public static string DdlObjectsPrepared_ = @"\b(create)\s+(procedure|proc|table|
public static string DdlObjectsPreparedProcViewTrigger = @"\b(create)\s+
public static string DdlObjectsPreparedWithIndex = @"\b(create)\s+(unique)*\s*
public static string DdlIndexAll = @"\b(create|alter)\s+(procedure|proc|table|
public static string Variables = @"\@([^=<>\s\'\)\(\,]+)";
public static string SqlCmdVariables = @"\:SETVAR\s+([a-zA-Z_]+)\s+([a-zA-Z_]+)";
SqlCmdObjects - matches for
SqlCmdObjectsShort - matches for
DdlObjects - same as
SqlCmdObjects plus create
DdlObjectsPrepared - same as
DdlObjects, but removed
sqlcmd variable) restriction
DdlObjects_ - same as
DdlObjects, but removed enclosing
DdlObjectsPrepared_ - same as
DdlObjectsPrepared, but removed enclosing
DdlObjectsPreparedProcViewTrigger - DDL statements for only proc, views, triggers and functions, i.e., only code objects
DdlObjectsPreparedWithIndex - same as
DdlObjectsPrepared_, but extended with
DdlIndexAll - same as
DdlObjectsPrepared_, but extended with
Variables - regex that finds all
@variables in script
SqlCmdVariables - finds SQL CMD variables (like
:SETVAR var1 val1)
The heart of this software is the following lines of code:
var range = Syncronized(() => fastColoredTextBox1.Selection.Clone());
range.Start = new Place(0, 0);
range.End = Syncronized(() => new Place(tb.GetLineLength(tb.LinesCount - 1),
tb.LinesCount - 1));
So we just load SQL script into FastColoredTextBox (thanks alot to its authors!!!) and apply some regex to all its contents. As the output, we got
RESULT variable, which contains a list of found ranges.
Range is the powerful class (thanks again to
Range contains line number and column number of found fragment in SQL script. We just store list of ranges in virtual
listbox (common technique) and on
SelectedIndexChanged do the following:
fastColoredTextBox1.Selection = range;
by these two lines, we instantly navigate to found piece of code (i.e., DDL statement).
Then we build another list of ranges, but instead of regex string, we put the name of some object to be found.
The last step is to find occurrences of second list of ranges in the first list of ranges. This is done in method:
private void FindUsage(string regex)
That's all, the rest of the work is just common .NET coding to build up all together.
As a brief to the above - the logic is as follows:
- We find all
alter statements by using regex. And remember those line numbers.
- We find all occurrences of text (object, e.g., table name) by using regex. And also remember line numbers.
- Now join these two datasets, finding, where the text line is in between
alter statements lines
As a result, we got a simple GUI designer, where you can:
- open a SQL script or whole folder (scripts will be merged into temp file and opened as single script)
- Search all DDL statements (
ALTER) - list will be build on the left pane, with full navigation support.
- select some piece of text by mouse (e.g., some SQL object name)
- right click and select "Find Usages" - list will be build on the right pane - these are SQL objects, containing selected text
Another good thing is that you can search not only SQL objects occurrence in SQL objects, but almost any piece of code, text, comment, etc.
As this way of analyzing SQL dependencies is based not on metadata info, you should always keep in mind that you're joining two datasets of rows/ So there are some limitations, or, better say, "features" :)
Let's say we have a stored procedure:
create proc test
declare @somevar int
create table #tmp(
If we apply regex matching "
CREATE TABLE" statement, we match
temp table along with
Then, if we try to search "
COLUMN_WE_SEARCH" - it will be found inside that
#tmp, not in procedure test.
This can be worked around with more precise initial regex. For such cases, I wrote
DdlObjectsPreparedProcViewTrigger regex statement (see the above).
Points of Interest
I want to add MSAGL https://github.com/Microsoft/automatic-graph-layout support to visualize dependencies.
FastColoredTextBox - is the high-end, the total breakthrough! I didn't even expect that it would have so many cool features!
- 10th January, 2018: Initial version