Azure SQL Database does not support the
CONTAINS is important because it supports searches for precise or fuzzy (less
precise) matches to single words and phrases, words within a certain distance
of one another, or weighted matches.
CONTAINS is a predicate used in the
clause http://technet.microsoft.com/en-us/library/ms188047.aspx of a
Transact-SQL SELECT statement.
there are ways to get this functionality with the help of third party software. I recently downloaded all the movie information for films that took place in
San Francisco. The San Francisco government website provided this interesting
information that I wanted to search through very quickly. The data wasn't very
normalized so there are many columns that are difficult to search. I downloaded
this information (https://data.sfgov.org/Arts-Culture-and-Recreation-/Film-Locations-in-San-Francisco/yitu-d5am) and
imported it into a database. Specifically I used SQL database hosted in the
Windows Azure platform.
goal was to be able to find records in the database table, searching in all
columns and in all rows quickly and efficiently. Essentially, I wanted to build
one master index that would allow me to perform the search locally against the
generated index and then be able to pull up the specific record in the database
table once I found what I was looking for.
is a product that allows you to accomplish this goal. I downloaded it
and installed it. Of course, I use it for many of my other searching needs as
well. dtSearch has a variety of products. This post will focus on how you
can index a database table so that you can perform lightning quick full-text
How To Get Full-Text Search
Capabilities in a Windows Azure SQL Database
this hands-on lab, you will learn:
dtSearch can be used to create a full index over an Azure-hosted SQL Database
- How to
store the index locally on the client for super fast querying with fuzzy logic
- How to
identify and build the appropriate connection string (OLEDB) to enable the
Visual Studio Sample Project application provided by dtSearch
- How to
perform full text searches and find the matching record in SQL Azure
develop this example further, it is possible to extend the sample and retrieve
matching records from dtSearch.
following is required to complete this hands-on lab:
- A Windows
evaluation version of dtSearch (http://www.dtsearch.com)
order to execute the exercises in this hands-on lab you need to set up your
by logging into the Windows Azure Portal (http://manage.windowsazure.com).
the SF Movie data (https://data.sfgov.org/Arts-Culture-and-Recreation-/Film-Locations-in-San-Francisco/yitu-d5am) and load
the data into Windows Azure SQL Database.
Task 1 – Registering the dtSearch
Assembly, Viewing the Windows Azure SQL Database at the Portal
task is about creating a VM where we will store all of our searchable content.
below that I am registering one of the assemblies that gets installed with dtSearch.
Regsvr32 is an essential command to get things started.
post is using Visual Studio 2013 as the IDE. From the menu choose ** File |
Open | Project**.
to the installation folder for dtSearch. (C:\Program Files
(x86)\dtSearch Developer\examples\cs4\ado_demo). Open
the project as seen below.
in the screen below that the database already exists. The database is called SFmovies.
As mentioned previously, the data represents the movie filmed in San Francisco
over the last several decades.
the database at the Windows Azure portal
2– Getting the connection string information
this next section we will get the connection string information. Unfortunately,
Windows Azure portal does not provide the exact format of the connection string
that is needed.
that in the lower right corner of the figure below you can view the connection
information for your database. This information is necessary so that the Visual
Studio project can connect to the database and index the information inside of
screen below is the connection string needed to connect to the database.
dtSearch requires that this connection string uses a different format
called, OLEDB. We will take the information in the ADO.NET connection string
and modify it to be conformant to OLEDB.
the ADO.NET connection string
in the figure below the connection string has been altered to conform to the
format required by dtSearch. This is the format required by OLEDB.
reformatted connection string
connection string needs to be one continuous line, as seen below. Your
connection string will obviously differ, based on the database you are hosting
in the Windows Azure cloud.
3 – SQL Server Management Studio
will use SQL Server Management Studio to view the database that we wish to
order to test connectivity and view the actual data structures, we will start
SQL Server Management Studio. From the Object Explorer window, click database
Server Management Studio
same information that was used for the connection string will work here within
SQL Server Management Studio.
to the database
seen below, choose to create a query that will display the records from the
the movies table in the SFmovies database
in the figure below all the information for the movies is displayed within SQL
Server Management Studio.
the movies data
that we have the connection string, we are able to start executing the demo
(ADONETDemo). You can download this project here: (C:\Program Files
(x86)\dtSearch Developer\examples\cs4\ado_demo\ADONETDemo.sln).From the Build
menu you can Build Solution.
the dtSearch project
the project is running, you will see a form pop-up. Click the button that reads
Index a Database.
indexing the database
the index directory in the figure below. Once the database is indexed,
the files that represent the index will be stored in the folder below. Click on
the Index button below.
this point the application will request your OLEDB connection string.
Paste into the box below. Then click OK.
the connection string
this point, it might take a few moments for the indexing to complete. The lower
red box represents the progress being made during the indexing process.
the folder indicated previously, you can see a list of index files that were
generated during this process.
the generated indexes
that the indexing process is complete, we are ready to run the application and
perform a search. Click on the search command button below.
for embedded strings in the database
you type in the search string, you will see the successfully found strings
on the list box on the left. Click on the search button to actually dig
deeper into the search results.
a search string
that all the results are presented in the search results screen. Notice
in the lower left that we have the movie row ID, which uniquely
identifies the record in the database. We cannot return to SQL Server
Management Studio to view this record.
Note: In a real
app you could extract the row id and perform another lookup.
the details of the search
that if we do navigate to movie row ID of 301, we can see that the Golden
Gate Bridge is visible in the SQL Server Management Studio results screen.
This post has now come full circle.
the search results
able to find content quickly in a SQL Database is key to being productive.
Building upon the example in this post, you should be able to create
applications that are capable of lightning quick full-text searches.
Beyond what is described here, the dtSearch Engine can also support many other
advanced options, including: faceted search, fields-based positive and
negative term weighting, and searching the full-text of BLOB data in an SQL
database with highlighted hits.
see Databases and Field Searching at http://support.dtsearch.com/faq/bysubject.htm for more
information on these advanced search options. Please also see http://www.dtsearch.com/PLF_Features_2.html for a
general overview of dtSearch full-text search options. For an overview of
what you can do generally programming with the dtSearch Engine, including
supported data types for BLOB data, etc., please see http://www.dtsearch.com/PLF_engine_2.html and http://www.dtsearch.com/PLF_DocFilters.html
Starting with Turbo C and 8086 Assembler in the late 80s, Bruno has kept busy teaching and writing code in a multitude of platforms, languages, frameworks, SDKs, libraries, and APIs.
Bruno's depth of knowledge comes from years of experience in the field, where he can bring real-world knowledge and combine it with forward thinking that is required for his current role as a Developer Evangelist at Microsoft. Prior to evangelism, Bruno was a Microsoft Premier Field Engineer, helping customers in remote locations on a moment's notice to help with extreme troubleshooting scenarios, including problem isolation and correction, live and post-mortem debugging, on-the-fly application design and code reviews, performance tuning (IIS, SQL Server, .NET), application stability, porting / migration assistance, configuration management, pre-rollout testing and general development consulting.
As an evangelist, Bruno spends time writing code and giving live presentations on building cloud based applications, specifically using the Windows Azure Platform. He also takes a strong interest in Mobile Computing and is convinced that both mobile and cloud platforms, separately and together, are poised for huge growth over the next 10 years.
Bruno is very optimistic about the potential for new interactions with software using Kinect. The software industry has been languishing in aging approaches to human interactions and software. Kinect opens up an brave new world of possibilities to next generation software engineering.