Click here to Skip to main content
Click here to Skip to main content
Go to top

Very Powerful, Generic, Irony Based, Database Searcher

, 26 Aug 2010
Rate this:
Please Sign up or sign in to vote.
Usage of Irony to produce a Google-like search tool on any column in a database

Introduction

In this article: Google like Full Text Search, I had the joy of experimenting with the implementation of a Google like FTS Search engine and was wondering if I could get something more general to work. In that quest, I downloaded and tested the Irony project from http://irony.codeplex.com/.

And I began playing around with it… a lot. There are still some documentation pieces missing but the general idea is really simple and I suggest you test it yourself and try doing some grammars (it’s always useful to remember those Compiler days in the university).

To generalize the idea, I talked with Ivan and converged that the new searcher should be able to search in any kind of database configuration with any type of information. This should allow the user to configure a little library (the one that I will describe here) and provide powerful functionality as the one shown in these examples:

  1. This example’s configuration is used throughout the article. It allows searching articles in a database filtering by id, code, description and price. In the example, we are asking the articles with id between 100003 and 100009 and with description containing MAGGI.

  2. In this example, I have downloaded a part of the IMDB database and queried it to know which actors were born on a 15th, June.

Grammars for Different Types

What I did was to simplify the problem to 4 types of information that cover all of the possibilities for a SQL Server column:

  • Numeric: All integers, doubles, floats go in this category
  • Date: All date, datetime, time types
  • String: All type of varchars
  • FTS: All those varchars and text that have a full-text index defined

For each of these columns, I defined a grammar with similar operations between them. Some examples are given here:

Type Description Example
NUM All data bigger than 3 and smaller than 10 >3 and <10
4:9
NUM Different than 5 or (smaller than 3 and bigger than 0) !5 or (<3 and >0)
STR Like 'vladimir' vladimir
STR Begins with ‘start’ and ends with ‘end’ start* and *end
start* *end
DAT Later than 01/01/1980 (Dates are always in dd/mm/yyyy format) >01/01/1980
DAT In august, 9th. Any year d(08) m(08)
DAT Between the 01/01/1990 and 31/01/1990 01/01/1990:31/01/1990
FTS Inflectional forms of ‘lived’ and inflectional forms of ‘in’ and inflectional forms of ‘germany’ Lived in germany
FTS Contains ‘lived in germany’ "Lived in germany"

You are welcome to browse each of the grammars defined for this library. They contain the operations needed for all combinations of the above (as long as the column type is coherent with the operation).

Now that we have the required grammars we need to know how to interpret them. For example, the numeric grammar should interpret to a SQL query that conforms to the requested expression.

The library contains the interpretation for all grammars to SQL like expressions. Please note that each of the interpreters contain an association to the column that is going to be filtered with this expression. In the following lines, we will explore the structure that defines the search to be performed.

All these interpreters were inspired by the excellent FTS interpreter given in the link above and so I leave for you the analysis of these classes.

Field Grammar & Search Structure

Up until now, all we have is a way to interpret an expression (in any of the 4 defined types) and form a SQL expression with it. In other words, all we have is the WHERE clause for each of the columns that participate in the search.

Now, we will define a grammar to define the columns and the search expressions for each of the columns. Let’s say we have 2 searchable columns: bio and name. We would like to have a grammar that understands the following:

bio=lived in germany & name=Jurgen

We could parse this expression, and interpret each of the columns with the corresponding interpreter. Finally, we would obtain a nicely formed SQL expression that would do the search for us. The grammar defined to deal with this is:

Expression =>
	ColumnSearch |
	Expression QuerySearchOp ColumnSearch
		
ColumnSearch =>
	Name EqualOp Term
		
QuerySearchOp =>
	AndOp | 
	OrOp
	
EqualOp =>
	"="

AndOp =>
	"&"

OrOp =>
	"|"

The class defining this grammar gives the correct priorities to our operators and we quickly have a simple grammar to deal with the desired expression. However, the grammar needs to define only one search criteria per line. So, the search component adds a new line before each search term.

The search structure that works with this library contains the following:

Class Diagram Search Structure

A search structure contains a collection of tables. Each table contains a join (except for the first table) and a collection of columns. A column contains:

  • Name of the column: This is the name of the column in the database
  • Keyword: The keyword that the grammar will recognize as valid to refer to this column
  • Searchable: Whether this column is searchable or not
  • Title: The title of the column in the resulting dataset
  • Appear in results: Whether the column appears in the resulting dataset or not
  • Type: The column type defines the grammar that will be used to interpret the search term for the column
  • Appear in Standard Search: This will be discussed in the final subtitles of this article.

Forming the SQL Expression

We are ready to pass a human like query to our interpreter and this should validate the expression and interpret each part of it to produce a valid SQL expression conforming to the given query. Specifically, the intervening classes are:

All we have to do is call the GetQuery() method of a SqlServerInterpreter object to obtain the requested SQL query. The constructor of the SqlServerInterpreter will ask for a structure that defines the columns and tables to search.

This solution is needed many times in a system, for different tables and different columns. This is why we should have many configurations to work with our library. Also, we need the configurations to be read from a database or defined directly on the client. The library provides the ability to collect, in a singleton hash table, the different configurations of the search structures.

Also, the resulting object is very important. The SearchResult class defines an object that contains the resulting query for the query expression and the resulting DataSet after performing the search. The class diagram for the Searcher and the SearchResult is the following:

Searching

The use of this structure is very simple. Here we have an example of a search to be performed in two tables: tbl_in_articulos and tbl_fa_precios. The join between these two tables is when tbl_in_articulos.id = tbl_fa_precios.id_articulo. The resulting DataSet should contain:

  • id: numeric and searchable and keyword=id
  • codigo_articulo: string and searchable and keyword=cod
  • descripcion: string and searchable and keyword=desc
  • precio: numeric and searchable and keyword=pre

The constructor expects a SearchStructure and so we subclass this class, define the columns and tables in the constructor of this subclass, and add them to the structure with the given methods.

  1  public class ArticlesSearch : SearchStructure
  2  {
  3      public ArticlesSearch()
  4      {
  5          TableStructure basetbl = new TableStructure("tbl_in_articulos");
  6  
  7          Column col = new Column("id", "id", "id", 
  8                           Column.ColumnType.Numeric, basetbl.Name);
  9          basetbl.addColumn(col);
 10          col = new Column("codigo_articulo", "Codigo", "cod", 
 11                           Column.ColumnType.String, basetbl.Name);
 12          basetbl.addColumn(col);
 13          col = new Column("descripcion", "Descripcion", "desc", 
 14                           Column.ColumnType.String, basetbl.Name);
 15          basetbl.addColumn(col);
 16  
 17          this.addTable(basetbl, null);
 18  
 19          TableStructure tbl = new TableStructure("tbl_fa_precios");
 20  
 21          col = new Column("id_articulo", "id_articulo", "none", 
 22                           Column.ColumnType.Numeric, tbl.Name);
 23          col.Searchable = false;
 24          col.AppearInResults = false;
 25          tbl.addColumn(col);
 26          col = new Column("precio", "Precio", "pre", 
 27                           Column.ColumnType.Numeric, tbl.Name);
 28          tbl.addColumn(col);
 29  
 30          JoinStatement join = new JoinStatement(basetbl.Name + ".id" , 
 31                                                 tbl.Name + ".id_articulo", 
 32                                                 JoinStatement.JoinType.Inner);
 33          this.addTable(tbl, join);
 34      }
 35  }

Now, all we need to do is instantiate an object of this structure and use it with our searcher:

  1  string query = txt.Text;
  2  Searcher srch = new Searcher(new ArticlesSearch());
  3  srch.ConnectionString = 
  4  ConfigurationManager.ConnectionStrings["SearchConnectionString"].ConnectionString;
  5  SearchResult res = srch.search(query);
  6  try
  7  {
  8      lbl.Text = res.Message.Replace("\n","
") + "
" + res.GeneratedQuery;
  9  
 10      DataView dtView = new DataView(res.Data.Tables[0]);
 11      gdv.DataSource = dtView;
 12      gdv.DataBind();
 13  } catch { ; }

The Searcher object is instantiated passing an ArticlesSearch object as a parameter. This defines the grammar to be used to parse the expressions. Next, we can define the connection string to be used by the searcher (the connection string used by default is ‘SearchConnectionString’). Finally, we search passing the text from the text box in our client.

The result is read as a DataSet in the data field of the SearchResult object returned. This can be passed to a gridview and we can have a very powerful searcher in seconds.

Another way to obtain the same result is to define the structure directly in the database. The library expects the following structure:

This structure can easily be loaded with the scripts distributed with the library below. The only thing here is that you should insert the records directly with SQL instructions. A complete example of a configuration equivalent to the one seen programmatically will be the following:

  1  insert into SearchConfig
  2  values (newid(), 'test')
  3  /* inserted record with guid = DD9DD3DF-5340-4461-86CA-B3A7A589F404 */
  4  
  5  insert into SearchTable
  6  values
  7  (newid(), 'tbl_in_articulos', 0, 'DD9DD3DF-5340-4461-86CA-B3A7A589F404')
  8  
  9  insert into SearchTable
 10  values
 11  (newid(), 'tbl_fa_precios', 1, 'DD9DD3DF-5340-4461-86CA-B3A7A589F404')
 12  
 13  select * from SearchJoin
 14  /* inserted records with guid = 
 15     FBDDB7EB-66D1-425E-A22F-5DEBC16662DC 
 16     F84F27CD-D40A-473F-9A5E-B67346E641AA
 17     respectively */
 18  
 19  insert into SearchJoin
 20  values
 21  (newid(), 'F84F27CD-D40A-473F-9A5E-B67346E641AA', 
 22    'tbl_in_articulos.id', 'tbl_fa_precios.id_articulo', 'INNER')
 23    
 24  /** The columns */
 25  insert into SearchColumn
 26  values
 27  (newid(), 'FBDDB7EB-66D1-425E-A22F-5DEBC16662DC', 'id'
 28   'Id', 'NUM', 'id', 1, 1, 0)
 29   
 30  insert into SearchColumn
 31  values
 32  (newid(), 'FBDDB7EB-66D1-425E-A22F-5DEBC16662DC', 'codigo_articulo'
 33   'Codigo', 'STR', 'cod', 1, 1, 0)
 34   
 35  insert into SearchColumn
 36  values
 37  (newid(), 'FBDDB7EB-66D1-425E-A22F-5DEBC16662DC', 'descripcion'
 38   'Descripcion', 'STR', 'desc', 1, 1, 0)
 39  
 40  /** This is needed for the join */
 41  insert into SearchColumn
 42  values
 43  (newid(), 'F84F27CD-D40A-473F-9A5E-B67346E641AA', 'id_articulo'
 44   'ID_Articulo', 'NUM', 'none', 0, 0, 0)
 45   
 46  insert into SearchColumn
 47  values
 48  (newid(), 'F84F27CD-D40A-473F-9A5E-B67346E641AA', 'precio'
 49   'Precio', 'NUM', 'pre', 1, 1, 0)

In this case, to use the Searcher, all we need is the following lines:

  1  Searcher srch = new Searcher("test");
  2  srch.ConnectionString = 
  3  ConfigurationManager.ConnectionStrings["SearchConnectionString"].ConnectionString;
  4  SearchResult res = srch.search(query);

Final Remarks

I have tried to provide the most functionality to the searcher. I currently use it with no problems for most intranet applications.

It sure needs some extra work to publish it to the world in a searcher in internet. The most critical issue regarding an internet publication would be the capability to suggest the possible term to expect. This should easily be done by Irony and I expect to include this feature in the library in the short term.

Another pending task in my to-do list is the ability to apply the search term to all columns. For example, when we just type: ‘lived in Germany’; we would like it to apply to all columns that are searchable. This is why the Column structure has a StandardSearch flag.

Although the less than and bigger than operators are straight forward there are some issues when using it in an ASP.NET page because of the validation; once that is disabled everything works fine.

Conclusion

A generic searcher is possible combining different Human-like-query-to-SQL interpreters for different grammars. In this article, we have seen how this can be done for 4 different types of data that cover most of the possibilities in a SQL Server.

The library provides a very simple interface to incorporate powerful search capabilities to any table(s) in your database. The view capability (order by, filtering, etc.) is left to be worked in the client. The only task the searcher does is exactly that: search and return a data set.

History

  • 24th August, 2010: First release

License

This article, along with any associated source code and files, is licensed under The Microsoft Public License (Ms-PL)

Share

About the Author

Vladimir Calderon
Chief Technology Officer Artexacta
Bolivia Bolivia
just a coder. Love Java, but working on asp.net
Follow on   Twitter   Google+

Comments and Discussions

 
GeneralMy vote of 5 PinmemberMember 443298625-Feb-12 5:23 
Fantastic idea for any kind of shell command interpreter creation.
GeneralMy vote of 1 Pinmemberwebooth3-Sep-10 5:53 
GeneralMy vote of 5 Pinmembertrhalvorson31-Aug-10 4:56 
GeneralMy vote of 3 Pinmemberkarabax25-Aug-10 5:59 
GeneralLooks good... PinmentorSandeep Mewara24-Aug-10 10:26 
GeneralRe: Looks good... PinmemberVladimircy24-Aug-10 12:29 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Mobile
Web02 | 2.8.140916.1 | Last Updated 26 Aug 2010
Article Copyright 2010 by Vladimir Calderon
Everything else Copyright © CodeProject, 1999-2014
Terms of Service
Layout: fixed | fluid