Visit http://www.nata1.com/ to contribute to the project. We are currently looking to fill many developer roles. Nata1 is the most powerful .NET search engine solution on the market, and it's free for non-commercial use. Nata1 allows you to index large sites, perform powerful queries, and configure advanced normalization and search customization features. Nata1 includes dozens of advanced server controls that allow you to drag and drop an advanced search solution together in minutes. Nata1 allows you to switch between Google, Nata1, and Index Server without writing any custom code. This first article demonstrates the Nata1 ASP.NET controls; the next article will discuss adding search capabilities to the TaskVision application, where we will build a site health monitor, check site ranking in Google, and create tasks accordingly.
Using Microsoft Index Server to develop a site search engine four years ago was a lot of fun. At the time, the flexibility to control noise words and which parts of a web site got indexed was interesting.
Imagine you wanted to find “all good surfing places” in Costa Rica, but the site copy uses surf, not surfing, and words like all and places do not appear in the copy and are not relevant. You can use Index Server, but it's much better to have your own control. Imagine you needed to collect information on what people are searching for on your site, or you wanted to weight pages, exclude directories, etc.
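To make the surfing example concrete, here is a minimal sketch of query normalization, written in Python for brevity. It is not Nata1 code: the noise-word list, the suffix list, and the function names are all assumptions chosen for illustration.

```python
# Illustrative only: these noise words and suffix rules are assumptions,
# not Nata1's actual normalization tables.
NOISE_WORDS = {"all", "places", "the", "a", "in", "of"}
SUFFIXES = ("ing", "ed", "s")  # checked in this order


def normalize_word(word):
    """Lower-case a word and strip at most one common suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words survive intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize_query(query):
    """Drop noise words, then normalize whatever is left."""
    words = [w.lower() for w in query.split()]
    return [normalize_word(w) for w in words if w not in NOISE_WORDS]


print(normalize_query("all good surfing places"))  # ['good', 'surf']
```

With rules like these, the query is reduced to terms that can match copy that only ever says surf. A real engine would apply the same normalization to the indexed pages as to the query.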
When I got my first experience with .NET during Beta II of Visual Studio, the possibilities jumped out at me and I began working on Nata1.
Using Nata1, you can drag and drop UI search engine components like hit results, relevance, etc. and switch between Google, Nata1, or Index Server without writing any custom code.
This article goes over the basics needed to build a simple search page. Here is an example: http://www.nata1.com/Photos/Project+Photos/324.aspx
For the developer who is more interested in customizing the controls and developing more advanced features, this is a good starting point. Also, computer science students studying algorithms and data structures can get a good grasp of binary search trees (BSTs) and implement their own data structures. A series of articles written by Scott Mitchell is an excellent starting point for the computer science student who wants to understand and analyze the differences between data structures like skip lists, and the properties of balanced trees. Nata1 was first implemented using BSTs on a remote machine, and although you can use SQL Server, you can also build a hybrid that uses both SQL Server and a BST with little work.
This article will show you how to get up and running with Nata1, but future articles by me and others will demonstrate developing core search engine components and controls.
Step 1: Add some configuration code to the web.config file. Configuration is read from the database first; if a value has not been set in the admin tool, Nata1 looks in the web.config, and then falls back to defaults if nothing is found. For your search engine, using the web.config is fine, but I've found storing this info in the database is preferable, and there are many admin controls included that allow you to alter everything from normalization rules to spidering settings.
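For students who want to experiment with the idea, the following is a minimal sketch of a word index built on an unbalanced binary search tree, written in Python for brevity. It is not Nata1's implementation; the class names and the page paths are hypothetical.

```python
class Node:
    """One tree node: a word plus the set of pages that contain it."""

    def __init__(self, word, page):
        self.word = word
        self.pages = {page}
        self.left = None
        self.right = None


class WordIndex:
    """Unbalanced binary search tree keyed by word."""

    def __init__(self):
        self.root = None

    def add(self, word, page):
        """Record that `page` contains `word`."""
        if self.root is None:
            self.root = Node(word, page)
            return
        node = self.root
        while True:
            if word == node.word:
                node.pages.add(page)
                return
            # Smaller words go left, larger words go right.
            branch = "left" if word < node.word else "right"
            child = getattr(node, branch)
            if child is None:
                setattr(node, branch, Node(word, page))
                return
            node = child

    def find(self, word):
        """Return the sorted list of pages containing `word`."""
        node = self.root
        while node is not None:
            if word == node.word:
                return sorted(node.pages)
            node = node.left if word < node.word else node.right
        return []


index = WordIndex()
index.add("surf", "/costa-rica.aspx")   # hypothetical page paths
index.add("beach", "/costa-rica.aspx")
index.add("surf", "/hawaii.aspx")
print(index.find("surf"))
```

Because this tree is never rebalanced, inserting words in sorted order degrades lookups to a linear scan, which is one reason balanced trees and skip lists are worth studying.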
<add key="filePath" value="C:\siteName\searchEngine\" />
<add key="site" value="siteUrlHere" /> <!-- only needed if you're indexing just one site -->
<add key="defaultPage" value="index.aspx" />
<add key="filePath" value="c:\eventLog.txt" />
<add key="connectionString" value="cn string stuff here" />
<add key="hour" value="4" />
<add key="intervalType" value="daily" />
and if you want to publish exceptions, use this
You'll also need to add these section declarations inside the configSections element:
<section name="binPath" type="Nata1.Nata1SectionHandler, Nata1" />
<section name="sites" type="Nata1.Nata1SectionHandler, Nata1" />
<section name="log" type="Nata1.Nata1SectionHandler, Nata1" />
<section name="database" type="Nata1.Nata1SectionHandler, Nata1" />
<section name="preferedIndexTime" type="Nata1.Nata1SectionHandler, Nata1" />
<section name="indexRequestTimeOut" type="Nata1.Nata1SectionHandler, Nata1" />
<section name="indexing" type="Nata1.Nata1SectionHandler, Nata1" />
<section name="indexService" type="Nata1.Nata1SectionHandler, Nata1" />
<section name="google" type="Nata1.Nata1SectionHandler, Nata1" />
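The lookup order described at the start of Step 1 (database first, then web.config, then defaults) can be pictured with a short Python sketch. This is not Nata1's code: the function name is hypothetical, and the default values simply reuse the sample values above rather than Nata1's real defaults.

```python
# Assumed defaults for illustration; Nata1's real defaults are not listed here.
DEFAULTS = {"defaultPage": "index.aspx", "hour": "4", "intervalType": "daily"}


def get_setting(key, database_settings, web_config_settings, defaults=DEFAULTS):
    """Return the first non-empty value found, searching in priority order."""
    for source in (database_settings, web_config_settings, defaults):
        value = source.get(key)
        if value:  # missing and empty values both count as "not set"
            return value
    return None


database = {"hour": "2"}                      # set through the admin tool
web_config = {"hour": "4", "site": "siteUrlHere"}

print(get_setting("hour", database, web_config))          # database wins
print(get_setting("site", database, web_config))          # falls back to web.config
print(get_setting("intervalType", database, web_config))  # falls back to defaults
```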
Step 2: Next, run the database install scripts for SQL Server. The database isn't very complex, so you can easily port it to MySQL. If you don't have a database, you can still use an in-memory binary search tree, but this isn't recommended: you always want to remote your data structures.
Step 3: Add Nata1.dll to your toolbox. Right-click your toolbox, choose “Add/Remove Items”, click Browse, and find Nata1.dll. The Nata1 controls are now added to your toolbox.
There are dozens of controls. Some are container controls, like ResultsRepeater, and others are for individual items. All the ones with a smiley icon are placed in the Item or Alternating Item template, like HitWords. You can get creative with your toolbox icons; I've included some neat ones like Homestar Runner icons. Controls like QueryTime sit in the header template. Some controls are specific to a search provider: Google has many controls, like spelling suggestions, but Index Server only has a couple, so you have to make sure the provider you select supports the controls you use.
Step 4: We'll need a form to get from a search box to the search results page. Go ahead and drag and drop “SearchForm” (the control with the ducky) onto any ascx or aspx page in your site.
To use an image, set SearchButtonText to an image URL (I know, not the most elegant), or enter text. Either way, make sure to set ButtonType as well as SearchPageUrl. As you can see, there is a bug in the designer: the image isn't updating.
Step 5: We'll build the search results page. Drag and drop “Search Results Repeater” (the one with the fairy icon) onto an ascx or aspx page.
The first of the two most important properties is “Query Provider”: here you want to select Google, Nata1, Index Server, Rss, or ASP.NET Forums. The last two are still in development; if anyone wants to develop them, be my guest.
The other property is called SearchQueryTemplate mode; here you want to select simple or advanced.
Step 6: Right click the template, choose the template you want to edit, and start dragging and dropping controls.
Here I dragged the TotalHits control onto the Header template and put an ad banner there too; you can rotate banners based on keyword if you want. There are several other templates you'll need to set, like NoResults. There's also a template for a search form, where you can specify which search form controls to place; perhaps you want an advanced search form at the top.
Step 7: Make sure you place this code in your Global.asax! When you restart your web app (I usually just add one space to the web.config), Nata1 will begin indexing and follow the index plan you have specified in the web.config or in the database.
Sub Application_Start(ByVal sender As Object, ByVal e As EventArgs)
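The index plan that Step 7 refers to is driven by settings such as hour and intervalType from Step 1. How Nata1 interprets them is not shown in this article, so the Python sketch below is only one plausible reading of a "daily" interval with an hour of 4; the function name is hypothetical and other interval types are not covered.

```python
from datetime import datetime, timedelta


def next_daily_run(now, hour):
    """Return the next time the clock reads `hour`:00, strictly after `now`."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed, so schedule tomorrow's.
        candidate += timedelta(days=1)
    return candidate


# App restarted at 10:30 in the morning: the next run is 04:00 the following day.
print(next_daily_run(datetime(2004, 1, 1, 10, 30), hour=4))
```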
There are numerous controls for administration if you want those as well. You can manually index your site, and you can manage noise words, see all the words on the site, and manage normalization (which words to normalize or not, and also special rules, e.g. running, ran, and run are the same word).
One important control that is left as an exercise is logging search words, i.e. what are people searching for on the site? How about some info about them?
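An irregular form like ran cannot be reached by trimming a suffix, so a lookup table is one simple way to express special rules of this kind. The Python sketch below is an illustration only, not Nata1's rule format: the table contents, the exclusion list, and the function name are assumptions.

```python
# Hypothetical rule tables; in Nata1 these are managed through the admin controls.
SPECIAL_RULES = {"running": "run", "ran": "run"}
DO_NOT_NORMALIZE = {"news"}  # words that must be kept exactly as written


def apply_rules(word):
    """Map a word to its normalized form using the special-rules table."""
    word = word.lower()
    if word in DO_NOT_NORMALIZE:
        return word
    return SPECIAL_RULES.get(word, word)


print([apply_rules(w) for w in ("running", "ran", "run")])  # ['run', 'run', 'run']
```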
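As a starting point for that exercise, here is a minimal Python sketch of counting what people search for. It is not a Nata1 control: the class name is hypothetical, and a real version would persist the counts to the database and could store extra details about each visitor.

```python
from collections import Counter


class SearchLog:
    """Counts how often each search phrase is submitted."""

    def __init__(self):
        self.counts = Counter()

    def record(self, query):
        """Normalize case and whitespace, then count the phrase."""
        term = " ".join(query.lower().split())
        if term:  # ignore empty searches
            self.counts[term] += 1

    def top(self, n=10):
        """Return the n most frequent phrases with their counts."""
        return self.counts.most_common(n)


log = SearchLog()
for query in ("Surf", "surf  ", "costa rica", ""):
    log.record(query)
print(log.top(1))  # [('surf', 2)]
```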
Here you have a powerful search engine you can put together in minutes, but the future of Nata1 is up to the community: I would like to see a DNN implementation, a CSK implementation, and I am currently working on a TaskVision implementation.
Hope you enjoyed my article, and let me know if you have any problems downloading the code or have any comments on the article. We are looking for contributors, so if you want to write new data structures, new controls, new providers, integrate with DNN, CSK, IBuySpy, or have other ideas, we'd love to hear them! Sedgewick@Nata1.com