Introduction
The OmniSearch project was founded around the idea of designing a customizable
search engine that integrates seamlessly into any existing website that runs the
ASP.NET Framework.
OmniSearch accomplishes this by caching the results from the Google API and
displaying them by transforming the cached XML with XSL.
This gives the Administrator flexibility and keeps the programming simple.
OmniSearch will use the Google API as the backbone of the
OmniSearch engine, which will do the bulk of the processing for the
application. OmniSearch’s fully
queryable statistics engine will collect users’ click-through habits
for later analysis by the Administrator of the site. The
OmniSearch engine will serve a variety of functions including, but not limited
to:
- Maintaining a statistical record of click-throughs.
- Caching search results from Google’s XML SOAP API.
- Providing a wireless WAP interface so that mobile users can take advantage of OmniSearch.
- Providing components that integrate into existing ASP.NET projects that webmasters may be running.
Background
This project’s main goal is to work around the limit that
the Google API imposes on its users: only 1000 queries per day. Our
team plans to work around this limit by caching the serialized results from
the Google API in XML files. Subsequent
web searches will first check the XML cache for previous search results
that satisfy the current user's query.
Because of the cache, OmniSearch will become faster and better able to handle
search requests the more the site's users use it.
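The cache-first lookup described above can be sketched as follows. This is only an illustration under stated assumptions, not OmniSearch's actual code: the class name CacheFirstLookup and the fetchFromService delegate (a stand-in for the real Google API call) are hypothetical, and expiry of old entries is left out.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of OmniSearch.
public static class CacheFirstLookup
{
    // Returns the cached XML if the file exists; otherwise calls the
    // service once, stores the result, and returns it.
    public static string GetResults(string cacheFile, Func<string> fetchFromService)
    {
        if (File.Exists(cacheFile))
        {
            // Cache hit: no call is counted against the daily API limit.
            return File.ReadAllText(cacheFile);
        }

        // Cache miss: spend one API call and remember the answer.
        string xml = fetchFromService();
        File.WriteAllText(cacheFile, xml);
        return xml;
    }
}
```

With this shape, repeating the same query costs one API call in total, which is why the engine gets cheaper to run the more it is used.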
One of the hardest parts of being a website administrator
is collecting useful information about users’ habits on your site. So
to further challenge the OmniSearch team, we decided to add a click-through
logging feature that lets administrators see which pages are
getting the most traffic. This
will also help users find popular pages faster, because the
administrator will be able to show them lists such as “The Site of the
Day”, “Top 5 Sites for this Month”, and so on.
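As a rough illustration of how a "Top 5" list could be derived from logged click-throughs, here is a minimal sketch. It assumes the log is available as a plain sequence of clicked URLs. The article does not describe how OmniSearch stores its statistics, so ClickStats and TopSites are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of OmniSearch.
public static class ClickStats
{
    // Counts clicks per URL and returns the most-clicked URLs first.
    // Ties are broken alphabetically so the result is deterministic.
    public static List<KeyValuePair<string, int>> TopSites(
        IEnumerable<string> clickedUrls, int count)
    {
        return clickedUrls
            .GroupBy(url => url)
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
            .OrderByDescending(pair => pair.Value)
            .ThenBy(pair => pair.Key, StringComparer.Ordinal)
            .Take(count)
            .ToList();
    }
}
```

Calling ClickStats.TopSites(clicks, 5) over one month of log entries would give the data behind a "Top 5 Sites for this Month" box.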
In addition to the cached search results and statistics
database, OmniSearch will also support WAP interfaces for the search engine. By
using the ASP.NET mobile architecture, the OmniSearch team is able to provide a
wireless interface that customizes itself according to the wireless
browser that is being used.
Using the code
The following parameters need to be set up in the web.config file before
anything will work correctly.
<appSettings>
<add key="CacheLocation" value="\cache" />
<add key="DirectoryName" value="\OmniSearch" />
<add key="WebSiteLocation" value="D:\Websites\Development\OmniSearch" />
<add key="ProxyHost" value="" />
<add key="ProxyPort" value="" />
<add key="ProxyUsername" value="" />
<add key="ProxyPassword" value="" />
<add key="GoogleKey" value="" />
<add key="GoogleMaxReturn" value="10" />
<add key="GoogleFilter" value="False" />
<add key="GoogleRestrict" value="" />
<add key="GoogleSafeSearch" value="False" />
<add key="DomainToSearch" value="ist.psu.edu" />
<add key="ExpireTime" value="5" />
</appSettings>
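Because nothing works until these values are filled in, it can help to validate them once at start-up. The sketch below is an assumption about how that might look, not code from OmniSearch: SearchSettings is a hypothetical class, and treating GoogleKey and DomainToSearch as required is a guess based on the sample above.

```csharp
using System;
using System.Collections.Specialized;

// Hypothetical helper, not part of OmniSearch: a typed view over a few
// of the appSettings keys shown above.
public sealed class SearchSettings
{
    public string GoogleKey;
    public string DomainToSearch;
    public int MaxReturn;
    public bool Filter;
    public bool SafeSearch;

    public static SearchSettings From(NameValueCollection settings)
    {
        SearchSettings result = new SearchSettings();
        // Assumption: these keys must be non-empty for a search to run.
        result.GoogleKey = Require(settings, "GoogleKey");
        result.DomainToSearch = Require(settings, "DomainToSearch");
        result.MaxReturn = int.Parse(Require(settings, "GoogleMaxReturn"));
        result.Filter = bool.Parse(Require(settings, "GoogleFilter"));
        result.SafeSearch = bool.Parse(Require(settings, "GoogleSafeSearch"));
        return result;
    }

    // Fails early with the name of the missing key.
    private static string Require(NameValueCollection settings, string key)
    {
        string value = settings[key];
        if (value == null || value.Trim().Length == 0)
            throw new InvalidOperationException("Missing appSettings value: " + key);
        return value;
    }
}
```

In an ASP.NET page the collection would come from ConfigurationSettings.AppSettings, the same property the page code in this article reads GoogleMaxReturn from.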
The following is an example of the XSL file that can be used against the
serialized form of the XML Cached Search from Google's Web Service. This XSL
file displays the search contents just like you would see on the Google Search
Page. In the source code this file is called searchresults.xsl.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" version="4.0" encoding="iso-8859-1" indent="yes" />
<xsl:template match="/">
<xsl:apply-templates select="GoogleSearchResult" />
</xsl:template>
<xsl:template match="GoogleSearchResult">
<table width="100%" border="0" cellspacing="0" cellpadding="0" bgcolor="#6699cc"
style="padding: 5px; margin-top: 5px;" ID="Table1">
<tr style="color:white;">
<td align="left">Search <b>ist.psu.edu</b> for <b>
<xsl:value-of select="searchQuery" />
</b></td>
<td align="right">Results <b>
<xsl:value-of select="startIndex" />
</b> - <b>
<xsl:value-of select="endIndex" />
</b> of about <b>
<xsl:value-of select="estimatedTotalResultsCount" />
</b>.</td>
</tr>
</table>
<table border="0" cellspacing="0" cellpadding="0" ID="Table2">
<xsl:for-each select="resultElements/ResultElement">
<tr>
<td>
<b>
<a style="color:blue;" href="/OmniSearch/forward.aspx?url={URL}">
<xsl:value-of select="title" />
</a>
</b>
</td>
</tr>
<tr>
<td>
<xsl:value-of select="snippet" />
</td>
</tr>
<tr>
<td>
<span style="color:green;"><xsl:value-of select="URL" />
- <xsl:value-of select="cachedSize" /></span>
<a style="color:gray;"
href="http://216.239.53.100/search?q=cache%3A{URL}+{searchQuery}">
Cached</a>
<a style="color:gray;"
href="http://www.google.com/search?q=related%3A{URL}">
Similar pages</a>
</td>
</tr>
<tr>
<td>
<br />
</td>
</tr>
</xsl:for-each>
</table>
<table width="100%" border="0" cellspacing="0" cellpadding="0" bgcolor="#6699cc"
style="padding: 5px; margin-top: 5px;" ID="Table3">
<tr style="color:white;">
<td align="left">
<b>
<a style="color:white;"
href="/OmniSearch/search.aspx?search={searchQuery}&amp;back={startIndex}">
« Back</a>
</b>
</td>
<td align="right">
<b>
<a style="color:white;"
href="/OmniSearch/search.aspx?search={searchQuery}&amp;forward={endIndex}">
Next »</a>
</b>
</td>
</tr>
</table>
</xsl:template>
</xsl:stylesheet>
The following code is all that is needed in order to implement the OmniSearch
engine in any ASP.NET web page. It took a lot of time, but I wanted to make
it simple to use and still really powerful.
protected System.Web.UI.WebControls.Xml resultsXml;
private SiteSearch search;
private int start;

private void Page_Load(object sender, System.EventArgs e)
{
    search = new SiteSearch(Request.QueryString["search"]);
    string forward = Request.QueryString["forward"];
    string back = Request.QueryString["back"];

    if (forward != null && back == null)
        // "Next" was clicked: continue after the last result shown.
        start = Convert.ToInt32(forward) + 1;
    else if (forward == null && back != null)
        // "Back" was clicked: step back one page from the current start index.
        start = Convert.ToInt32(back) -
            Convert.ToInt32(ConfigurationSettings.AppSettings["GoogleMaxReturn"]);
    else
        // First page, or both parameters supplied.
        start = 1;

    // Never page before the first result.
    if (start < 1)
        start = 1;

    resultsXml.DocumentSource = search.DoSearch(start);
}
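The paging arithmetic in Page_Load can also be pulled out into a small pure function, which makes it easy to check on its own. The following is a sketch of that idea rather than part of the OmniSearch source. Paging and ComputeStart are hypothetical names, and unlike Convert.ToInt32 this variation falls back to the first page when a value is not a number.

```csharp
using System;

// Hypothetical helper, not part of OmniSearch.
public static class Paging
{
    // forward: endIndex of the page the user is leaving via "Next", or null.
    // back:    startIndex of the page the user is leaving via "Back", or null.
    // pageSize: the GoogleMaxReturn setting.
    public static int ComputeStart(string forward, string back, int pageSize)
    {
        int value;
        int start = 1;

        if (forward != null && back == null && int.TryParse(forward, out value))
            start = value + 1;          // continue after the last result shown
        else if (forward == null && back != null && int.TryParse(back, out value))
            start = value - pageSize;   // step back one page

        // Never page before the first result.
        return Math.Max(start, 1);
    }
}
```

With a page size of 10, clicking "Next" on results 1 to 10 passes forward=10 and yields a start of 11. Clicking "Back" on results 11 to 20 passes back=11 and yields a start of 1.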
This is an example of what the ASP.NET page can look like when you put it all
together. The following line of markup is all you really need to implement the
OmniSearch engine in any ASP.NET application.
<asp:xml id="resultsXml" runat="server" TransformSource="searchresults.xsl"></asp:xml>
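The asp:xml control applies the stylesheet on the server, so the browser only ever receives HTML. Outside of a page, the same server-side transformation can be sketched with XslCompiledTransform from System.Xml.Xsl. Note that this class arrived in .NET 2.0, so it is newer than the framework this article was written against. ServerSideTransform is a hypothetical name used only for illustration.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical helper, not part of OmniSearch.
public static class ServerSideTransform
{
    // Applies an XSL stylesheet to an XML document, both given as strings,
    // and returns the transformed output.
    public static string Apply(string xml, string xsl)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader sheet = XmlReader.Create(new StringReader(xsl)))
        {
            transform.Load(sheet);
        }

        using (XmlReader input = XmlReader.Create(new StringReader(xml)))
        using (StringWriter output = new StringWriter())
        {
            // No stylesheet parameters are passed, hence the null argument list.
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }
}
```

Passing the contents of a cached result file and of searchresults.xsl would produce the HTML that the control writes into the page.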
Points of Interest
I learned a few major things while implementing this code:
- Google’s Web Service is very powerful but has many limitations beyond the 1000 queries per day.
- XML serialization was very useful for completing this project.
- XSL can be used in .NET even if the client browser doesn't support XSL, because the transformation runs on the server.
- To get around some obstacles in naming the XML cache files, I had to use an MD5 value of the search query plus an index number appended to the MD5 for the specific page. The file is written to the /cache directory in this format: [search query (MD5 Value)].[page index].xml
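The file naming scheme from the last point can be sketched as below. The article does not say how the query is encoded or how the hash is rendered, so this example assumes UTF-8 bytes and lowercase hexadecimal. CacheFileName is a hypothetical name.

```csharp
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not part of OmniSearch.
public static class CacheFileName
{
    // Builds "[MD5 of query].[page index].xml". Hashing keeps characters
    // that are illegal in file names (such as ? or :) out of the name.
    public static string For(string searchQuery, int pageIndex)
    {
        using (MD5 md5 = MD5.Create())
        {
            // Assumption: the query is hashed as UTF-8 bytes.
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(searchQuery));

            StringBuilder hex = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
            {
                hex.Append(b.ToString("x2"));   // two lowercase hex digits per byte
            }

            return hex.ToString() + "." + pageIndex + ".xml";
        }
    }
}
```

For example, CacheFileName.For("hello", 1) gives 5d41402abc4b2a76b9719d911017c592.1.xml. MD5 is used here only to derive a fixed-length, file-system-safe name, not for security.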
One thing needs to be done before OmniSearch can be used:
- The caching directory /cache needs to grant full access to [computer name]\ASPNET. This needs to be done so that OmniSearch can write the XML Cache files to the directory.
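A quick way to confirm that the permission is in place is to try writing a throwaway file from code running under the same account. The sketch below is one way to do that and is not part of OmniSearch. CacheDirectoryCheck and CanWrite are hypothetical names.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of OmniSearch.
public static class CacheDirectoryCheck
{
    // Returns true if the current account can create and delete a file
    // in the given directory, false if access is denied or the directory
    // does not exist.
    public static bool CanWrite(string directory)
    {
        try
        {
            string probe = Path.Combine(directory, Path.GetRandomFileName());
            using (File.Create(probe)) { }   // create and immediately close
            File.Delete(probe);
            return true;
        }
        catch (UnauthorizedAccessException)
        {
            return false;   // the account lacks write permission
        }
        catch (IOException)
        {
            return false;   // includes a missing directory
        }
    }
}
```

Running this from within the web application, rather than from a console under your own login, is what actually exercises the ASPNET account's rights.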
I will eventually be integrating this engine into the OmniPortal project that I
am also developing. More about OmniPortal can be found at
http://www.sourceforge.net/projects/omniportal or
http://beta.omniportal.net (Shameless Plug).
History
2/10/2003 - First Release.