OmniSearch (Google Caching Engine)
By Nick Berardi
OmniSearch attempts to show a way to reduce the number of hits to the Google Web Service by caching searches.
C#, .NET 1.0, Win2K, WinXP, ASP.NET, Dev
The OmniSearch project grew out of the idea of designing a customizable search engine that integrates seamlessly into any existing website running on the ASP.NET framework. OmniSearch accomplishes this by caching the results from the Google API as XML and rendering them with XSL, which gives the administrator flexibility and keeps the programming simple.
OmniSearch will use the Google API as its backbone, and the API will do the bulk of the processing for the application. OmniSearch's fully queryable statistics engine will collect users' click-through habits for later analysis by the site administrator. The OmniSearch engine will serve a variety of functions including, but not limited to, the result caching, click-through logging, and wireless support described in the following paragraphs.
This project's main goal is to get around the limit that the Google API imposes on its users, which allows only 1,000 queries per day. Our team plans to work around this limit by caching the serialized results from the Google API in XML files. Subsequent searches will first check the XML cache for previous results that satisfy the current user's query. Because of the cache, OmniSearch becomes faster and better able to handle search requests the more the site's users use it.
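The cache-first lookup described above can be sketched roughly as follows. This is not the OmniSearch source: the `SearchCache` class, the hashed file-naming scheme, and the `fetch` delegate that stands in for the Google API call are all assumptions made for illustration, and the sketch uses newer C# features than the .NET 1.0 this article targets. The freshness check mirrors the `ExpireTime` setting (a number of days).

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not part of the OmniSearch source.
public class SearchCache
{
    private readonly string cacheDir;
    private readonly TimeSpan maxAge;

    public SearchCache(string cacheDir, int expireDays)
    {
        this.cacheDir = cacheDir;
        this.maxAge = TimeSpan.FromDays(expireDays);
        Directory.CreateDirectory(cacheDir);
    }

    // Map a query and start index to a file name. Hashing avoids
    // characters that are illegal in file names.
    public string PathFor(string query, int start)
    {
        string key = query.Trim().ToLowerInvariant() + "|" + start;
        using (SHA1 sha = SHA1.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(key));
            string name = BitConverter.ToString(hash).Replace("-", "");
            return Path.Combine(cacheDir, name + ".xml");
        }
    }

    // Return the cached XML if it exists and is still fresh; otherwise
    // call fetch (the stand-in for the Google API request) and store
    // its result for later searches.
    public string GetOrFetch(string query, int start,
                             Func<string, int, string> fetch)
    {
        string path = PathFor(query, start);
        if (File.Exists(path) &&
            DateTime.UtcNow - File.GetLastWriteTimeUtc(path) < maxAge)
        {
            return File.ReadAllText(path);
        }

        string xml = fetch(query, start);
        File.WriteAllText(path, xml);
        return xml;
    }
}
```

With this shape, only the first request for a given query and page spends one of the daily API calls; repeats within the expiry window are served from disk.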
One of the hardest things about being a website administrator is collecting useful information about users' habits on your site. To further challenge the OmniSearch team, we decided to add a click-through logging feature that lets administrators see which pages may be getting heavy traffic. This will also help users find certain pages faster, because the administrator can publish lists such as "The Site of the Day", "Top 5 Sites for this Month", and so on.
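As a minimal sketch of how click-through counts could feed a "Top 5" list, consider the following. The `ClickLog` class and its in-memory dictionary are hypothetical; the article's own statistics engine is not shown here, and a real deployment would presumably persist the clicks rather than keep them in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory click log, not part of the OmniSearch source.
public class ClickLog
{
    private readonly Dictionary<string, int> counts =
        new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

    // Record one click on a result URL.
    public void Record(string url)
    {
        int current;
        counts.TryGetValue(url, out current); // current stays 0 if absent
        counts[url] = current + 1;
    }

    // Return the n most-clicked URLs, most popular first.
    // Ties are broken alphabetically so the order is stable.
    public List<string> Top(int n)
    {
        return counts
            .OrderByDescending(pair => pair.Value)
            .ThenBy(pair => pair.Key, StringComparer.OrdinalIgnoreCase)
            .Take(n)
            .Select(pair => pair.Key)
            .ToList();
    }
}
```

A page that redirects result links (such as the forward.aspx target used by the stylesheet in this article) would be a natural place to call something like `Record` before sending the user on.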
In addition to the cached search results and statistics database, OmniSearch will also support WAP interfaces for the search engine. By using the ASP.NET mobile architecture, the OmniSearch team can provide a wireless interface that customizes itself to the wireless browser in use.
The following parameters need to be set up in the web.config file before anything will work correctly.
<!-- application specific settings -->
<appSettings>
  <!-- Relative Location of Cache -->
  <add key="CacheLocation" value="\cache" />
  <!-- Directory Name of Website -->
  <add key="DirectoryName" value="\OmniSearch" />
  <!-- Static Location of Website -->
  <add key="WebSiteLocation" value="D:\Websites\Development\OmniSearch" />

  <!-- Proxy Settings if needed -->
  <!-- Set a host to use as an HTTP proxy. If unset,
       GoogleSearch will also check the system properties
       and use those values. If those are also unset,
       HTTP requests go direct with no proxy.
  -->
  <add key="ProxyHost" value="" />
  <!-- Set a port to use as an HTTP proxy.
       Only used if ProxyHost is also set. If unset,
       port 80 is assumed to be the default.
  -->
  <add key="ProxyPort" value="" />
  <!-- Set the username required for the HTTP proxy.
       Only used if ProxyHost is also set. -->
  <add key="ProxyUsername" value="" />
  <!-- Set the password required for the HTTP proxy.
       Only used if ProxyHost is also set. -->
  <add key="ProxyPassword" value="" />

  <!-- Google Configuration Information -->
  <!-- Set the user key used for authorization by the
       Google SOAP server. This is a mandatory attribute
       for all requests. A key can be obtained from
       http://www.google.com/apis/index.html
  -->
  <add key="GoogleKey" value="" />
  <!-- Set the maximum number of results to be returned.
       The number must be between 1 and 10.
  -->
  <add key="GoogleMaxReturn" value="10" />
  <!-- Enable or disable the "related-queries" filter.
       Must be [True | False]
  -->
  <add key="GoogleFilter" value="False" />
  <!-- Set the restrict.
       This allows you to restrict the search to a specific
       document store such as "Penn State", "IST" or any
       argument that can be used in a normal Google Search.
  -->
  <add key="GoogleRestrict" value="" />
  <!-- Enable or disable SafeSearch. When SafeSearch is
       turned on, sites and web pages containing pornography
       and explicit sexual content are blocked from search
       results. While no filter is 100% accurate, Google's
       filter uses advanced proprietary technology that
       checks keywords and phrases, URLs and Open Directory
       categories.
       Must be [True | False]
  -->
  <add key="GoogleSafeSearch" value="False" />

  <!-- Domain to search -->
  <add key="DomainToSearch" value="ist.psu.edu" />
  <!-- Number of days the cache will stay in memory -->
  <add key="ExpireTime" value="5" />
</appSettings>
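The rules stated in the comments above (GoogleMaxReturn between 1 and 10, port 80 when ProxyPort is empty, proxy values ignored unless ProxyHost is set) could be checked once at startup. The sketch below is an assumption about how that might look, not code from OmniSearch: the `OmniSettings` class and its member names are invented for illustration. It accepts a `NameValueCollection` so that it can be handed `ConfigurationSettings.AppSettings` or a hand-built collection.

```csharp
using System;
using System.Collections.Specialized;

// Hypothetical settings reader, not part of the OmniSearch source.
public class OmniSettings
{
    public int MaxReturn;
    public string ProxyHost;   // null when no proxy is configured
    public int ProxyPort;      // meaningful only when ProxyHost is set
    public int ExpireDays;

    public static OmniSettings Load(NameValueCollection app)
    {
        OmniSettings s = new OmniSettings();

        // int.Parse throws if the key is missing or not a number,
        // which surfaces a broken web.config early.
        s.MaxReturn = int.Parse(app["GoogleMaxReturn"]);
        if (s.MaxReturn < 1 || s.MaxReturn > 10)
            throw new ArgumentException(
                "GoogleMaxReturn must be between 1 and 10.");

        // Proxy port is read only when a proxy host is configured.
        string host = app["ProxyHost"];
        if (!string.IsNullOrEmpty(host))
        {
            s.ProxyHost = host;
            string port = app["ProxyPort"];
            s.ProxyPort = string.IsNullOrEmpty(port) ? 80 : int.Parse(port);
        }

        s.ExpireDays = int.Parse(app["ExpireTime"]);
        return s;
    }
}
```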
The following is an example of an XSL file that can be applied to the serialized XML of a cached search from Google's Web Service. It displays the search results much as you would see them on the Google search page. In the source code this file is called searchresults.xsl.
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" version="4.0"
      encoding="iso-8859-1" indent="yes" />

  <xsl:template match="/">
    <xsl:apply-templates select="GoogleSearchResult" />
  </xsl:template>

  <xsl:template match="GoogleSearchResult">
    <!-- Header bar: query and result range -->
    <table width="100%" border="0" cellspacing="0"
        cellpadding="0" bgcolor="#6699cc"
        style="padding: 5px; margin-top: 5px;" ID="Table1">
      <tr style="color:white;">
        <td align="left">Search <b>ist.psu.edu</b> for <b>
          <xsl:value-of select="searchQuery" />
        </b></td>
        <td align="right">Results <b>
          <xsl:value-of select="startIndex" />
        </b> - <b>
          <xsl:value-of select="endIndex" />
        </b> of about <b>
          <xsl:value-of select="estimatedTotalResultsCount" />
        </b>.</td>
      </tr>
    </table>

    <!-- One block of rows per search result -->
    <table border="0" cellspacing="0" cellpadding="0" ID="Table2">
      <xsl:for-each select="resultElements/ResultElement">
        <tr>
          <td>
            <b>
              <a style="color:blue;"
                  href="/OmniSearch/forward.aspx?url={URL}">
                <xsl:value-of select="title" />
              </a>
            </b>
          </td>
        </tr>
        <tr>
          <td>
            <xsl:value-of select="snippet" />
          </td>
        </tr>
        <tr>
          <td>
            <span style="color:green;"><xsl:value-of select="URL" />
              - <xsl:value-of select="cachedSize" /></span>
            <!-- searchQuery belongs to GoogleSearchResult, two levels
                 above the current ResultElement. The URLs are kept on
                 one line because a line break inside an attribute
                 value would end up in the generated link. -->
            <a style="color:gray;"
                href="http://216.239.53.100/search?q=cache%3A{URL}+{../../searchQuery}">
              Cached</a>
            <a style="color:gray;"
                href="http://www.google.com/search?q=related%3A{URL}">
              Similar pages</a>
          </td>
        </tr>
        <tr>
          <td>
            <br />
          </td>
        </tr>
      </xsl:for-each>
    </table>

    <!-- Footer bar: paging links -->
    <table width="100%" border="0" cellspacing="0"
        cellpadding="0" bgcolor="#6699cc"
        style="padding: 5px; margin-top: 5px;" ID="Table3">
      <tr style="color:white;">
        <td align="left">
          <b>
            <a style="color:white;"
                href="/OmniSearch/search.aspx?search={searchQuery}&amp;back={startIndex}">
              &#171; Back</a>
          </b>
        </td>
        <td align="right">
          <b>
            <a style="color:white;"
                href="/OmniSearch/search.aspx?search={searchQuery}&amp;forward={endIndex}">
              Next &#187;</a>
          </b>
        </td>
      </tr>
    </table>
  </xsl:template>
</xsl:stylesheet>
The following code is all that is needed to use the OmniSearch engine in any ASP.NET web page. It took a lot of time, but I wanted it to be simple to use and powerful at the same time.
protected System.Web.UI.WebControls.Xml resultsXml;

// Assumed declarations: the listing uses these fields without showing them.
private SiteSearch search;
private int start;

private void Page_Load(object sender, System.EventArgs e)
{
    search = new SiteSearch(Request.QueryString["search"]);

    string forward = Request.QueryString["forward"];
    string back = Request.QueryString["back"];

    // With no paging parameter (or with both), start at the beginning.
    start = 1;
    if (forward != null && back == null)
    {
        // "forward" carries the previous page's endIndex.
        start = Convert.ToInt32(forward) + 1;
    }
    else if (forward == null && back != null)
    {
        // "back" carries the current page's startIndex.
        start = Convert.ToInt32(back) -
            Convert.ToInt32(ConfigurationSettings.AppSettings["GoogleMaxReturn"]);
    }

    // Never page back past the first result.
    if (start < 1)
        start = 1;

    resultsXml.DocumentSource = search.DoSearch(start);
}
This is an example of what the ASP.NET page can look like when you put it all together. The following line is all you really need to add the OmniSearch engine to any ASP.NET application.
<asp:xml id="resultsXml" runat="server"
TransformSource="searchresults.xsl"></asp:xml>
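To try a stylesheet outside a page, the same kind of transform can be run in code. The sketch below is only an illustration under stated assumptions: it uses `XslCompiledTransform`, which arrived in later .NET versions than the 1.0 release this article targets; the `TransformDemo` class is invented; and the sample XML and the tiny stylesheet are made-up stand-ins for a real cached search and for searchresults.xsl.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical demo class, not part of the OmniSearch source.
public static class TransformDemo
{
    // Sample data using element names the stylesheet selects;
    // the values are made up.
    public const string SampleXml =
        "<GoogleSearchResult>" +
        "<searchQuery>omnisearch</searchQuery>" +
        "<resultElements><ResultElement>" +
        "<URL>http://example.com/</URL><title>Example</title>" +
        "</ResultElement></resultElements>" +
        "</GoogleSearchResult>";

    // A deliberately tiny stylesheet standing in for searchresults.xsl.
    public const string SampleXsl =
        "<xsl:stylesheet version='1.0' " +
        "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" +
        "<xsl:output method='html' />" +
        "<xsl:template match='/GoogleSearchResult'>" +
        "<xsl:for-each select='resultElements/ResultElement'>" +
        "<a href='{URL}'><xsl:value-of select='title' /></a>" +
        "</xsl:for-each>" +
        "</xsl:template>" +
        "</xsl:stylesheet>";

    // Compile the stylesheet, apply it to the XML, return the markup.
    public static string Apply(string xml, string xsl)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(xsl)))
        {
            transform.Load(styleReader);
        }

        using (XmlReader input = XmlReader.Create(new StringReader(xml)))
        using (StringWriter output = new StringWriter())
        {
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }
}
```

Calling `TransformDemo.Apply(TransformDemo.SampleXml, TransformDemo.SampleXsl)` returns the generated HTML as a string, which makes it easy to check a stylesheet change before deploying it.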
I learned a couple of major things while implementing this code. They are listed as follows:
One thing that needs to be done before OmniSearch can be used.
I will eventually be integrating this engine into the OmniPortal project that I am also developing. More about OmniPortal can be found at SourceForge or http://beta.omniportal.net/ (shameless plug).
Last Updated: 10 Feb 2003. Editor: Smitha Vijayan
Copyright 2003 by Nick Berardi. Everything else Copyright © CodeProject, 1999-2009