
Keyword Highlighting with One Line of Code: Applied Use of HttpResponse.Filter in ASP.NET to Modify the Output Stream

3 Jan 2014, CPOL, 6 min read
HttpResponse.Filter post-processes the output of an ASP.NET page in order to modify the HTML document before it is sent to the client, similar to output buffering in PHP. The example wraps instances of a keyword on the page in an HTML element to have a highlighting style applied to it.
<ul class="download">
	<li class="download"><a href="HttpResponseFilter/HttpResponseFilter.zip">Download HttpResponseFilter.zip</a> - 30.39 KB</li>
	<li><a href="HttpResponseFilter/HttpResponseFilter.Source.zip">Download HttpResponseFilter.Source.zip</a> - 121.09 KB</li>
</ul>
<div id="article">
    <h2>Introduction</h2>
    <p>
	    No doubt you have seen many web pages in which the results of a keyword search
	    highlight the keyword in yellow, making it easy for the reader to find the
	    keyword in its surrounding context. There are of course many ways to
        approach this task.
    </p>
    <p>This article discusses:</p>
    <ul>
        <li>Implementation of the (mostly) undocumented <code>HttpResponse.Filter</code> property</li>
        <li>Implementation of a simple search box to highlight a word or phrase on a page</li>
        <li>Use of <code>Regex.Replace</code> with a <code>MatchEvaluator</code> delegate</li>
    </ul>

    <h2>Background</h2>
    <p>
        This week when I approached the implementation of keyword highlighting, 
        I considered a few possible ways:
    </p>
    <ol>
	    <li>Client-side DOM manipulation with JavaScript</li>
	    <li>Search and replace on the text to which I have programmatic access</li>
	    <li>
            An ASP.NET <a href="http://msdn.microsoft.com/en-us/library/bb398986.aspx">HTTP Module or HTTP Handler</a>, 
            compiled as a standalone assembly and installed in <code>Web.config</code>
        </li>
	    <li>
            Manipulating the output stream, similar to 
            <a href="http://ca.php.net/manual/en/ref.outcontrol.php">output buffering in PHP</a>
        </li>
    </ol>
    <p>
	    It was the last method that I decided to pursue, because it had the potential to
	    operate independently of the page's code (unlike #2), wouldn't require
	    processor-intensive client-side scripting (unlike #1), and wouldn't require any
	    server-side configuration (unlike #3).
    </p>
    <p>
        The example site consists of a web page that displays the text from Charles Dickens' <i>Great Expectations</i>.
        In the upper-right corner of the page floats a search box into which you can enter a word or phrase.
        It also presents some options, such as case-sensitive searching, whole-word searching, and searching
        using regular expressions instead of literal text.
    </p>
    <div class="figure"><img src="HttpResponseFilter/no-highlight.png" alt="Screen shot of Great Expectations without highlight" /></div>
    <p>
        When a word or phrase is entered into the search box and the button clicked,
        the page is shown again with the search term highlighted throughout the document.
    </p>
    <div class="figure"><img src="HttpResponseFilter/highlighted.png" alt="Screen shot of Great Expectations with highlighting" /></div>

    <h2>Terminology</h2>
    <p>
        For the sake of clarity, I'll refer to the search term or keywords as the <b>needle</b>.
        Likewise, I'll refer to the text that is being searched as the <b>haystack</b>. This 
        nomenclature is also used throughout the code for consistency.
    </p>

    <h2>Using the code</h2>
    <div class="figure"><img src="HttpResponseFilter/HttpFilter.png" alt="Screen shot of Great Expectations with highlighting" /></div>
    <p>
        Earlier in the article I promised to add highlighting to a page with one line of code.
        Here is the code in context:
    </p>
    <pre lang="C#">
/// &lt;summary&gt;
/// Handles the Load event of the Page control.
/// &lt;/summary&gt;
/// &lt;param name="sender"&gt;The source of the event.&lt;/param&gt;
/// &lt;param name="e">The &lt;see cref="EventArgs"/&gt; instance containing the event data.&lt;/param&gt;
protected void Page_Load(object sender, EventArgs e)
{
    // Add some content from a resource.
    Content.Text = Properties.Resources.Great_Expectations__by_Charles_Dickens;

    if(IsPostBack)
    {
        // Implement a highlighter with one line of code:
        Response.Filter = new HighlightFilter(Response, Needle.Text)    // The magic line.
                                {
                                    IsHtml5 = false, 
                                    MatchCase = MatchCase.Checked, 
                                    MatchWholeWords = MatchWholeWords.Checked, 
                                    UseRegex = UseRegularExpressions.Checked
                                }; 

        // Don't try to highlight the search box.
        Needle.Text = string.Empty;
    }
}
    </pre>
    
    <p>
        As you can see, when the Web Form is posted back, the needle is retrieved 
        from <code>Needle.Text</code>. In the code-behind we construct a <code>HighlightFilter</code>, 
        passing it the <code>HttpResponse</code> object and the needle.
    </p>
    <p>
        I have also set some of the properties of <code>HighlightFilter</code> using an
        <a href="http://msdn.microsoft.com/en-us/library/bb384062.aspx">object initializer</a>.
        Most of the properties should be self-explanatory, like <code>MatchCase</code>,
        <code>MatchWholeWords</code>, and <code>UseRegex</code>. 
    </p>
    <p>
        When the <code>IsHtml5</code> property is true, instances of the needle are wrapped in the
        <code>&lt;mark&gt;</code> element, which HTML5 provides for exactly this purpose. When it is false,
        a <code>div</code> with its class set to "highlight" is used instead. For greater control, you can
        explicitly set the values of the <code>OpenTag</code> and <code>CloseTag</code> properties.
        For ultimate control you can subscribe to the <code>Highlighting</code> event and modify the
        supplied <code>Haystack</code> using the supplied <code>Needle</code>, or even subclass
        <code>HighlightFilter</code> entirely.
    </p>
    <p>
        Of course the usefulness of post-processing in this manner need not be limited to highlighting.
        Using the <code>Filter</code> class one could subscribe to the <code>Filtering</code> event
        to modify the output stream, or subclass <code>Filter</code> and override the protected
        <code>OnFilter</code> method. There are numerous applications including:
    </p>
    <ul>
        <li>obfuscation</li>
        <li>minification</li>
        <li>altering the output of sealed classes</li>
        <li>translation (e.g. RSS → HTML)</li>
        <li>insertion of common code (e.g. reverse master page)</li>
        <li>…</li>
    </ul>
    <p>If you find other uses, please share with a comment.</p>

    <h2>How it Works</h2>
    <p>To post-process the page I needed to somehow intercept the output stream, <code>Page.Response.OutputStream</code>.</p>
    <p>
	    A bit of searching led me to the <code>Filter</code> property of the <code>HttpResponse</code> class.
	    The <a href="http://msdn.microsoft.com/en-us/library/system.web.httpresponse.filter.aspx">documentation
	    for the property</a> leaves quite a bit to the imagination. The property is assigned a <code>Stream</code>
	    that filters writes, and the example refers to a magical (i.e. undocumented) 
	    <code>UpperCaseFilterStream</code> that takes the property itself as a parameter to the constructor, 
	    and ta da! Hmm… (Had I bothered to find and unpack <code>Samples.AspNet.CS.Controls</code>, maybe I
        would have solved this one.)
    </p>
    <p>
        I created the <code>Filter</code> class, which takes the <code>HttpResponse</code> object as a
        parameter to the constructor. The class itself inherits <code>Stream</code>, but the implementation
        of the abstract class simply invokes methods and properties of the <code>HttpResponse</code>
        object's <code>OutputStream</code> stream, with the exception of 
        <code>Write(byte[] buffer, int offset, int count)</code>. The overridden <code>Write</code> method
        decodes the buffer to a string using the response's <code>ContentEncoding</code>, applies a filter,
        and re-encodes and writes out the buffer to the <code>OutputStream</code>.
    </p>
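    <p>
        The wrapping idea described above can be sketched roughly as follows. This is a minimal illustration,
        not the code from the download: the names <code>TextFilterStream</code>, <code>TextTransform</code>
        and <code>TextFilterStreamDemo</code> are hypothetical, and a <code>MemoryStream</code> stands in for
        the response's <code>OutputStream</code> so that the sketch runs outside ASP.NET.
    </p>
    <pre lang="C#">
using System.IO;
using System.Text;

// Hypothetical delegate: receives the decoded text and returns the filtered text.
public delegate string TextTransform(string text);

// Hypothetical sketch of a stream that forwards everything to an inner stream,
// except Write, which decodes, transforms and re-encodes the buffer.
public class TextFilterStream : Stream
{
    private readonly Stream inner;
    private readonly Encoding encoding;
    private readonly TextTransform transform;

    public TextFilterStream(Stream inner, Encoding encoding, TextTransform transform)
    {
        this.inner = inner;
        this.encoding = encoding;
        this.transform = transform;
    }

    // Everything except Write simply delegates to the inner stream.
    public override bool CanRead { get { return inner.CanRead; } }
    public override bool CanSeek { get { return inner.CanSeek; } }
    public override bool CanWrite { get { return inner.CanWrite; } }
    public override long Length { get { return inner.Length; } }
    public override long Position
    {
        get { return inner.Position; }
        set { inner.Position = value; }
    }
    public override void Flush() { inner.Flush(); }
    public override int Read(byte[] buffer, int offset, int count) { return inner.Read(buffer, offset, count); }
    public override long Seek(long offset, SeekOrigin origin) { return inner.Seek(offset, origin); }
    public override void SetLength(long value) { inner.SetLength(value); }

    public override void Write(byte[] buffer, int offset, int count)
    {
        // Assumption: each call carries whole characters in the given encoding.
        string text = encoding.GetString(buffer, offset, count);
        byte[] output = encoding.GetBytes(transform(text));
        inner.Write(output, 0, output.Length);
    }
}

public static class TextFilterStreamDemo
{
    private static string Shout(string text) { return text.ToUpperInvariant(); }

    // Writes the input through the filter and returns what reached the inner stream.
    public static string Run(string input)
    {
        MemoryStream sink = new MemoryStream();
        TextFilterStream filter = new TextFilterStream(sink, Encoding.UTF8, Shout);
        byte[] bytes = Encoding.UTF8.GetBytes(input);
        filter.Write(bytes, 0, bytes.Length);
        filter.Flush();
        return Encoding.UTF8.GetString(sink.ToArray());
    }
}
    </pre>
    <p>
        With this sketch, <code>TextFilterStreamDemo.Run("hello")</code> returns <code>"HELLO"</code>.
        Note that the sketch decodes each buffer on its own, so it assumes that a multi-byte character
        is never split across two calls to <code>Write</code>.
    </p>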
    <p>
        The <code>Filter</code> class by itself doesn't do anything useful, but its potential is unlimited.
        To make it filter something, one needs to subclass it and override <code>OnFilter</code>, 
        or instantiate it and subscribe to the <code>Filtering</code> event, which passes a 
        <code>FilterEventArgs</code> object containing the buffered string to be manipulated.
    </p>
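    <p>
        The two extension points might look something like the sketch below. It is an illustration only:
        the <code>Stream</code> plumbing is left out, and the member names (<code>Text</code>,
        <code>Apply</code>, <code>FilteringHandler</code>, <code>FilterSketch</code>,
        <code>TabExpandingFilter</code>) are assumptions made for the example rather than the actual
        members of the classes in the download.
    </p>
    <pre lang="C#">
using System;

// Hypothetical event arguments carrying the decoded text; handlers may replace it.
public class FilterEventArgs : EventArgs
{
    public FilterEventArgs(string text) { Text = text; }
    public string Text { get; set; }
}

// Hypothetical delegate type for the Filtering event.
public delegate void FilteringHandler(object sender, FilterEventArgs e);

// Hypothetical stand-in for the Filter class, reduced to its extension points.
public class FilterSketch
{
    public event FilteringHandler Filtering;

    // Subclasses can override this instead of subscribing to the event.
    protected virtual void OnFilter(FilterEventArgs e)
    {
        FilteringHandler handler = Filtering;
        if (handler != null) handler(this, e);
    }

    // Would be called with each decoded buffer; returns the text to send on.
    public string Apply(string text)
    {
        FilterEventArgs e = new FilterEventArgs(text);
        OnFilter(e);
        return e.Text;
    }
}

// Extension by subclassing: replace every tab with four spaces.
public class TabExpandingFilter : FilterSketch
{
    protected override void OnFilter(FilterEventArgs e)
    {
        e.Text = e.Text.Replace("\t", "    ");
        base.OnFilter(e); // subscribers still get a chance to run
    }
}

public static class FilterSketchDemo
{
    private static void TrimText(object sender, FilterEventArgs e) { e.Text = e.Text.Trim(); }

    // Extension by event: trim whitespace from the text.
    public static string RunWithEvent(string input)
    {
        FilterSketch filter = new FilterSketch();
        filter.Filtering += TrimText;
        return filter.Apply(input);
    }
}
    </pre>
    <p>
        Subscribing to the event suits one-off tweaks on a single page, while subclassing is a better
        fit when the filter needs its own properties, as <code>HighlightFilter</code> does.
    </p>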
    <p>
        For example, to implement needle highlighting, <code>HighlightFilter</code> inherits <code>Filter</code>,
        overriding <code>OnFilter</code> and adding some properties and the <code>Highlighting</code> event.
    </p>
    <p>
        The new <code>OnFilter</code> method uses <code>Regex.Replace</code> to replace instances of the
        needle in the haystack. It does this using the overload that takes a <code>MatchEvaluator</code>,
        a delegate that is called for each match that is found. This is perfect for this use because if 
        <code>MatchWholeWords</code> is true, the characters that bound the needle will be replaced in kind,
        and the case of each match will not be altered. By contrast, a plain replacement of every match
        with the needle text would change the casing of all matches to that of the needle.
    </p>
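    <p>
        As a rough illustration of the <code>MatchEvaluator</code> technique (not the code from the download),
        the sketch below wraps each match in a pair of tags while keeping the matched text exactly as it
        appeared. The class name <code>HighlightSketch</code> and the parameter names are hypothetical, and
        the pattern is always matched case-insensitively here to make the point about casing.
    </p>
    <pre lang="C#">
using System.Text.RegularExpressions;

public static class HighlightSketch
{
    // Wraps every match of the pattern in openTag/closeTag.
    public static string Highlight(string haystack, string pattern, string openTag, string closeTag)
    {
        // The evaluator is called once per match. It reuses match.Value,
        // so the original casing of each occurrence is preserved.
        MatchEvaluator wrap = delegate(Match match)
        {
            return openTag + match.Value + closeTag;
        };

        return Regex.Replace(haystack, pattern, wrap, RegexOptions.IgnoreCase);
    }
}
    </pre>
    <p>
        For example, highlighting the needle <code>pip</code> in <code>Pip met pip</code> with
        <code>&lt;mark&gt;</code> and <code>&lt;/mark&gt;</code> as the tags produces
        <code>&lt;mark&gt;Pip&lt;/mark&gt; met &lt;mark&gt;pip&lt;/mark&gt;</code>: both occurrences are
        wrapped and each keeps its own casing.
    </p>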
    <p>
        If <code>UseRegex</code> is false, the needle is simply escaped with <code>Regex.Escape</code>
        instead of using an alternate means of searching and replacing.
    </p>
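    <p>
        Building the pattern could be sketched along these lines. The helper name
        <code>PatternSketch.BuildPattern</code> is hypothetical, and the use of <code>\b</code> word
        boundaries for whole-word matching is an assumption of this example; the class in the download
        may handle word boundaries differently.
    </p>
    <pre lang="C#">
using System.Text.RegularExpressions;

public static class PatternSketch
{
    // Builds the regex pattern for a needle.
    public static string BuildPattern(string needle, bool useRegex, bool wholeWords)
    {
        // A literal needle is escaped so that characters such as '.' or '?'
        // lose their special meaning; a regex needle is used as written.
        string pattern = useRegex ? needle : Regex.Escape(needle);

        if (wholeWords)
        {
            // Assumption: \b word boundaries around a non-capturing group.
            pattern = @"\b(?:" + pattern + @")\b";
        }

        return pattern;
    }
}
    </pre>
    <p>
        With a literal needle of <code>Mr.</code> the pattern becomes <code>Mr\.</code>, which no longer
        matches <code>Mrs</code>; passed as a regular expression instead, the dot matches any character
        and <code>Mrs</code> would be highlighted too.
    </p>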
    <p>
        I was initially concerned that using <code>Regex</code> for replacement with a 
        <code>MatchEvaluator</code> would be prohibitively slow, but replacement of common words in
        <i>Great Expectations</i> (just over one megabyte) takes a few milliseconds on my Core
        i7-2600K and hopefully not too much more on a typical web server. Interestingly, enabling
        "Match Whole Word" increases this to several seconds.
    </p>

    <h2>Points of Interest</h2>
    <p>
	    In my first attempt, I derived a new class from <code>MemoryStream</code> and assigned it to
	    the <code>Filter</code> property. I overrode the <code>Write</code> method and manipulated the buffer by
	    wrapping instances of the keyword in a new element to which a CSS style could be assigned.
    </p>
    <p>
	    Inspection of the contents of the stream demonstrated that it worked quite nicely, and the class called 
	    <code>base.Write</code> to complete the task, but this resulted in zero bytes sent to the client. The 
        <a href="http://msdn.microsoft.com/en-us/library/system.web.httpresponse.filter.aspx">sample application</a>
        suggests maybe one needs to write out the bytes individually. Instead I used my class to wrap the
        output stream.
    </p>	

    <h2>Acknowledgements</h2>
    <p>
        Thank you to <a href="http://www.gutenberg.org/">Project Gutenberg</a> for the free distribution of
        <i><a href="http://www.gutenberg.org/ebooks/1400">Great Expectations</a></i> and over 36,000 other works;
        and of course to <a href="http://en.wikipedia.org/wiki/Charles_dickens">Charles Dickens</a> (1812-1870) himself.
    </p>

    <h2>History</h2>
    <p>October 31, 2011: Version 1.0.0.x</p>
</div>


License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Engineer, Robotic Assistance Devices / AITX
Canada
Yvan Rodrigues has 30 years of experience in information systems and software development for industry. He is Senior Concept Designer at Robotic Assistance Devices.

He is a Certified Technician (C.Tech.), a professional designation granted by the Institute of Engineering Technology of Ontario (IETO).

Yvan draws on experience as owner of Red Cell Innovation Inc., Mabel's Labels Inc. as Manager of Systems and Development, the University of Waterloo as Information Systems Manager, and OTTO Motors as Senior Systems Engineer and Senior Concept Designer.

Yvan is currently focused on design of embedded systems.
