ASP.NET Common Web Page Class Library - Part 4

By Eric Woodruff, 6 Apr 2006
 

Introduction

This is the fourth in a series of articles on a class library for ASP.NET applications that I have developed. It contains a set of common, reusable page classes that can be utilized in web applications as-is to provide a consistent look, feel, and set of features. New classes can also be derived from them to extend their capabilities. The features are all fairly modular and may be extracted and placed into your own classes too. For a complete list of articles in the series along with a demonstration application and the code for the classes, see Part 1.

This article describes the only non-page derived class in the library, PageUtils, along with the remaining methods of the BasePage class that are somewhat similar in nature to those contained in it. PageUtils contains a set of utility functions that you may find useful in any ASP.NET application. Each of the features is described below. The class itself is sealed and all public properties and methods are static. As such, the constructor is declared private as there is no need to instantiate the class.

HTML encoding

The first method presented is HtmlEncode, which can be called to encode an object for output to an HTML page. It encodes any HTML special characters as literals instead of letting the browser interpret them. In addition, it replaces multiple spaces, tabs, and line breaks with their HTML equivalents thus preserving the layout of the specified text. The size of expanded tab characters can be altered using the TabSize property. Set it to the number of non-breaking spaces that should replace the tab character. The default is four.

If the object is null (Nothing), results in an empty string, or is a single space, a non-breaking space is returned. In conjunction with the above-described behavior, this is useful for displaying database fields that contain HTML special characters, formatting, or nulls such as those with the text or memo data type.

As an added bonus, if the encodeLinks parameter is true, URLs, UNCs, and e-mail addresses are converted to hyperlinks whenever possible using the EncodeLinks method (see below). If false, they are not converted and will be rendered as normal text. As shown below, the code is fairly simple and requires little in the way of additional explanation:

public static string HtmlEncode(Object objText, bool encodeLinks)
{
    StringBuilder sb;
    string text;

    if(objText != null)
    {
        text = objText.ToString();

        if(text.Length != 0)
        {
            // Create tab expansion string if not done already
            if(expandTabs == null)
                expandTabs = new String(' ',
                    PageUtils.TabSize).Replace(" ", "&nbsp;");

            // Encode the string
            sb = new StringBuilder(
                HttpUtility.HtmlEncode(text), 256);

            sb.Replace("  ", "&nbsp; ");  // Two spaces
            sb.Replace("\t", expandTabs);
            sb.Replace("\r", "");
            sb.Replace("\n", "<br>");

            text = sb.ToString();

            if(text.Length > 1)
            {
                if(!encodeLinks)
                    return text;

                // Try to convert URLs, UNCs, and e-mail
                // addresses to links.
                return PageUtils.EncodeLinks(text);
            }

            if(text.Length == 1 && text[0] != ' ')
                return text;
        }
    }

    return "&nbsp;";
}
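
To make the whitespace handling concrete, here is a minimal stand-alone sketch of the same idea. It is not the library code: HtmlEncodeSketch and its Encode method are hypothetical names, link encoding is left out, and System.Net.WebUtility.HtmlEncode is swapped in for HttpUtility.HtmlEncode so that it compiles without a reference to System.Web.

```csharp
using System;
using System.Net;
using System.Text;

// Hypothetical stand-alone sketch; not part of the class library.
public static class HtmlEncodeSketch
{
    // Assumed to mirror the TabSize default of four described above
    public static int TabSize = 4;

    public static string Encode(object value)
    {
        string text = (value == null) ? String.Empty : value.ToString();

        // Build the tab expansion string: TabSize copies of "&nbsp;"
        StringBuilder tab = new StringBuilder();

        for(int i = 0; i < TabSize; i++)
            tab.Append("&nbsp;");

        // WebUtility.HtmlEncode stands in for HttpUtility.HtmlEncode
        StringBuilder sb = new StringBuilder(WebUtility.HtmlEncode(text));

        sb.Replace("  ", "&nbsp; ");    // Two spaces
        sb.Replace("\t", tab.ToString());
        sb.Replace("\r", "");
        sb.Replace("\n", "<br>");

        string result = sb.ToString();

        // Null, empty, or a single space yields a non-breaking space
        if(result.Length == 0 || result == " ")
            return "&nbsp;";

        return result;
    }

    public static void Main()
    {
        Console.WriteLine(Encode(null));              // &nbsp;
        Console.WriteLine(Encode("a < b\r\nx  y"));   // a &lt; b<br>x&nbsp; y
        Console.WriteLine(Encode("\tx"));             // &nbsp;&nbsp;&nbsp;&nbsp;x
    }
}
```

Only the first of each pair of spaces is replaced, so the browser can still wrap the text at the remaining normal space.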

Link encoding

The second method presented is EncodeLinks. This method is called by HtmlEncode but can also be called directly by your code. It takes the passed string and finds all URLs, UNCs, and e-mail addresses and converts them to clickable hyperlinks suitable for rendering in an HTML page. For UNC paths, it will include any text up to the first whitespace character. If the path contains spaces, you can enclose the entire path in angle brackets (e.g., <\\Server\Folder\Name With Spaces>) and the encoder will include all text between the angle brackets in the hyperlink. The angle brackets will not appear in the encoded hyperlink:

public static string EncodeLinks(string text)
{
    // We'll create these on first use and keep them around
    // for subsequent calls to save resources.
    if(reURL == null)
    {
        reURL = new Regex(@"(((file|news|(ht|f|nn)tp(s?))://)|" +
             @"(www\.))+[\w()*\-!_%]+.[\w()*\-/.!_#%]+[\w()*\-/" +
             @".!_#%]*((\?\w+(\=[\w()*\-/.!_#%]*)?)(((&amp;|&(?" +
             @"!\w+;))(\w+(\=[\w()*\-/.!_#%]*)?))+)?)?",
             RegexOptions.IgnoreCase);
        reUNC = new Regex(@"(\\{2}\w+(\\((&.{2,8};|" +
             @"[\w\-\.,@?^=%&:/~\+#\$])*[\w\-\@?^=%&/~\+#\$])?)" +
             @"*)|((\<|\&lt;)\\{2}\w+(\\((&.{2,8};|" +
             @"[\w\-\.,@?^=%&:/~\+#\$ ])*)?)*(\>|\&gt;))",
             RegexOptions.IgnoreCase);
        reEMail = new Regex(@"([a-zA-Z0-9_\-])([a-zA-Z0-9_\-\." +
             @"]*)@(\[((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]" +
             @"[0-9]|[0-9])\.){3}|((([a-zA-Z0-9\-]+)\.)+))(" +
             @"[a-zA-Z]{2,}|(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|" +
             @"[1-9][0-9]|[0-9])\])", RegexOptions.IgnoreCase);
        reTSUNC = new Regex(
             @"\.?((&\#\d{1,3}|&\w{2,8});((&\#\d{1,3}|&" +
             @"\w{2,8}))?)+\w*$");

        urlMatchEvaluator = new MatchEvaluator(OnUrlMatch);
        uncMatchEvaluator = new MatchEvaluator(OnUncMatch);
    }

    // Do the replacements
    text = reURL.Replace(text, urlMatchEvaluator);
    text = reUNC.Replace(text, uncMatchEvaluator);
    text = reEMail.Replace(text,
                   @"<a href='mailto:$&'>$&</a>");
    return text;
}

As you can see, the method uses regular expressions to search for and replace each URL, UNC, and e-mail address. The expressions used should catch just about all variations of each type. The regular expression objects are created on first use and are kept around for subsequent calls to save a little time. For URLs and UNCs, the following match evaluators handle the actual work of the replacement:

// Replace a URL with a link to the URL. This checks for a
// missing protocol and adds it if necessary.
private static string OnUrlMatch(Match match)
{
    StringBuilder sb = new StringBuilder("<a href='", 256);
    string url = match.Value;

    // Use default HTTP protocol if one wasn't specified
    if(url.IndexOf("://") == -1)
        sb.Append("http://");

    sb.Append(url);
    sb.Append("' target='_BLANK'>");
    sb.Append(url);
    sb.Append("</a>");

    return sb.ToString();
}

// Replace a UNC with a link to the UNC. This strips off any
// containing brackets (plain or encoded) and flips the slashes.
private static string OnUncMatch(Match match)
{
    StringBuilder sb = new StringBuilder("<a href='file:", 256);
    string unc = match.Value;

    // Strip brackets if found. If it has encoded brackets,
    // strip them too.
    if(unc[0] == '<')
        unc = unc.Substring(1, unc.Length - 2);
    else
        if(unc.StartsWith("&lt;"))
            unc = unc.Substring(4, unc.Length - 8);

    // Move trailing special characters outside the link
    Match m = reTSUNC.Match(unc);
    if(m.Success == true)
        unc = reTSUNC.Replace(unc, "");

    sb.Append(unc);
    sb.Append("' target='_BLANK'>");

    // Replace backslashes with forward slashes
    sb.Replace('\\', '/');

    sb.Append(unc);
    sb.Append("</a>");

    if(m.Success == true)
        sb.Append(m.Value);

    return sb.ToString();
}

A regular expression match evaluator is like a callback. Each time the regular expression finds a match, it calls the evaluator. Its job is to take the found text and modify it in any way necessary and then return it to the regular expression so that it can be used to replace the original text. In these two cases, the match evaluators add the anchor tag and ensure that the links are formatted appropriately.
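
The callback mechanism can be tried in isolation with a short sketch such as the one below. The pattern is deliberately simplified for illustration and is much less thorough than the reURL expression above; EvaluatorSketch, reSimpleUrl, OnMatch, and Linkify are hypothetical names that do not appear in the library.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical stand-alone sketch; not part of the class library.
public static class EvaluatorSketch
{
    // Simplified pattern: a protocol or "www." prefix, a dotted host
    // name, and an optional path.
    private static readonly Regex reSimpleUrl = new Regex(
        @"(https?://|www\.)[\w\-]+(\.[\w\-]+)+(/[\w\-./]*)?",
        RegexOptions.IgnoreCase);

    // Called once per match. The returned string replaces the
    // matched text.
    private static string OnMatch(Match match)
    {
        string url = match.Value;

        // Use the HTTP protocol if one wasn't specified
        string href = (url.IndexOf("://") == -1) ? "http://" + url : url;

        return "<a href='" + href + "'>" + url + "</a>";
    }

    public static string Linkify(string text)
    {
        return reSimpleUrl.Replace(text, new MatchEvaluator(OnMatch));
    }

    public static void Main()
    {
        // Prints: See <a href='http://www.example.com/docs'>www.example.com/docs</a> for details
        Console.WriteLine(Linkify("See www.example.com/docs for details"));
    }
}
```

Text that does not match the pattern is passed through untouched, so the evaluator only has to deal with one match at a time.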

Converting validation messages to hyperlinks

In my applications, I have come to favor the validation summary control to contain all validation error messages generated by the page. It keeps them all in one location and does not adversely affect the layout of the controls in the form when they are made visible. The drawback is that on a form with a large number of controls and validation conditions, it can sometimes be difficult to match each message to its control, especially if the form is long enough to require scrolling around to find it. As such, I have added functionality to the BasePage class to automatically convert all validation control error messages that are set to appear in a validation summary control to clickable hyperlinks that will take you directly to the offending field by giving it the focus:

protected virtual void ConvertValMsgsToLinks()
{
    BaseValidator bv;

    foreach(IValidator val in this.Validators)
    {
        bv = val as BaseValidator;

        if(bv != null && bv.Visible == true &&
          bv.ControlToValidate.Length > 0 &&
          bv.Display == ValidatorDisplay.None)
            bv.ErrorMessage = MakeMsgLink(bv.ControlToValidate,
                         bv.ErrorMessage, this.MsgLinkCssClass);
    }
}

ConvertValMsgsToLinks is called as the very first step in the overridden Render method. It iterates over the page's Validators collection. The validator control must be visible, must have its ControlToValidate property set to a control ID, and must have its Display property set to None indicating that it will appear in a validation summary control. If all of the necessary conditions are met, a call is placed to the MakeMsgLink method to convert the error message to a hyperlink.

Note that since this occurs within the rendering step, changes to the error messages are not retained. If the page posts back, the error messages are restored from view state and will be in their non-hyperlink form. When the page renders during the postback, the messages will be converted to hyperlinks again provided that they still meet the necessary conditions. I chose this approach so that it is transparent to users of the class, is non-intrusive, and will not break any code that expects the messages to be in their non-hyperlink form. Derived classes can override this method to extend or suppress this behavior.

Note: If you extract the above method for use in your own classes, be sure to override the page's Render method and call it from there; otherwise, the messages will not be converted. The conversion of each individual message is handled by MakeMsgLink:

public string MakeMsgLink(string id, string msg, string cssClass)
{
    string newClass;

    // Don't bother if it's null, empty, or already in the form
    // of a link.
    if(msg == null || msg.Length == 0 || msg.StartsWith("<a "))
        return msg;

    StringBuilder sb = new StringBuilder(512);

    // Add the anchor tag and the optional CSS class
    sb.Append("<a ");

    newClass = (cssClass == null) ?
        this.MsgLinkCssClass : cssClass;

    if(newClass != null && newClass.Length > 0)
    {
        sb.Append("class='");
        sb.Append(newClass);
        sb.Append("' ");
    }

    // An HREF is included that does nothing so that we can use
    // the hover style to do stuff like underline the link when
    // the mouse is over it. OnClick performs the action and
    // returns false so that we don't trigger IE's
    // OnBeforeUnload event which may be tied to data change
    // checking code.

    // NOTE: OnPreRender registers the script containing the
    // function. Tell the function to use the "Find Control"
    // method to locate the ID. That way, it works for controls
    // embedded in data grids.
    sb.Append("href='javascript:return false;' " +
        "onclick='javascript: return BP_funSetFocus(\"");
    sb.Append(id);
    sb.Append("\", true);'>");
    sb.Append(msg);
    sb.Append("</a>");

    return sb.ToString();
}

The MakeMsgLink method will convert the passed text into a hyperlink that transfers focus to the control with the specified ID. The Set Focus script, described in part one of this series, controls setting the focus to the control. As such, the specified ID can be an exact match or the ending part of an ID (see part one for details). An optional CSS class name can be specified that will be applied to the hyperlink. If null, it uses the one defined by the MsgLinkCssClass property. By default, it is set to the value of the BasePage.MsgLinkCssName constant which is currently set to the style name ErrorMsgLink. The class name should appear in the stylesheet associated with the application. As noted, a dummy href is added to the link so that you can add a hover style to the CSS class. For example, in my applications, the error messages display as normal text and show an underline as the mouse passes over them.
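
To see the markup that results, the string-building logic can be run outside of a page with a sketch like the following. MsgLinkSketch is a hypothetical stand-alone class; it replaces the MsgLinkCssClass property with a constant set to the ErrorMsgLink default mentioned above, and the control ID and message are made up for the example.

```csharp
using System;
using System.Text;

// Hypothetical stand-alone sketch; not part of the class library.
public static class MsgLinkSketch
{
    // Stands in for the MsgLinkCssClass property and its default
    public const string DefaultCssClass = "ErrorMsgLink";

    public static string MakeMsgLink(string id, string msg, string cssClass)
    {
        // Leave null, empty, or already-converted messages alone
        if(String.IsNullOrEmpty(msg) ||
          msg.StartsWith("<a ", StringComparison.Ordinal))
            return msg;

        string newClass = (cssClass == null) ? DefaultCssClass : cssClass;
        StringBuilder sb = new StringBuilder("<a ", 512);

        if(newClass.Length > 0)
        {
            sb.Append("class='");
            sb.Append(newClass);
            sb.Append("' ");
        }

        // Dummy href for hover styles. The onclick sets the focus.
        sb.Append("href='javascript:return false;' " +
            "onclick='javascript: return BP_funSetFocus(\"");
        sb.Append(id);
        sb.Append("\", true);'>");
        sb.Append(msg);
        sb.Append("</a>");

        return sb.ToString();
    }

    public static void Main()
    {
        // txtName is a made-up control ID
        Console.WriteLine(MakeMsgLink("txtName", "A name is required", null));
    }
}
```

With those arguments, the sketch prints `<a class='ErrorMsgLink' href='javascript:return false;' onclick='javascript: return BP_funSetFocus("txtName", true);'>A name is required</a>`. Passing a message that already starts with an anchor tag returns it unchanged, which is what makes the conversion safe to repeat on every render.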

Conclusion

Although small, the PageUtils class contains some very helpful features. The validation message link feature of BasePage can also make the validation summary control more user friendly. Hopefully, you will find this class and the others in the library, or parts of them, as useful as I have.

Revision history

  • 04/02/2006

    Changes in this release:

    • Reworked the URL encoding regular expression in PageUtils so that it includes a few more protocols, includes all valid URL characters, handles URLs with parameters, and does not include special characters after the URL.
    • Breaking change: Property and method names have been modified to conform to the .NET naming conventions with regard to casing (PageUtils.HtmlEncode, BasePage.MsgLinkCssClass, and BasePage.MsgLinkCssName).
  • 11/26/2004
    • Made some changes to the URL and UNC link encoding regular expressions to make them more accurate.
  • 12/01/2003
    • Initial release.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Eric Woodruff
Software Developer (Senior)
United States
Eric Woodruff is an Analyst/Programmer for Spokane County, Washington where he helps develop and support various applications, mainly criminal justice systems, using Windows Forms (C#) and SQL Server as well as some ASP.NET applications.
 
He is also the author of some open source projects and shareware components for .NET including:
 
The Sandcastle Help File Builder - A front end and project management system that lets you build help file projects using Microsoft's Sandcastle documentation tools.

Sandcastle Styles - A joint effort to publish patches and enhancements to Microsoft's Sandcastle documentation tools along with some supporting documentation and tools.
 
Image Map Controls - Windows Forms and web server controls that implement image maps.
 
PDI Library - A complete set of classes that let you have access to all objects, properties, parameter types, and data types as defined by the vCard (RFC 2426), vCalendar, and iCalendar (RFC 2445) specifications. A recurrence engine is also provided that allows you to easily and reliably calculate occurrence dates and times for even the most complex recurrence patterns.
 
Windows Forms List Controls - A set of extended .NET Windows Forms list controls. The controls include an auto-complete combo box, a multi-column combo box, a user control dropdown combo box, a radio button list, a check box list, a data navigator control, and a data list control (similar in nature to a continuous details section in Microsoft Access or the DataRepeater from VB6).
 
For more information see http://www.EWoodruff.us

Comments and Discussions

General: a small improvement (Sergei Shevyrov, 26 Sep '06 - 12:58)
If you run the code on text that ends with a dot character, such as "http://www.codeproject.com/aspnet/EWSWebPt4.asp.", the link it generates leads to "Page not Found, The page you requested cannot be found."
 
To eliminate this problem, you need to append the metacharacter "\b" at the end of the regular expression that searches/encodes http(s)/ftp/... links so that "non-word" characters which may occur at the end of a link text are not included in the generated hyperlink:
 
reURL = new Regex(@"(((file|news|(ht|f|nn)tp(s?))://)|" +
@"(www\.))+[\w()*\-!_%]+.[\w()*\-/.!_#%]+[\w()*\-/" +
@".!_#%]*((\?\w+(\=[\w()*\-/.!_#%]*)?)(((&amp;|&(?" +
@"!\w+;))(\w+(\=[\w()*\-/.!_#%]*)?))+)?)?" +
@"\b",
RegexOptions.IgnoreCase);
 
Cheers.
 
Sergei
General: Re: a small improvement (Eric Woodruff, 26 Sep '06 - 15:20)
Thanks for the fix.
 
Eric

General: Problem with EncodeLinks with URLs containing ampersands (pitaandfreddy, 23 Feb '06 - 11:41)
If you have a URL containing an ampersand, the EncodeLinks method doesn't work correctly. It does not recognize the & and the text that follows it as part of the URL, even though they are, and thus does not create the link correctly.
 
For example:
 
http://www.test.com?mytest=true&mytest2=false
 
The link stops when it reaches the & but should include the entire query string.
 
-- modified at 17:42 Thursday 23rd February, 2006
General: Re: Problem with EncodeLinks with URLs containing ampersands (Eric Woodruff, 23 Feb '06 - 16:06)
Thanks for the report. I'll add it to my list. I'm getting ready to make some updates for .NET 2.0 so I'll take care of it at the same time (probably in the next week or two).
 
Eric

General: Bug in HtmlEncode with bEncodeLinks = true (Scott Beeler, 18 Oct '05 - 17:48)
Calling HtmlEncode with bEncodeLinks = true doesn't work if the text includes a UNC path with spaces. The problem is that the text gets converted so that the angle braces around the UNC path get converted to &lt; and &gt; prior to the call to EncodeLinks.

General: Re: Bug in HtmlEncode with bEncodeLinks = true (Scott Beeler, 18 Oct '05 - 19:08)
Never mind. I seem to be having the same problem with UNC paths with spaces as ksrir reported earlier. I'm not seeing the UNC paths turned into hyperlinks correctly regardless of whether I call HtmlEncode or EncodeLinks directly if the UNC paths have a space.
 
I have tried sending textbox text, text area values, text area innerhtml, and text area innertext, all with the same incorrect results.
 
I'm not sure if I'm doing something wrong or if there is a bug in the current posting of EncodeLinks.
 

:(
General: Re: Bug in HtmlEncode with bEncodeLinks = true (Eric Woodruff, 19 Oct '05 - 15:54)
The code in the article download hasn't changed with the exception of adding "https" to the URL encoding regular expression. It's possible that the UNC encoder is having problems. If you've got an example or two that fail, I can take a look at them.
 
Eric

General: Re: Bug in HtmlEncode with bEncodeLinks = true (Scott Beeler, 19 Oct '05 - 18:07)
Wow, this is just embarrassing. While typing a response showing my examples that supposedly weren't working, I realized why. They weren't valid UNC paths. If I use valid UNC paths, lo and behold, it works.

doh! :doh:
 
Nice work and thanks!
Question: Reinventing the wheel? (Kelmen, 6 Dec '04 - 19:32)
Isn't there already a class, System.Web.HttpUtility, that provides HTML and URL encoding and decoding?
:confused:
Answer: Re: Reinventing the wheel? (Eric Woodruff, 7 Dec '04 - 15:31)
Yes there is, and my method makes use of the one that does basic HTML encoding. It doesn't do URL encoding in the manner that you are thinking. The wheel has not been re-invented, just improved. If you read the article you'll see that my method doesn't just HTML encode the data, it also provides several additional features: it converts tabs and spaces to non-breaking spaces and line feeds to <br> tags to preserve formatting, the size for tab expansion is adjustable, it returns a non-breaking space if the passed string is null or blank to preserve formatting (e.g. null database fields displayed in a table), etc. The best feature is that you can also have it convert URLs, UNCs, and e-mail addresses to anchor tags that render as clickable links when the text is displayed in the browser.
 
Eric

General: Re: Reinventing the wheel? (kevferron, 15 Nov '06 - 12:36)
But what you're saying is that your HtmlEncoder does more than HtmlEncode, which doesn't make a lot of sense. I'm not sure I see an improvement here, rather an obfuscation and a cluttering.
General: Re: Reinventing the wheel? (Kevin C Ferron, 16 Nov '06 - 14:15)
You can score my comment as a 1, but it doesn't change the fact that I'm correct.
General: Re: Reinventing the wheel? (logan1337, 28 Aug '07 - 12:12)
Seems to me that Eric's method does a lot more than the HttpUtility method, so I don't see how it's reinventing the wheel. His claim to be an improvement seems justified in my view.
General: Converting UNC Path to link (ksrir, 25 Jul '04 - 18:31)
http://www.codeproject.com/aspnet/EWSWebPt4.asp
 
I read the above article and I am looking for a way to convert UNC paths to HTML links. I am not getting the desired result when the UNC path contains spaces between words.

My regular expression is as follows (taken from the article):
reUNC = new Regex(@"(\\{2}\w+(\\([\w\-\.,@?^=%&" +
     @":/~\+#\$]*[\w\-\@?^=%&/~\+#\$])?)*)|((\<|" +
     @"\&lt;)\\{2}\w+(\\([\w\-\.,@?^=%&:/~\+" +
     @"#\$ ]*)?)*(\>|\&gt;))",
     RegexOptions.IgnoreCase);
 
When I give the path as
<\\Server\Folder\Name With Spaces> the link stops at "Name".
 
Appreciate your help in this.
TIA
 
ks
General: Re: Converting UNC Path to link (Eric Woodruff, 26 Jul '04 - 10:15)
There is a combination of a regular expression and a match evaluator that work together to convert the UNCs to hyperlinks. Below is some updated code for them. I have applied a couple of fixes since the article was released but haven't had a chance to update it yet:
 
[EDIT: I can't find a way to get rid of the two stupid smiley icons that show up below when viewed on Code Project. Both are supposed to be ';' followed by ')' in the regex.]
 
// Declare and create the objects
Regex reUNC, reTSUNC;
MatchEvaluator UNCMatchEvaluator;
 
reUNC = new Regex(@"(\\{2}\w+(\\((&.{2,8};|" +
    @"[\w\-\.,@?^=%&:/~\+#\$])*[\w\-\@?^=%&/~\+#\$])?)*)|" +
    @"((\<|\&lt;)\\{2}\w+(\\((&.{2,8};|" +
    @"[\w\-\.,@?^=%&:/~\+#\$ ])*)?)*(\>|\&gt;))",
    RegexOptions.IgnoreCase);
 
reTSUNC = new Regex(
    @"\.?((&\#\d{1,3}|&\w{2,8});((&\#\d{1,3}|&\w{2,8}))?)+\w*$");
 
// See below for the eval function
UNCMatchEvaluator = new MatchEvaluator(OnUNCMatch);
 
// Do the replacement
strText = reUNC.Replace(strText, UNCMatchEvaluator);
 

// Replace a UNC with a link to the UNC.  This strips off any
// containing brackets (plain or encoded) and flips the slashes.
private static string OnUNCMatch(Match match)
{
    StringBuilder strLink = new StringBuilder("<a href='file:", 256);
    string strUNC = match.Value;
 
    // Strip brackets if found.  If it has encoded brackets,
    // strip them too.
    if(strUNC[0] == '<')
        strUNC = strUNC.Substring(1, strUNC.Length - 2);
    else
        if(strUNC.StartsWith("&lt;"))
            strUNC = strUNC.Substring(4, strUNC.Length - 8);
 
    // Move trailing special characters outside the link
    Match m = reTSUNC.Match(strUNC);
    if(m.Success == true)
        strUNC = reTSUNC.Replace(strUNC, "");
 
    strLink.Append(strUNC);
    strLink.Append("' target='_BLANK'>");
 
    // Replace backslashes with forward slashes
    strLink.Replace('\\', '/');
 
    strLink.Append(strUNC);
    strLink.Append("</a>");
 
    if(m.Success == true)
        strLink.Append(m.Value);
 
    return strLink.ToString();
}
 
Eric

General: Re: Converting UNC Path to link (ksrir, 26 Jul '04 - 19:26)
Hi Eric,
Thanks for the response. However, the above regular expression fails when there is a space in the path, e.g. (<\\test\my test\first.doc>). The link stops like this: \\test\my
 
So I modified it like this (included \s)
 
reUNC = new Regex(@"(\\{2}\w+(\\((&.{2,8};|" +     
             @"[\s\w\-\.,@?^=%&:/~\+#\$])*[\s\w\-\@?^=%&/~\+#\$])?)*)|" +     
             @"((\<|\&lt;;)\\{2}\w(\\((&.{2,8};|" +
             @"[\s\w\-\.,@?^=%&:/~\+#\$ ])*)?)*(\>|\&gt;;))",
             RegexOptions.IgnoreCase);
 
But the problem is that any text after the actual UNC path is also included in the link, as in (\\test\my test\first.doc> some more text goes here)
 
Appreciate your help.
 
TIA,
ks
General: Re: Converting UNC Path to link (Eric Woodruff, 27 Jul '04 - 7:19)
Please re-read my first reply. There's more to it than just a regular expression. It has to be used in conjunction with the match evaluator and the match evaluation function too. Also examine the EncodeLinks() method in the PageUtils class. It shows how to do it the same way.
 
Eric

General: Re: Converting UNC Path to link (ksrir, 27 Jul '04 - 17:42)
Thanks again for the help. I had declared an HTML text area and was passing its "value" to the EncodeLinks function instead of its "innertext", which is wrong!
 
Now all works well.
Again thanks very much for your help.
ks
Last Updated 7 Apr 2006
Article Copyright 2003 by Eric Woodruff
Everything else Copyright © CodeProject, 1999-2013