A Simple HTML Token Parser

20 Jul 2008 | CPOL | 3 min read
A very simple HTML template parser which replaces user-defined tokens with meaningful text.

Introduction

In developing my latest web site, I researched many content management systems and tested different ways of producing dynamic web content in Visual Studio. Nothing seemed to fit the bill. What I wanted was a dead simple way of displaying dynamic content, a simple API of sorts, driven by basic HTML files that I could edit directly. After looking into writing my own HttpHandler and deciding it was overkill for my application, I recalled that Delphi had implemented an elegant solution: PageProducers.

Having recently defected to the C#.NET camp, I decided to recreate a simple version of Delphi's PageProducer class. The HTML Template Parser I came up with provides an easy way to implement user-defined HTML templates. It allows you to define your own tokens and replace those tokens at runtime, in code, with whatever values you wish. It's not the most complex parser you've ever seen, but it gets the job done with a minimal amount of fuss.

Overview

The calling code instantiates an object of the TokenParser class, and to it assigns an event handler.

C#
parser = new TokenParser(fileName); // fileName is the path and name of the HTML file.
parser.OnToken += this.OnToken;

public void OnToken(string strToken, ref string strReplacement)
{
  if (strToken == "TESTME")
    strReplacement = "Moose Butts a-Flyin!";
  else
    strReplacement = "Huh?";
}

Once this is done, it is simply a matter of getting the parsed HTML from the TokenParser's ToString() method and returning it in the Response object.

Using the Code

The TokenParser Class

The TokenParser class defines a delegate, TokenHandler, used to pass found tokens to the calling code. As the parser encounters each token, it raises the OnToken event, passing the token name and a string by reference into which the handler places the replacement value.

C#
// ********************************************************************************
//     Document    :  TokenParser.cs
//     Version     :  0.1
//     Project     :  StrayIdeaz
//     Description :  This is a very simple HTML template parser. It takes a text file
//                    and replaces tokens with values supplied by the calling code
//                    via a delegate.
//     Author      :  StrayVision Software
//     Date        :  7/20/2008
// ********************************************************************************

using System;
using System.Collections.Generic;
using System.Text;
using System.IO;

namespace TemplateParser
{

  /// <summary>
  ///     TokenParser is a class which implements a simple token replacement parser.
  /// </summary>
  /// <remarks>
  ///     TokenParser is used by the calling code by implementing an event handler for
  ///     the delegate TokenHandler(string strToken, ref string strReplacement)
  /// </remarks>
  public class TokenParser
  {
    private String inputText;
    private String textSourceFile;

    public delegate void TokenHandler(string strToken, ref string strReplacement);
    public event TokenHandler OnToken;

The constructor accepts the path and the name of the source file to be parsed.

C#
public TokenParser(String sourceFile)
{
  textSourceFile = sourceFile;
}

ExtractToken parses a token in the format "[%TOKENNAME%]". It knows nothing about the meaning of the token, only that it is a character string enclosed in [% and %]. It returns the string between these delimiters, with surrounding whitespace trimmed.

C#
private string ExtractToken(string strToken)
{
  int firstPos = strToken.IndexOf("[%")+2;
  int secondPos = strToken.LastIndexOf("%]");
  string result = strToken.Substring(firstPos, secondPos - firstPos);

  return result.Trim();
}
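
As a quick check of that behavior, here is a minimal sketch that copies the same logic into a static helper so it can be called without constructing a TokenParser. The class name ExtractTokenDemo is only an illustration and is not part of the article's download.

```csharp
using System;

public static class ExtractTokenDemo
{
    // Same logic as ExtractToken above, made public and static for experimenting.
    public static string ExtractToken(string strToken)
    {
        int firstPos = strToken.IndexOf("[%") + 2;     // first character after "[%"
        int secondPos = strToken.LastIndexOf("%]");    // position of the closing "%]"
        string result = strToken.Substring(firstPos, secondPos - firstPos);

        return result.Trim();
    }
}
```

With this helper, ExtractToken("[%TESTME%]") returns "TESTME" and ExtractToken("[% LATESTNEWS %]") returns "LATESTNEWS", because of the Trim() call. A string with no closing "%]" makes LastIndexOf return -1, so Substring receives a negative length and throws an ArgumentOutOfRangeException.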

Parse() iterates through each character of the class variable inputText. inputText represents the contents of the HTML file to be parsed. Parse() returns a string representing inputText, with its tokens exchanged for the calling code's values.

C#
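private String Parse()
{
  const string tokenStart = "[";
  const string tokenNext = "%";
  const string tokenEnd = "]";

  String outText = String.Empty;
  String token = String.Empty;
  String replacement;

  int i = 0;
  string tok;
  string tok2;
  int len = inputText.Length;

  while (i < len)
  {
    tok = inputText[i].ToString();
    // A token opens with "[%". Check the bounds before peeking at the next
    // character, and do not advance i unless a token really starts here.
    if (tok == tokenStart && i + 1 < len && inputText[i + 1].ToString() == tokenNext)
    {
      // Collect everything up to and including the closing "]".
      tok2 = String.Empty;
      while (i < len && tok2 != tokenEnd)
      {
        tok2 = inputText[i].ToString();
        token += tok2;
        i++;
      }

      if (token.Length >= 4 && token.EndsWith("%]", StringComparison.Ordinal))
      {
        // Reset the replacement so a handler that sets nothing yields "".
        replacement = String.Empty;
        if (OnToken != null)
          OnToken(ExtractToken(token), ref replacement);
        outText += replacement;
      }
      else
      {
        // Unterminated or malformed token: copy it through unchanged.
        outText += token;
      }
      token = String.Empty;
      continue; // i already points at the first character after the token
    }
    outText += tok;
    i++;
  }
  return outText;
}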
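
For comparison, the same substitution can be sketched with a regular expression instead of a character-by-character scan. This is a different technique from the one the TokenParser uses, shown only as an alternative. The class RegexTokenReplacer and its resolver parameter are hypothetical names, and the sketch assumes that a token never spans more than one line.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexTokenReplacer
{
    // Hypothetical helper, not part of TokenParser: replaces every [%NAME%]
    // in the template with whatever the resolver returns for NAME.
    public static string Replace(string template, Func<string, string> resolver)
    {
        // \[%    literal "[%"
        // (.*?)  the token name, as short as possible
        // %\]    literal "%]"
        return Regex.Replace(
            template,
            @"\[%(.*?)%\]",
            match => resolver(match.Groups[1].Value.Trim()));
    }
}
```

Calling RegexTokenReplacer.Replace("<h1>[%TESTME%]</h1>", t => t == "TESTME" ? "Moose Butts a-Flyin!" : "Huh?") returns "<h1>Moose Butts a-Flyin!</h1>". Text that only looks like the start of a token, such as a lone "[", is left alone because it does not match the pattern.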

The Content() method simply reads the HTML file and returns its unmodified text. This is useful if you have written an application which has tutorials and you want to display the unparsed HTML to demonstrate the use of tokens in templates.

C#
/// <summary>
///     Content() reads the text file specified in the constructor
///     and returns the unparsed text.
/// </summary>
/// <returns>
///     A string representing the unparsed text file.
/// </returns>
public String Content()
{
  string result;
  try
  {
    // The using block closes the reader even if ReadToEnd() throws.
    using (TextReader reader = new StreamReader(textSourceFile))
    {
      inputText = reader.ReadToEnd();
    }
    result = inputText;
  }
  catch (Exception e)
  {
    result = e.Message;
  }
  return result;
}

Now, we get to the important method: ToString(). Once you have implemented the OnToken event, you call ToString() to perform the actual parsing and get the parsed HTML text filled with your values.

C#
    /// <summary>
    ///     This is called to return the parsed text file.
    /// </summary>
    /// <returns>
    ///     A string representing the text file with all its tokens replaced by data
    ///     supplied by the calling code through the TokenHandler delegate
    /// </returns>
    public override string ToString()
    {
      string result;
      try
      {
        // The using block closes the reader even if ReadToEnd() throws.
        using (TextReader reader = new StreamReader(textSourceFile))
        {
          inputText = reader.ReadToEnd();
        }
        result = Parse();
      }
      catch (Exception e)
      {
        result = e.Message;
      }
      return result;      
    }

  }
}

Now that we've seen the TokenParser class, let's take a look at how it is used. Below is listed the whole source for an ASP.NET code-behind page which implements the TokenParser.

Two class level private variables are declared: String pageText, which will hold the parsed HTML text, and TokenParser parser, which is our TokenParser object. In the Page_Load event, we get the path to our template file (this can be any path and file name you want). We then call LoadPage() to do the work.

Notice also that we have implemented a method called OnToken. This is our event handler. You may name it anything you want as long as it takes two parameters: string and ref string. It is in this event that the work of replacing tokens with meaningful values takes place.

These tokens may be called anything you want: [%MYVAR%], [%FOOBAR%], [%LATESTNEWS%], etc. Based on the token passed, you send back in the ref string the HTML code, the JavaScript, or any text you wish.
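
Once a template has more than a couple of tokens, a chain of if/else tests gets unwieldy. One possible approach, sketched here under the assumption that the replacement values are known up front, is to keep them in a dictionary. The TokenTable class and its sample entries are hypothetical and not part of the article's download.

```csharp
using System;
using System.Collections.Generic;

public class TokenTable
{
    // Token names and their replacement text; these entries are placeholders.
    private readonly Dictionary<string, string> values =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "MYVAR", "42" },
            { "FOOBAR", "<em>foo bar</em>" }
        };

    // Matches the TokenHandler signature, so it can be attached with
    // parser.OnToken += table.OnToken;
    public void OnToken(string strToken, ref string strReplacement)
    {
        string value;
        strReplacement = values.TryGetValue(strToken, out value)
            ? value
            : String.Empty; // unknown tokens are replaced with nothing
    }
}
```

Because the dictionary uses a case-insensitive comparer, [%MyVar%] and [%MYVAR%] resolve to the same entry. Whether unknown tokens should disappear, as here, or produce a visible marker is a design choice left to the calling code.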

LoadPage() is the last method in our example. This is where the parser object is instantiated, the OnToken handler attached, and the parser.ToString() method called to retrieve our parsed HTML template. From there, it is simply a matter of sending it via Response.Write() to the user's browser.

C#
using System;
using TemplateParser;

public partial class _Default : System.Web.UI.Page
{
  private String pageText;
  private TokenParser parser;

  protected void Page_Load(object sender, EventArgs e)
  {
    string path = Server.MapPath("template.html");
    LoadPage(path);
  }

  public void OnToken(string strToken, ref string strReplacement)
  {
    if (strToken == "TESTME")
      strReplacement = "Moose Butts a-Flyin!";
    else
      strReplacement = "Huh?";
  }

  private void LoadPage(string fileName)
  {
    parser = new TokenParser(fileName);
    parser.OnToken += this.OnToken;
    pageText = parser.ToString();
    Response.Write(pageText); // send the parsed template to the browser
  }
}

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL).


Written By
CEO StrayVision Software
United States
StrayVision Software was founded by experienced software developer Scott Rosin. After many years of creating and maintaining software projects run by others, he became dissatisfied with the state of software design. Rather than focus on the user experience as the ultimate goal, most software was... and still is... written with the programmer's goals as the end result. Scott wanted to shift the focus from an elegant algorithm to an elegant user experience. Thus was born StrayVision Software.

Comments and Discussions

 
There is a bug in this code if the token is the last text in the file
tthompson2007, 24-Feb-10 6:09
