A Simple CSS Parser

Israel Cris Valenzuela, 25 Feb 2012
A simple CSS parser designed to work with iTextSharp for HTML to PDF generation

Introduction

Cascading Style Sheets allow developers to create nice user interfaces for the web. They are easy to build, use, and maintain. iTextSharp can take advantage of CSS when using its built-in HTML to PDF functionality. Getting the style information from the CSS into iTextSharp requires the developer to read the CSS file and convert it to a Dictionary that iTextSharp can consume. This article illustrates a simple solution for that task. The accompanying solution includes unit tests and an ASP.NET project which demonstrate how to use the CSSParser.

Background

While working on an HTML to PDF utility, I needed to parse Cascading Style Sheets. There are many CSS parsers on the internet, but none fit my needs, so I created this simple regular-expression-based CSS parser in C# to facilitate PDF generation in iTextSharp. The requirements for the CSS parser are as follows:

Requirements

  1. Read a CSS file
  2. Store CSS in a Collection
  3. Query for the classes and their properties
  4. Query for the elements and their properties
  5. Easy to maintain and enhance
  6. Easily feed the style information into iTextSharp to turn HTML into PDF
  7. It should be lean
  8. Something another developer can use

Using the code

The CSSParser inherits from a generic List of KeyValuePair. The key of each pair is the CSS selector. The value is another list of key-value pairs, in which the key is the CSS property name and the value is the property value. I used a generic List instead of a Dictionary because a style sheet can list the same selector, or the same property within a rule, more than once.

public partial class CSSParser : List<KeyValuePair<String,List<KeyValuePair<String,String>>>>, ICSSParser
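
To illustrate why a list is used here, the following is a minimal sketch that builds the same nested shape by hand and repeats a selector. The class DuplicateSelectorDemo and its helper methods are hypothetical names made up for this illustration; they are not part of the CSSParser source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; not part of the article's CSSParser.
public static class DuplicateSelectorDemo
{
    // Builds a list with the same shape the parser inherits from:
    // selector -> list of (property name, property value).
    public static List<KeyValuePair<string, List<KeyValuePair<string, string>>>> BuildRules()
    {
        return new List<KeyValuePair<string, List<KeyValuePair<string, string>>>>
        {
            Rule("p", Decl("color", "red")),
            Rule(".note", Decl("margin", "0")),
            // "p" appears a second time. Dictionary.Add would throw an
            // ArgumentException for this key; a List simply keeps both entries.
            Rule("p", Decl("color", "blue"), Decl("font-weight", "bold"))
        };
    }

    // Returns every declaration for a selector, in source order, across all of its rules.
    public static List<KeyValuePair<string, string>> DeclarationsFor(
        List<KeyValuePair<string, List<KeyValuePair<string, string>>>> rules, string selector)
    {
        return rules.Where(r => r.Key == selector).SelectMany(r => r.Value).ToList();
    }

    private static KeyValuePair<string, string> Decl(string name, string value)
    {
        return new KeyValuePair<string, string>(name, value);
    }

    private static KeyValuePair<string, List<KeyValuePair<string, string>>> Rule(
        string selector, params KeyValuePair<string, string>[] declarations)
    {
        return new KeyValuePair<string, List<KeyValuePair<string, string>>>(
            selector, declarations.ToList());
    }
}
```

Calling DuplicateSelectorDemo.DeclarationsFor(DuplicateSelectorDemo.BuildRules(), "p") returns three declarations, with both color values preserved in the order they were written.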

The core of the CSS parser is a regular expression which I found on Stack Overflow (http://stackoverflow.com/a/2694121/899290). The CSSGroups regular expression breaks the style sheet up into three named groups: selector, name, and value. Before the CSS is parsed, the CSSComments regular expression is used to remove CSS comments from it.

public const String CSSGroups = @"(?<selector>(?:(?:[^,{]+),?)*?)\{(?:(?<name>[^}:]+):?(?<value>[^};]+);?)*?\}";

//[\s\S]*? also matches line breaks, so comments that span several lines are removed as well.
public const String CSSComments = @"(?<!"")\/\*[\s\S]*?\*\/(?!"")";

private Regex rStyles = new Regex(CSSGroups, RegexOptions.IgnoreCase | RegexOptions.Compiled);
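
As a rough, standalone illustration of what the two expressions do, the sketch below strips comments and then lists each selector, property name, and value that the grouping pattern captures. CssRegexDemo, StripComments, and Describe are hypothetical names for this example only. The comment pattern here uses [\s\S]*? as the body of the comment, because in .NET the '.' character does not match a newline unless RegexOptions.Singleline is set.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class; not part of the article's CSSParser.
public static class CssRegexDemo
{
    // Grouping pattern with the named groups selector, name, and value.
    private const string CssGroups =
        @"(?<selector>(?:(?:[^,{]+),?)*?)\{(?:(?<name>[^}:]+):?(?<value>[^};]+);?)*?\}";

    // Comment pattern; [\s\S]*? lets a comment span several lines.
    private const string CssComments = @"(?<!"")/\*[\s\S]*?\*/(?!"")";

    // Removes /* ... */ comments from the style sheet text.
    public static string StripComments(string css)
    {
        return Regex.Replace(css, CssComments, string.Empty);
    }

    // Returns one "selector | name | value" line per captured declaration.
    public static List<string> Describe(string css)
    {
        var lines = new List<string>();
        foreach (Match rule in Regex.Matches(StripComments(css), CssGroups))
        {
            string selector = rule.Groups["selector"].Value.Trim();
            CaptureCollection names = rule.Groups["name"].Captures;
            CaptureCollection values = rule.Groups["value"].Captures;

            // name and value are captured together on each repetition,
            // so the two collections have the same length.
            for (int i = 0; i < names.Count; i++)
            {
                string name = names[i].Value.Trim();
                string value = values[i].Value.Trim();
                // Whitespace before the closing brace can yield a capture that
                // is empty after trimming, so such pairs are skipped.
                if (name.Length > 0 && value.Length > 0)
                {
                    lines.Add(selector + " | " + name + " | " + value);
                }
            }
        }
        return lines;
    }
}
```

For the input "h1{color:red;font-size:12px;}" the Describe method returns the two lines "h1 | color | red" and "h1 | font-size | 12px". Because the pattern matches a flat selector-and-braces shape, nested constructs such as @media blocks are outside what this sketch handles.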

The Read method parses the values in the style sheet and fills the generic List. It uses the .NET Regex class to remove any comments and populate the collection.

public void Read(String CascadingStyleSheet)
{
    this.StyleSheet = CascadingStyleSheet;

    if (!String.IsNullOrEmpty(CascadingStyleSheet))
    {
        //Remove comments before parsing the CSS. Don't want any comments in the collection.
        MatchCollection MatchList = rStyles.Matches(Regex.Replace(CascadingStyleSheet, 
            RegularExpressionLibrary.CSSComments, String.Empty));
        foreach (Match item in MatchList)
        {
            //Check for nulls
            if (item != null && item.Groups != null && 
                item.Groups[SelectorKey] != null && 
                item.Groups[SelectorKey].Captures != null && 
                item.Groups[SelectorKey].Captures[0] != null && 
                !String.IsNullOrEmpty(item.Groups[SelectorKey].Value))
            {
                String strSelector = item.Groups[SelectorKey].Captures[0].Value.Trim();
                var style = new List<KeyValuePair<String,String>>();

                for (int i = 0; i < item.Groups[NameKey].Captures.Count; i++)
                {
                    //The name group holds the CSS property name, e.g. "color"
                    String propertyName = item.Groups[NameKey].Captures[i].Value;
                    String value = item.Groups[ValueKey].Captures[i].Value;
                    //Check for null values in the properties
                    if (!String.IsNullOrEmpty(propertyName) && !String.IsNullOrEmpty(value))
                    {
                        propertyName = propertyName.TrimWhiteSpace();
                        value = value.TrimWhiteSpace();
                        //One more check to be sure we are only pulling valid css values
                        if (!String.IsNullOrEmpty(propertyName) && !String.IsNullOrEmpty(value))
                        {
                            style.Add(new KeyValuePair<String,String>(propertyName, value));
                        }
                    }
                }
                this.Add(new KeyValuePair<String,List<KeyValuePair<String,String>>>(strSelector, style));
            }
        }
    }
}

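Once the list is populated, it is a simple matter of using LINQ and lambda expressions to pull out the information you need. The Classes and Elements properties expose the values of the style sheet as a Dictionary which can be fed to iTextSharp. Note that ToDictionary throws an ArgumentException when the same key occurs twice, so these properties assume that no selector, and no property within a rule, is repeated.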

public Dictionary<String, Dictionary<String,String>> Classes
{
    get
    {
        if (classes == null || classes.Count == 0)
        {
            this.classes = this.Where(cl => cl.Key.StartsWith("."))
                .ToDictionary(cl => cl.Key.Trim(new Char[] { '.' }), cl => cl.Value
                    .ToDictionary(p => p.Key, p => p.Value));
        }

        return classes;
    }
}

public Dictionary<String, Dictionary<String,String>> Elements
{
    get
    {
        if (elements == null || elements.Count == 0)
        {
            elements = this.Where(el => !el.Key.StartsWith("."))
                .ToDictionary(el => el.Key, el => el.Value
                    .ToDictionary(p => p.Key, p => p.Value));
        }
        return elements;
    }
}
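
If a style sheet does repeat a selector or a property, one possible way to still end up with nested dictionaries is to merge the entries so that the last value written wins. The sketch below is an assumption about how that could be done, not the article's implementation; DuplicateSafeLookup and Merge are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; not part of the article's CSSParser.
public static class DuplicateSafeLookup
{
    // Collapses a parsed rule list into nested dictionaries.
    // Repeated selectors are merged, and a repeated property keeps its last value.
    public static Dictionary<string, Dictionary<string, string>> Merge(
        IEnumerable<KeyValuePair<string, List<KeyValuePair<string, string>>>> rules,
        Func<string, bool> selectorFilter)
    {
        var result = new Dictionary<string, Dictionary<string, string>>();
        foreach (var rule in rules.Where(r => selectorFilter(r.Key)))
        {
            Dictionary<string, string> properties;
            if (!result.TryGetValue(rule.Key, out properties))
            {
                properties = new Dictionary<string, string>();
                result[rule.Key] = properties;
            }
            foreach (var declaration in rule.Value)
            {
                // The indexer overwrites an existing key instead of throwing.
                properties[declaration.Key] = declaration.Value;
            }
        }
        return result;
    }
}
```

Passing a filter such as s => !s.StartsWith(".") selects the element rules, and s => s.StartsWith(".") selects the class rules. Whether "last value wins" is the right policy depends on how the resulting styles are consumed.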

Using the CSS Parser

The CSSParser gives you two ways to read a Cascading Style Sheet: from a CSS file or from a string. The ReadCSSFile method reads a CSS file and populates the collections. To read a String containing CSS, call the Read method or pass the CSS to the constructor.

void lnkParseCSSFile_Click(object sender, EventArgs e)
{
    CSSParser parser = new CSSParser();
    parser.ReadCSSFile(Server.MapPath("~/CSSParserStyle.css"));
    //Display the original CSS with some formatting for the web
    this.divOriginalCSS.InnerHtml = parser.StyleSheet.FixLineBreakForWeb().FixTabsForWeb().FixSpaceForWeb();
    //Display the parsed CSS
    this.divParsedCSS.InnerHtml = parser.ToString();
    this.spnOriginalCSSLength.InnerText = parser.StyleSheet.Length.ToString();
    this.spnParsedCSSLength.InnerText = this.divParsedCSS.InnerHtml.Length.ToString();
}

Points of Interest

The Elements and Classes properties of the CSSParser target iTextSharp version 5.x.

History

  1. Version 1.0 - Initial Release

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Israel Cris Valenzuela
Software Developer (Senior) Rain Bird Corporation
United States
I am a C# and SQL application developer. I currently work for Rain Bird as a Senior Application Developer and Technical Lead. I run some development projects and have fun learning new technologies.
I teach martial arts at night to kids through the Young Champions of America youth organization.
Last Updated 25 Feb 2012
Article Copyright 2012 by Israel Cris Valenzuela
Everything else Copyright © CodeProject, 1999-2014