Licence CPOL
First Posted 16 Mar 2007

Using Regular Expressions for Parsing

By wjhu111 | 16 Mar 2007 | Article
The derived classes currently support the RTF and HTML formats
Screenshot - SyntaxDemo.gif

Introduction

This article shows how to use regular expressions for syntax analysis, and how to use the analysis results to render the text in HTML or RTF format.

Background

Regular Expression Analysis:

public virtual bool Analyze(string ACode)
{
    if (FSyntaxItems.Count <= 0) return false;
    if (ACode == null) return false;
    AnalyzeResluts.Clear();
    string vCode = ACode;
    bool vFind = true;
    // Repeatedly peel one token off the front of the remaining text.
    while (vFind && (vCode.Length > 0))
    {
        vFind = false;
        foreach (SyntaxItem vSyntaxItem in FSyntaxItems)
        {
            // Evaluate the pattern once and reuse the match.
            Match vMatch = Regex.Match(vCode, vSyntaxItem.Pattern,
                vSyntaxItem.Options);
            // Accept only a non-empty match at the start of the text;
            // an empty match would never shorten vCode.
            if (vMatch.Success && vMatch.Index == 0 && vMatch.Length > 0)
            {
                AnalyzeResluts.Add(new AnalyzeReslut(vSyntaxItem,
                    vMatch.Value));
                // Remove only the matched prefix, not every occurrence.
                vCode = vCode.Substring(vMatch.Length);
                vFind = true;
                break;
            }
        }
    }
    return true;
}
  • SyntaxEngineClass: the base class of the parsing engine. A derived class defines its grammar by adding items to the SyntaxItems property.
  • SyntaxHighlight: the base class of the highlighting engine. A derived class defines colors and font styles by adding items of type HighlightItem.
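
As a rough illustration of the highlighting step, the sketch below turns a sequence of (item name, matched text) pairs into HTML. It is not the article's SyntaxHighlight class: the HtmlHighlightSketch type, its Render method, and the color table are hypothetical names chosen for this example, and the input pairs stand in for the analysis results.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text;

public static class HtmlHighlightSketch
{
    // Hypothetical color table keyed by syntax item name (an assumption,
    // not taken from the article's source).
    private static readonly Dictionary<string, string> Colors =
        new Dictionary<string, string>
        {
            { "LineComment", "green" },
            { "MultiComment", "green" }
        };

    // tokens: (item name, matched text) pairs in source order.
    public static string Render(
        IEnumerable<KeyValuePair<string, string>> tokens)
    {
        StringBuilder html = new StringBuilder("<pre>");
        foreach (KeyValuePair<string, string> token in tokens)
        {
            // Escape the text so that source code containing < or & stays intact.
            string text = WebUtility.HtmlEncode(token.Value);
            string color;
            if (Colors.TryGetValue(token.Key, out color))
            {
                html.Append("<span style=\"color:").Append(color)
                    .Append("\">").Append(text).Append("</span>");
            }
            else
            {
                // Items without a color entry are emitted unstyled.
                html.Append(text);
            }
        }
        return html.Append("</pre>").ToString();
    }
}
```

An RTF renderer could follow the same shape, emitting RTF color-table references instead of span elements.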

The supporting classes are shown below (see the source code for details):

public class SyntaxItem
{
    private string FPattern;
    private RegexOptions FOptions; 
    private string FName; 
    private int FIndex; 

    public string Pattern { get { return FPattern; } } 
    public RegexOptions Options { get { return FOptions; } }
    public string Name { get { return FName; } }
    public int Index { get { return FIndex; } }

    public SyntaxItem(string APattern, RegexOptions AOptions,
        string AName, int AIndex)
    {
        FPattern = APattern;
        FOptions = AOptions;
        FName = AName;
        FIndex = AIndex;
    }
}

public class AnalyzeReslut
{
    private SyntaxItem FItem;
    private string FBlock; 

    public SyntaxItem Item { get { return FItem; } }
    public string Block { get { return FBlock; } }

    public AnalyzeReslut(SyntaxItem AItem, string ABlock)
    {
        FItem = AItem;
        FBlock = ABlock;
    }
}

To support other languages, write regular expressions modeled on the following code:

SyntaxItems.Add(new SyntaxItem(@"^\s+", RegexOptions.None,
    "Whitespace", SyntaxItems.Count));
SyntaxItems.Add(new SyntaxItem(@"^\/\/[^\n]*[\n]?", RegexOptions.None,
    "LineComment", SyntaxItems.Count));
// Singleline lets '.' match newlines, so a comment can span several lines.
SyntaxItems.Add(new SyntaxItem(@"^\/\*.*?\*\/", RegexOptions.Singleline,
    "MultiComment", SyntaxItems.Count));
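
To see how patterns like these drive the analysis, here is a minimal, self-contained sketch of the same consume-from-the-front loop. It does not use the article's classes: TokenizeSketch and its rule table are hypothetical, and the sketch assumes RegexOptions.Singleline for every rule so that '.' can cross line breaks inside a block comment.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TokenizeSketch
{
    // Each rule is { name, pattern }; every pattern is anchored with '^'.
    private static readonly string[][] Rules =
    {
        new[] { "Whitespace", @"^\s+" },
        new[] { "LineComment", @"^\/\/[^\n]*[\n]?" },
        new[] { "MultiComment", @"^\/\*.*?\*\/" }
    };

    // Returns "Name:text" entries for each token recognized at the front
    // of the input; stops at the first text that no rule matches.
    public static List<string> Tokenize(string code)
    {
        List<string> result = new List<string>();
        bool found = true;
        while (found && code.Length > 0)
        {
            found = false;
            foreach (string[] rule in Rules)
            {
                Match m = Regex.Match(code, rule[1], RegexOptions.Singleline);
                // Skip empty matches so that every pass shortens the input.
                if (m.Success && m.Length > 0)
                {
                    result.Add(rule[0] + ":" + m.Value);
                    code = code.Substring(m.Length);
                    found = true;
                    break;
                }
            }
        }
        return result;
    }
}
```

Calling TokenizeSketch.Tokenize on a snippet that starts with whitespace, a line comment, and a block comment yields one entry per token, in source order. Text that no rule recognizes ends the loop, so a real grammar would need rules for every kind of token.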

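Every regular expression you add must begin with '^', so that it matches only at the start of the remaining text, and it must not be able to match an empty string. A zero-length match consumes no input, so an analysis loop that accepts it would cycle forever.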
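
One way to catch such patterns early is a quick check before registering them. The helper below is a hypothetical sketch, not part of the article's code, and it is only a heuristic: it tests the pattern against an empty input, so it will miss patterns that match zero characters only in front of specific text (for example, a bare lookahead).

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternCheckSketch
{
    // Heuristic: the pattern must be anchored with '^' and must not
    // match the empty string.
    public static bool LooksSafe(string pattern, RegexOptions options)
    {
        if (!pattern.StartsWith("^", StringComparison.Ordinal)) return false;
        return !Regex.IsMatch(string.Empty, pattern, options);
    }
}
```

For instance, LooksSafe accepts @"^\s+" but rejects @"^\s*", which can match nothing at all, and @"\s+", which is not anchored.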

History

  • 17th March, 2007: Version 1.0

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

wjhu111



China

Member

zswang

Last Updated 17 Mar 2007
Article Copyright 2007 by wjhu111