
Adding Regular Expressions to Your App with Regex++

By perlmunger, 17 Jun 2002
 

Introduction

What is a regular expression? In a nutshell, regular expressions provide a simple way to transform raw data into something useable. In the preface of Mastering Regular Expressions (O'Reilly & Associates), Jeffrey Friedl writes:

"There's a good reason that regular expressions are found in so many diverse applications: they are extremely powerful. At a low level, a regular expression describes a chunk of text. You might use it to verify a user's input, or perhaps to sift through large amounts of data. On a higher level, regular expressions allow you to master your data. Control it. Put it to work for you. To master regular expressions is to master your data."

You may not know this, but regular expressions are built into the Microsoft Visual Studio text search tool, which gives you a very powerful way to search for complex patterns in your code (or any text file for that matter). Here are a few links on the web to help you get started with regular expressions if you've never used them before.

Getting Started

Regular expressions, while seemingly difficult to learn, are one of the most powerful tools in a programmer’s arsenal, yet many programmers never take advantage of them. You can certainly write your own text parsers that will get the job done, but doing it that way takes more time, is far more error prone, and is nowhere near as fun (IMHO).

Regex++ is a regular expression library available from http://www.boost.org. Boost provides free peer-reviewed portable C++ source libraries. Take a look at the website to learn more. We are only concerned with Regex++ for our purposes, but you may find many of their libraries useful. The original Regex++ author's website is http://ourworld.compuserve.com/homepages/John_Maddock/

Installing Regex++

Note: The following instructions will only work if you have Visual Studio 6 or 7 installed.

To install Regex++, complete the following steps (detailed instructions are also available in the Regex++ download itself):

  1. Download Regex++ from the original author's website. This way you will only get the regex library and not the entire Boost library.
  2. Unzip to the directory C:\Regex++ (type the path C:\Regex++ into the Extract to: field).
  3. Open a command prompt.
  4. Change directory to C:\Regex++\libs\regex\build. In this directory you will find several make files. The one you are interested in is vc6.mak.
  5. In order to use environment settings from Visual Studio, you must run the batch file vcvars32.bat. This should be in your path, so you shouldn't have to specify a full path to it. Just type vcvars32.bat into your command prompt window.
  6. Type:
    nmake -fvc6.mak
    It will take a little while to build.
  7. Type:
    nmake -fvc6.mak install
    (installs the libs and dlls in the appropriate places)
  8. Type:
    nmake -fvc6.mak clean
    (You may get some errors with this one. I did, but you can just delete the intermediate files manually, if need be)

Now that your library is built and in place, it is ready to use. The project I've included with this article demonstrates how you can do some simple HTML parsing. All you need to do now is open the project and ensure that the project settings point to the appropriate Regex++ lib and include directories. But first, a short discussion.

Note: To add the Regex++ library to your project select Project | Settings.... In the ensuing dialog, select the C/C++ tab. In the Category drop down list, select Preprocessor. In the Additional include directories: edit box enter C:\Regex++. Now select the Link tab. In the Category drop down list, select Input. In the Additional library path: edit box enter C:\Regex++.

Parsing HTML

HTML parsers are nothing new. There is really no reason someone should have to write their own (that I can think of, at least) since the wheel has already been invented. That being said, the example we are going to use does just that: it parses HTML, because parsing HTML makes a good pedagogical example. Specifically, it parses form elements in an HTML document. This is a fairly complex task, but regular expressions make it simple. We want our parser to be generic enough to parse what amount to key/value pairs in any given input field. For instance, in the HTML:

<input type="text" name="address" size=30 maxlength = "100">
we would like to just supply the key name ( e.g. type, name, size, etc. ) and have the regex return that key's corresponding value ( e.g. text, address, 30, etc. ). Notice that some values have quotes and some don't. Some use white space and others don't. These are things we're going to have to account for in our regular expression. We also have to account for a different order for each parameter. For instance this:
<input type="text" name="address" size=30 maxlength = "100">
is the same as this:
<input name="address" type="text" maxlength="100" size="30">

In the sample application example I build a single string from the HTML input file (we'll read the whole file into a CString variable). While this may cause problems on very large files, for our purposes we'll assume that the file is fairly small. We'll need the whole string in order to match across line barriers--but more on that later.
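As a rough illustration of "read the whole file into one string", here is a minimal sketch in standard C++ rather than MFC. The helper name read_whole_stream is hypothetical and is not part of the sample project; it takes any already-open input stream and keeps the line breaks exactly as they appear in the input.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads everything that is left in an already-open stream into one string,
// keeping the line breaks exactly as they appear in the input.
// (Hypothetical helper for illustration; the sample project uses CStdioFile.)
std::string read_whole_stream(std::istream& in)
{
    std::ostringstream buffer;
    buffer << in.rdbuf();   // copies the remaining contents in one call
    return buffer.str();
}
```

To use it on a file, open a std::ifstream on the path and pass it in. As the article notes, holding the whole file in memory is only reasonable for fairly small inputs.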

ParseFile Method

In the ParseFile method we:

  1. Pass in the filename of the HTML file to parse (must contain a <FORM> and input elements (e.g. INPUT, SELECT, TEXTAREA) or you won't see any output. )
  2. Read the whole file into a string
  3. Create a Regular Expression object ( RegEx )
  4. Call Grep on the file string for the pattern we want and place the matches we found into an STL vector
  5. Iterate through each item that was placed into the vector
  6. Call GetActualType() which creates another regex to acquire which type we found (e.g. INPUT, TEXTAREA, SELECT)
  7. Call GetValue() passing the key (e.g. type, name, etc.)
  8. Generate and print out a string with the values we've acquired

Note: The code snippets in this article contain regular expressions that use escape characters. Because these are C/C++ strings being used, these escape characters have to be escaped twice. That is, the regex whitespace escape character (\s) will actually look like this: \\s. And a quotation mark would look like this: \\\" -- the first escapes the backslash and the second escapes the quotation mark.
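To make the double escaping concrete, the sketch below shows what you type in the C++ source next to what the regex engine actually receives. The function names are made up for this illustration. The last function uses a C++11 raw string literal, which is an assumption about your compiler: Visual Studio 6 and 7 do not support it.

```cpp
#include <string>

// Each function returns the pattern text exactly as the regex engine receives it.

// Source text "\\s*" : the compiler turns \\ into one backslash, giving \s*
std::string whitespace_pattern() { return "\\s*"; }

// Source text "\\\"?" : \\ becomes \ and \" becomes ", giving \"?
std::string optional_quote_pattern() { return "\\\"?"; }

// C++11 raw string literal: nothing is unescaped, so this is also \s*
std::string raw_whitespace_pattern() { return R"(\s*)"; }
```

Printing these strings is a quick way to check that a pattern survived the compiler the way you intended.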

BOOL CRegexTestDlg::ParseFile(CString filename)
{
    if (filename.IsEmpty())
    {
        return FALSE;
    }
    CString finalstring;


    this->m_mainEdit.SetWindowText("");

    CStdioFile htmlfile;
    CString inString = "";
    CString wholeFileString = "";
    std::string wholeFileStr = "";

    // Read entire file into a string.
    try{
        if (htmlfile.Open(filename, CFile::modeRead | 
                            CFile::typeText, NULL))
        {
            while (htmlfile.ReadString(inString))
            {
                wholeFileString += inString;
            }
            htmlfile.Close();
        }
    }
    catch (CFileException* e)
    {
        // MFC throws its exceptions by pointer; release it once handled.
        e->Delete();
        MessageBox("The file " + filename + 
                    " could not be opened for reading", 
                    "File Open Failed",
                    MB_ICONHAND|MB_ICONSTOP|MB_ICONERROR );
        return FALSE;
    }

    // Need to convert the CString to an STL string for use in RegEx.
    // The LPCTSTR cast gives read-only access (assumes a non-Unicode build).
    wholeFileStr = (LPCTSTR)wholeFileString;

    // Create our regular expression object
    // TRUE means that we want a match to be case-insensitive
    boost::RegEx expr("(<\\s*(textarea|input|select)\\s*[^>]+>[^<>]*(</(select|textarea)>)?)", 
                      TRUE);

    // Create a vector to hold all matches
    std::vector<std::string> v;

    // Pass the vector and the STL string that hold the file contents
    // to the RegEx.Grep method.
    expr.Grep(v, wholeFileStr);

    // Create char array to hold actual type (e.g. input, select, textarea).
    char actualType[100];

    // vector v is now full of all matches. We iterate through them.
    for(size_t i = 0; i < v.size(); i++)
    {
        // Get the string at the current index
        const std::string& line = v[i];

        // Get a pointer to the beginning of the character array
        const char *buf = line.c_str();

        // Create some temporary storage variables
        char name[100];
        char typeName[100];

        // Build output string.
        finalstring += "input, textarea, select?: ";
        GetActualType(buf, actualType);
        finalstring += actualType;
        finalstring += " -- ";
        GetValue("name", buf, name);
        finalstring += "name: ";
        finalstring += name;
        finalstring += " -- ";
        finalstring += "input type is: ";

        // If it's an input, get the type of input
        // (e.g. text, password, checkbox, radio, etc.)
        if(_stricmp("input", actualType) == 0)
        {
            GetValue("type", buf, typeName);
            finalstring += typeName;
        }
        // Otherwise, it doesn't apply.
        else
        {
            finalstring += "N/A";
        }
        finalstring += "\r\n";
        
    }


    // Populate text field with items
    this->m_mainEdit.SetWindowText(finalstring);

    return TRUE;

}

In this method notice specifically the lines:

// Create our regular expression object
boost::RegEx expr("(<\\s*(textarea|input|select)\\s*[^>]+>[^<>]*(</(select|textarea)>)?)", 
                  TRUE);

// Create a vector to hold all matches
std::vector<std::string> v;

// Pass the vector and the STL string that hold the file contents
// to the RegEx.Grep method.
expr.Grep(v, wholeFileStr);

The expr object gets constructed with a pattern. I will break down the pattern as follows:

(<\s*                          // Match on an open tag "<" and zero or
                               // more white space characters

(textarea|input|select)\s*[^>]+> // 1. Match on either textarea, input, or select
                                 // 2. \s* looks for zero or more white space
                                 //    characters next
                                 // 3. [^>]+> matches one or more characters that
                                 //    are not a ">" until we find the end ">"

[^<>]*                         // Match on zero or more characters that are not
                               // "<" or ">"

(</(select|textarea)>)?) // Match on an end tag "</" and either a select or
                         // a textarea. The question mark means that everything
                         // inside the preceding parentheses is optional
                         // (i.e. 0 or 1 occurrences).

Note: In this previous description escape characters are not escaped twice. This is the way the actual regular expression would look if you printed it out.
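The pattern broken down above can be tried outside of MFC with a small sketch. It uses C++11 std::regex as a stand-in for boost::RegEx, so this is not the Regex++ API; the function name find_form_elements is hypothetical. The loop plays the role of Grep(): every whole match is pushed into a vector.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collects every form-element tag found in html, in document order.
// std::regex stands in for boost::RegEx here (an assumption for illustration).
std::vector<std::string> find_form_elements(const std::string& html)
{
    static const std::regex pattern(
        "(<\\s*(textarea|input|select)\\s*[^>]+>[^<>]*(</(select|textarea)>)?)",
        std::regex::icase);

    std::vector<std::string> matches;
    for (std::sregex_iterator it(html.begin(), html.end(), pattern), end;
         it != end; ++it)
    {
        matches.push_back(it->str());   // the whole match, like one Grep() result
    }
    return matches;
}
```

Given a form containing an INPUT and a TEXTAREA, the vector holds one string per element; other tags such as <form> or <p> are skipped because they do not start with one of the three names.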

Just as a reminder the regex operators above mean:

Character | Description | Usage
* | Match zero or more of the previous expression. | "\s*" -- zero or more white space chars
+ | Match one or more of the previous expression. | "\s+" -- one or more white space chars
[^] | Negated set. | "[^<]" -- match any char that is not a less-than "<" char. Can be a list of characters to negate (e.g. [^<>/] -- match anything that is not a less-than, a greater-than, or a forward slash)

The Grep method takes a reference to the vector created above it. After the Grep call, the vector will contain all matches found. Using Grep() as opposed to Search() (which is another useful method), will allow you to match across line barriers. This is important for a file you read in--especially HTML files that allow for a fairly loose format. For instance this:

<input type="text" name="name">

is the same as this:

<input type="text"
        name="name">

in any web browser. We need to account for this. If you are wondering about case-sensitivity, look at the instantiation of the RegEx object. The second parameter is a boolean. This indicates whether you would like it to be case-insensitive--which we do in the example code.
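Both points, line breaks inside a tag and case-insensitivity, can be checked with a short sketch. Again std::regex is used in place of boost::RegEx as an assumption, and contains_input_tag is a name invented for this example.

```cpp
#include <regex>
#include <string>

// True when text contains an <input ...> tag, regardless of letter case
// or of line breaks between its attributes.
bool contains_input_tag(const std::string& text)
{
    // [^>]+ is a negated set, so it also consumes '\n' inside the tag,
    // and \s matches a line break as well as a space.
    static const std::regex pattern("<\\s*input\\s+[^>]+>", std::regex::icase);
    return std::regex_search(text, pattern);
}
```

The single-line and the wrapped form of the tag shown above are both accepted, as is an upper-case <INPUT ...>.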

If you would like further information about the Boost Regex++ library API, take a look at the documentation on http://www.boost.org.

GetActualType Method

In the GetActualType method we extract the type of input field we're dealing with on the current line. Remember that in the ParseFile method we made sure that there was at least one input type of some sort, so this line is pretty much guaranteed to have one. Here is the method implementation:

BOOL CRegexTestDlg::GetActualType(const char *line, char *type)
{
    // Create a pattern to look for.
    const char* pattern = "<\\s*((input|textarea|select))\\s*";
    // Create RegEx object with pattern. Should be case-insensitive
    RegEx exp(pattern, TRUE);

    // Search for the pattern. Use Search, not Grep since we have a single line.
    if(exp.Search(line))
    {
        // If found, copy the text of the first expression match to the
        // type variable.
        strcpy(type, exp[1].c_str());
        return TRUE;
    
    }
    // We didn't find anything. Just copy an empty string.
    strcpy(type, "");
    return FALSE;
}

Take a look at the pattern itself:

const char* pattern = "<\\s*((input|textarea|select))\\s*";

Here we are saying: look for an opening angle bracket "<" and possibly some white space. Then look for either "input", "textarea", or "select". Then there may be some more white space. Notice the two sets of parentheses around input|textarea|select. The inner set of parens tells us that this is a set of possible values. The pipe (|) (a.k.a. "or") tells us that a match could contain any one of the three values. The outer parens capture what we did find into a special variable. So, if you ran this HTML code through our parser:

<input type= "text" name="email" size="20">

exp[1] would now contain the word "input". If your line had other parens for capturing a part of the match, they would be placed in exp[n] where n is the current set of parens counted left to right, outside to inside.
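The group numbering can be seen in a small sketch. It uses std::regex instead of boost::RegEx (an assumption for illustration), where the match object m plays the role of exp; the function name actual_type is hypothetical.

```cpp
#include <regex>
#include <string>

// Returns the element name (input, textarea or select) found in a tag,
// or an empty string when the tag is none of the three.
std::string actual_type(const std::string& tag)
{
    static const std::regex pattern("<\\s*((input|textarea|select))\\s*",
                                    std::regex::icase);
    std::smatch m;
    if (std::regex_search(tag, m, pattern))
    {
        // m[0] is the whole match; m[1] is the outer group, m[2] the inner one.
        // With this pattern both groups hold the same text.
        return m[1].str();
    }
    return "";
}
```

Note that the captured text keeps the case used in the HTML, so an upper-case tag yields "SELECT" rather than "select". That is why the article compares the result with _stricmp.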

GetValue Method

In the GetValue method we pass in a key to look for and a pointer to the variable we want to populate with the value.

void CRegexTestDlg::GetValue(char *key, const char *str, char *val)
{
    const char* tmpStr = "\\s*=\\s*\\\"?([^\"<>]+)\\\"?";

    char final[100];
    // We need to build the string so we know exactly what we're looking for.
    strcpy(final, key);
    strcat(final, tmpStr);

    // Create the RegEx object with the pattern.
    RegEx exp(final);
    // Search for the pattern.
    if(exp.Search(str))
    {
        // If found, copy what we found.
        strcpy(val, exp[1].c_str());
    }
    else
    {
        // Otherwise copy a string with the no<key> where <key> is the key passed in.
        sprintf(val, "no%s", key);
    }
}

Take a look at this expression:

const char* tmpStr = "\\s*=\\s*\\\"?([^\"<>]+)\\\"?";

This is our most complex pattern yet. First we look for some possible whitespace, an equals sign, and some more possible whitespace. Then we're looking for an opening quote. The question mark means 0 or 1 of the previous expression, so if the HTML didn't include an opening quote, we are accounting for that. That is if the line looked like either of the following (notice the quotation marks), it would still find a match:

<input type="text" name="email">
<input type=text name=email>

Next we're looking for any character(s) except a quotation mark ("), an opening angle bracket (<), or a closing angle bracket (>). This is our value. Notice that there are parens around this value because we want to capture it into our special variable exp[n]. Finally, we look for an optional closing quotation mark.
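One caveat worth checking against your own input: with an unquoted value, [^"<>]+ keeps consuming characters until it reaches a quotation mark or an angle bracket, so for size=30 maxlength = "100" the captured value would run on past 30. The sketch below is one possible stricter variant, not the article's method. It uses std::regex instead of boost::RegEx, the name attribute_value is hypothetical, and it assumes the key contains no regex metacharacters.

```cpp
#include <regex>
#include <string>

// Looks up key=value in a single tag. The value may be double-quoted or bare.
// Assumption: key is a plain attribute name with no regex metacharacters.
std::string attribute_value(const std::string& key, const std::string& tag)
{
    // \b keeps "type" from matching inside a longer name such as "enctype".
    // Group 1 captures a quoted value, group 2 an unquoted one that ends
    // at the first white space, quote, or angle bracket.
    const std::regex pattern(
        "\\b" + key + "\\s*=\\s*(?:\"([^\"]*)\"|([^\\s\"<>]+))",
        std::regex::icase);

    std::smatch m;
    if (!std::regex_search(tag, m, pattern))
    {
        return "no" + key;   // same fallback text as GetValue
    }
    return m[1].matched ? m[1].str() : m[2].str();
}
```

Splitting the quoted and unquoted cases into two alternatives costs a slightly longer pattern, but each alternative then has an unambiguous end.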

This is the end of our need for regular expressions. We now have the value we were looking for and can format it and output it in the edit box. What you do with the values is up to you, but now you have all you need to parse HTML accurately and effectively. The example code may need some tweaking, but in general it gets the job done.

Running The Example

The example application I've included parses an HTML file that contains a form. For convenience, I've included an HTML form file in the project. The filename is contact_form.html and it can be found in the root directory of the project. When you run the application, simply click the "Browse..." button and select this file. Then click "Try It!"

Conclusion

While we could have built our parser using strtok or other tokenizers, these are not completely ideal for HTML since HTML can be so free form (e.g. a space here, quotes there, but not there, line wrap, etc.). Regular expressions are perfectly suited for just this sort of text parsing.

Regex++ is a very robust regular expression library that you will find very useful in your applications. Take a look at the example project and familiarize yourself with regular expression syntax. This will give you the ability to create powerful text parsers with minimal coding and will enable you to "master your data".

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

perlmunger
Web Developer
United States
Matt Long is the Director of Technology for Skye Road Systems, Inc. in Colorado Springs, Colorado. He provides software architecture consulting services to small businesses. To contact Matt ( perlmunger ) send an email to matt@skyeroadsystems.com.


Comments and Discussions

 
[General] Re: Problem with regular expression -- perlmunger, 7 Jan '05 - 4:50
That is the same thing you said in your first post. It doesn't make any more sense the second time around. I can see at least one problem with your regex, though, so I will try to help with that.
 
You seem to be trying to escape the exclamation mark, which doesn't make any sense. Exclamation marks are not special characters. Ignoring the escape you have on your exclamation mark, here is how I would interpret your regex:
 
Find an opening curly brace followed by a quotation mark followed by zero or more characters that are anything except a quotation mark followed by a quotation mark followed by an exclamation point followed by a single quote followed by zero or more characters that are anything except a single quote followed by a single quote followed by a closing curly brace.
 
The actual regex should look like this: \{"[^"]*"!'[^']*'\}
 
This regex would match something like this:
 
{"Bob was here"!'And so was Marge'}
 
However it will also match this:
 
{""!''}
 
If you want to ensure it matches something inside your quotation marks and single quotes, replace the * (match zero or more) with + (match one or more).
 
Hope that helps. Let me know if you have further questions.
 
-Matt
 
------------------------------------------
 
The 3 great virtues of a programmer:
Laziness, Impatience, and Hubris.
--Larry Wall
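The difference between * and + described in the reply above can be checked with a short sketch. It uses std::regex rather than Regex++ (an assumption for illustration), and matches_pair is a hypothetical helper name.

```cpp
#include <regex>
#include <string>

// Tests text against  \{"[^"]*"!'[^']*'\}  when allow_empty is true,
// or against the same pattern with + in place of * when it is false.
bool matches_pair(const std::string& text, bool allow_empty)
{
    const std::string q = allow_empty ? "*" : "+";
    const std::regex pattern("\\{\"[^\"]" + q + "\"!'[^']" + q + "'\\}");
    return std::regex_match(text, pattern);
}
```

With allow_empty set to true the input {""!''} is accepted; with false it is rejected, while {"Bob was here"!'And so was Marge'} is accepted either way.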
[General] Re: Problem with regular expression -- Mihai Ciumeica, 28 Jun '05 - 5:10
I think what the original poster wanted was something along the lines of ("[^"]")|('[^']'), which will search for strings enclosed by quotation marks or single quotes. He must have mistaken the '!' for the '|'.
[General] Problems at compile time -- cosmin_ciuc@hotmail.com, 14 Sep '04 - 2:08
I'm trying to use the Regexp library inside a VC6 project that uses another library too. But, I'm getting the following compiler error when I try to include :
 
c:\program files\microsoft visual studio\vc98\include\new(35) : error C2059: syntax error : 'string'
c:\program files\microsoft visual studio\vc98\include\new(35) : error C2091: function returns function
c:\program files\microsoft visual studio\vc98\include\new(35) : error C2809: 'operator new' has no formal parameters
c:\program files\microsoft visual studio\vc98\include\new(36) : error C2059: syntax error : 'string'
c:\program files\microsoft visual studio\vc98\include\new(37) : error C2091: function returns function
c:\program files\microsoft visual studio\vc98\include\new(37) : error C2556: 'void *(__cdecl *__cdecl operator new(void))(unsigned int,const struct std::nothrow_t &)' : overloaded function differs only by return type from 'void *(__cdecl *__cdecl operator new(void))(unsigned int)'
c:\program files\microsoft visual studio\vc98\include\new(35) : see declaration of 'new'
d:\work\regexp\boost\detail\allocator.hpp(279) : fatal error C1506: unrecoverable block scoping error
 
Can you help me, please?
[General] Re: Problems at compile time -- perlmunger, 14 Sep '04 - 5:04
I think it's trying to link against the wrong library or something. Make sure that your STL includes precede your MFC includes. If that doesn't work, then do a search on Google and paste in your error messages. I find this to be an effective way to find discussions on specific errors.
 
Best Regards.
 
-Matt
 
[General] For VC 7.1 users -- Anthony_Yio, 28 Jul '04 - 20:02
Hello perlmunger,
 
I have just downloaded Regex++ from the link you pointed out in this article. However, after some checking, I found that Dr Maddock does not update the library on his site as regularly as the version in Boost.
I compared the makefiles available for Regex++ in Boost 1.31 with the ones on the Regex++ author's web page. It seems that the newer version of Regex++ comes with an additional makefile named vc71.mak for VC7.1 users.
So, I guess the proper place to download Regex++ would be the Boost web site instead?
 


 
Sonork 100.41263:Anthony_Yio

[General] Re: For VC 7.1 users -- perlmunger, 29 Jul '04 - 4:39
It hasn't been my top priority to update this article, though I intend to. I think you are right and I will change the link once I do get a chance.
 
Thank you.
 
-Matt
 
[General] Re: For VC 7.1 users -- Anthony_Yio, 29 Jul '04 - 17:15
BTW, thank you for your great article.

 
[General] Re: For VC 7.1 users -- blizzymadden, 19 Jul '05 - 15:09
Looks like this doesn't work with VC7.1. I was able to build the library, but I get nothing but link errors. The instructions are silly because they say you don't need to specify any lib folder or include any lib files in your project, but that obviously is not the case.
[General] Re: For VC 7.1 users -- Anthony_Yio, 27 Jul '05 - 16:53
Try to use the Regex++ in the Boost library instead. The one on Maddock's web site was not up to date (at the time I checked). Rebuild the lib with the makefile specifically for VC7.1. It works for me.
 
[General] boost_regex_vc6_mdid.dll not found -- Hockey, 11 May '04 - 13:58
EDIT: Never mind, I figured it out... thanks again for the great intro to boost :)
 
I am getting this error, any ideas how I would correct it?
 
It says reinstalling the application may fix this problem...
 
I have searched my hardrive for the file mentioned and it is not found...
 
Do I have to compile the regex library also? Isn't this done automagically when you run the required MAK files...?
 
Also, I had some difficulties in getting everything to compile because I had no idea how to use boost (I still don't). I had to dig through your source code and find the
 
using namespace boost;
 
// Required by regex boost library
#include
#include
#include
 
IMHO you may want to include this as a step in your article :)
 
Cheers :)
 

 
How do I print my voice mail?
[General] Broken links -- DavidCrow, 1 Apr '04 - 9:51
The two links to www.boost.org (right above the GetActualType paragraph) do not work.
 

"The pointy end goes in the other man." - Antonio Banderas (Zorro, 1998)


[General] Well -- Anonymous, 29 Mar '04 - 17:54
How can I statically link to the library? I don't think that everyone will have the DLL on their machine.
 
As right now, I am working on a MFC dialog based application.
 
If I build using the setup as on the site, I can build fine and things work on my machine, but as soon as I go on another machine it starts asking for the dlls.
 
If I select to link statically to MFC then it works fine, but not if I select to link to MFC as a shared library.
 
Thanks again for all your help
 

[General] Please help in using the sample... -- Seve Ho, 2 Dec '03 - 15:37
Hi, I am a C++ beginner and failed to get the sample from this article working.
 
I am using Visual C++ 7 and installed boost as stated in the article.
 
When I try to compile and run the project, an error message appears as follows:
 

"
RegexTest.exe - Unable To Locate Component
 
This application has failed to start because boost_regex_vc7_mdid.dll was not found. Re-installing the application may fix this problem."
 
It seems to me that the source compiled successfully but failed to find "boost_regex_vc7_mdid.dll" at runtime... I am not sure, but it may be a path settings problem.
 
Does anyone have any suggestion or solution to my problems?
[General] Re: Please help in using the sample... -- perlmunger, 5 Dec '03 - 9:46
I apologize for the inconvenience, however, this article (and subsequent project) has not been updated to work with Visual C++ 7 yet. I am trying to find the time to do just that, but am terribly busy lately.
 
If you feel so inclined, you can follow the build instructions that come with boost::regex to build the missing dll that is mentioned in your error message. There is a separate make file in the boost distribution build directory just for this purpose. In all likelihood (if you followed the instructions in *this* article), you only built the boost_regex_vc6 dll and not the vc7 dll.
 
Thanks and good luck.
 
-Matt
 
[Answer] Re: Please help in using the sample... [modified] -- Alireza_nemat, 26 Dec '06 - 19:41
If you want this DLL, please post your email address and I will send it to you, or send an email to me: arna4458@yahoo.com
 

-- modified at 12:36 Wednesday 27th December, 2006
[Question] How to print LPCTSTR in c++? -- Anonymous, 12 Sep '03 - 13:17
Hi,
 
I am still pretty confused by the string types. How can I print a LPCTSTR variable in C++ so that it is readable?
 
For example,
 
LPCTSTR mystr = (LPCTSTR).....;
 
Can I use cout or printf to print it?
 
cout<<mystr<<endl;
or
printf("%s\n", mystr);
 

Thanks,
Peter
[Answer] Re: How to print LPCTSTR in c++? -- perlmunger, 12 Sep '03 - 18:14
I don't see why not. Have you tried?
 
Frankly, I don't normally respond to anonymous posters because if you don't care to take the time to log in and indicate who you are, you probably are just shooting off a question without caring whose time you waste.
 
Just this once, however, I will try it out and see if you actually check back and see my answer.
 
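A LPCTSTR is simply a const TCHAR*. TCHAR is a Windows typedef (not specific to MFC) that is normally an alias for char, so cout and printf("%s") work as shown. However, if your code has the Unicode preprocessor flag set, TCHAR becomes wchar_t, a 16-bit wide character on Windows, and you would need the wide variants (wcout, wprintf) or _tprintf instead.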
 
I hope you're not asking a question about something that could have been easily tested.
 
-Matt
 
p.s. A simple google search yielded many links with answers to this question. The first thing any serious programmer should learn is to investigate the answer to a question before asking it. Of course we all get hasty from time to time, but I'm not even sure you considered finding it on your own.
 
[Question] How to use exceptions -- Hans Dietrich, 30 Aug '03 - 10:47
First of all, thank you for this article. I have been trying to incorporate regex++ in my app, and in testing it I accidentally entered a bad regular expression, which caused an assert. So I am trying now to add exception handling. Please take a look at this and tell me if I am on right track:
    try
    {
        // Create our regular expression object
        boost::RegEx expr(m_strRegexp, FALSE);
 
        // Pass the vector and the STL string
        // to the RegEx.Grep method.
        expr.Grep(v, stdstr);
    }
    catch (const boost::bad_expression& be)
    {
        m_List.AddString(_T("ERROR: bad_expression"));
        const char *buf = be.what();
        m_List.AddString(buf);
        MessageBeep((UINT)-1);
    }
    catch (const std::exception& e)
    {
        m_List.AddString(_T("ERROR: std exception"));
        const char *buf = e.what();
        m_List.AddString(buf);
        MessageBeep((UINT)-1);
    }
    catch (...)
    {
        m_List.AddString(_T("ERROR: unknown exception"));
    }

[Answer] Re: How to use exceptions -- perlmunger, 30 Aug '03 - 11:45
First you need to understand the difference between an assertion and an exception. They are different. Assertions are used to ensure that a particular statement is true or false (must happen or the program fails). They are generally only useful for programmers while building their applications. An exception, on the other hand, is used to handle a problem that may happen, but can be anticipated and dealt with when the application is production code. Your exception handling looks fine to me. If your program caused an assertion, however (as you stated in your question), then you are not going to be able to fix the problem with exceptions. You have to find the line of code where the assertion failed using the debugger and find out which statement failed to fulfill the requirements of the assertion. Usually when an assertion fails, there's a problem with the code.
 
Hope that helps. Let me know if you need more clarification or have further questions.
 
-Matt
 
[General] Re: How to use exceptions -- Hans Dietrich, 30 Aug '03 - 13:03
Thanks for the quick reply. I probably should not have called it an assertion. The error was reported by the "Microsoft Visual C++ Runtime Library" and the text of the error said that the runtime was terminated abnormally. Again, this was due to a badly formed re. For an example, try "ab(", and this is the error you will get. Fortunately, the try...catch handler catches it.
 
Best wishes,
Hans
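The exchange above boils down to catching the exception that a badly formed pattern throws at construction time. Here is a minimal sketch of that idea using the standard library, where std::regex_error plays roughly the role that boost::bad_expression plays in the posted code. That mapping, and the helper name try_compile, are assumptions made for this illustration.

```cpp
#include <regex>
#include <string>

// Tries to compile a pattern. Returns true on success; on failure returns
// false and stores the library's message in error.
bool try_compile(const std::string& pattern, std::string& error)
{
    try
    {
        std::regex compiled(pattern);
        (void)compiled;           // only the constructor's success matters here
        error.clear();
        return true;
    }
    catch (const std::regex_error& e)
    {
        error = e.what();
        return false;
    }
}
```

Calling it with the "ab(" example from the post returns false, because the opening parenthesis is never closed. Validating a user-supplied pattern this way before searching keeps the failure out of the search loop.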

[Question] Using wide characters? -- Anonymous, 29 Jul '03 - 4:24
Website seems to say it supports wide characters, but I can't find any examples or documentation that uses wide characters.
[General] getting text from custom tags.. -- Muhammad Ahmed, 25 Jun '03 - 4:38
Hi:
First of all, thanks for a useful article. I am parsing a file containing custom tags as follows..
<:: 1 some text here... ::>
<:: 2 some text here... ::>
<:: 3 some text here... ::>
<:: 4 some text here... ::>
I want to extract the text between these tags...
To get each pair of tags into a vector I am using the following reg-exp
"<::( *?)::>" It is working fine; now each vector element contains something like this
 
<:: some text here... ::>
 
Now I am iterating through the vector to get the text within each pair using the following regular expression
"[^<][^:][^:](.*)[^:][^:][^>]" //this expression is not working OK, can you help me please.. :)
Regards
Muhammad Ahmed
 

 
ahmed
[General] Re: getting text from custom tags.. -- perlmunger, 25 Jun '03 - 6:58
It would help if you would show me some code. But the regex you need to extract the text out is this "<::\s+([^\s]+)\s+::>" .
 
So if your RegEx variable was named "exp", the value you want is now in exp[1] because of the capturing parentheses.
 
Here's some code:
// create a vector to store captured strings
std::vector<std::string> capturedStrings;
// assume vector v is populated with the
// list of strings like <:: 1 some text here ::>
for( int i = 0; i < v.size(); ++i )
{
     std::string line = (std::string)v[i];
     RegEx exp( "<::\\s+([^\\s]+)\\s+::>", TRUE );
     if( exp.Search(line) )
     {
          /// found a match. do something with exp[1]
          capturedStrings.push_back( exp[1] );
     }
}
// Now capturedStrings is populated with the strings
// inside your tags. Do whatever you want with them now.
Just to clarify, what the regex means is this: "look for <:: followed by one or more whitespace characters, then capture anything that is not whitespace and then look for one or more whitespace characters again and then look for ::>".
 
Does that help/make sense?
 
-Matt
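One thing to watch with the pattern in the reply above: [^\s]+ stops at the first whitespace character, so a tag body that contains spaces, such as "1 some text here...", would not be captured as a whole. The sketch below is one possible alternative under that reading. It uses std::regex instead of Regex++, and tag_bodies is a hypothetical helper name.

```cpp
#include <regex>
#include <string>
#include <vector>

// Returns the text between each <:: and ::> pair, without the padding spaces.
std::vector<std::string> tag_bodies(const std::string& text)
{
    // (.*?) is a lazy capture, so it stops at the first ::> that follows.
    // The \s* on either side keeps surrounding white space out of the capture.
    // '.' does not match a line break here, so each tag must sit on one line.
    static const std::regex pattern("<::\\s*(.*?)\\s*::>");

    std::vector<std::string> bodies;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end;
         it != end; ++it)
    {
        bodies.push_back((*it)[1].str());
    }
    return bodies;
}
```

Because the capture is lazy, two tags on the same line are still returned as two separate entries.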
 
[General] Regex++ causing abnormal program termination -- annum, 26 Mar '03 - 8:23
Here is my code
 
CStdioFile myfile,newFile;
CString inString = "";
char* fString="";
 
//char *pattern="((\\s*lmp\\s*))";
char *pattern="\\s*(LMP\\([0-9]+\\)).*)\\n";
//***********The line below when reached //causes abnormal termination,dont know /whats the problem Please help
RegEx exp(pattern,TRUE);
std::string wholeFileStr="";
CString wholeFileString = "",filname="c:\\test23.txt",nName="c:\\MunnaMunna.txt";

 
// Read entire file into a string.
try{
myfile.Open(filname,CFile::modeRead | CFile::typeText, NULL);
newFile.Open(nName,CFile::modeWrite , NULL);
}
 
catch (CFileException e)
{
MessageBox("The file " + filname + " could not be opened for reading", "File Open Failed", MB_ICONHAND|MB_ICONSTOP|MB_ICONERROR );
//return FALSE;
//myfile.Close();
//newFile.Close();
}
try{


while (myfile.ReadString(inString))
{//newFile.WriteString(inString);
//wholeFileString += inString;
wholeFileStr = inString.GetBuffer(2);
//
//const char *wholeFileStr=(LPCTSTR)inString;//.GetBuffer(10);

// RegEx exp("(\\*sLMP((0-9)+\)).*)\n$",TRUE);
// RegEx exp("(\s*LMP\([0-9]+\))",TRUE);
// RegEx exp("\s*LMP",TRUE);
if(exp.Search(wholeFileStr))
{
//strcpy(fString,exp[1].c_str());
CString sd(exp[1].c_str());
AfxMessageBox(sd);
}
else if(fString!="")
{ CString temp(fString);
//CString temp2(wholeFileStr);
temp+=inString;
temp+="\n";
//strcat(fString,wholeFileStr);
//strcat(fString,"\n");
newFile.WriteString(temp);
}
 
}
//***************************************************************


}
catch (CFileException e)
{
MessageBox("The file " + filname + " could not be opened for reading", "File Open Failed", MB_ICONHAND|MB_ICONSTOP|MB_ICONERROR );
//return FALSE;
//myfile.Close();
//newFile.Close();
}

newFile.Close();
myfile.Close();
[General] Re: Regex++ causing abnormal program termination -- perlmunger, 26 Mar '03 - 11:20
Unfortunately, this problem could be any number of things. My gut feeling is that there is something wrong with your Regex++ build/install. What version of visual studio are you using? Did you do the Regex++ build for that version? (there are two make files--one for VC6 and one for VC7).
 
Beyond that, you're just going to have to debug it and see if you possibly have code failing somewhere else.
 
I'm sorry I don't have a clearer answer for you.
 
Good luck.
 
-Matt
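Another thing that may be worth checking in the posted code is the pattern itself: \s*(LMP\([0-9]+\)).*)\n contains one unescaped "(" but two unescaped ")", so it is not a well-formed expression, and an uncaught exception from the RegEx constructor could look like an abnormal termination. Whether that is the actual cause here is a guess. The sketch below shows a balanced variant, using std::regex instead of Regex++ and a hypothetical helper name find_lmp.

```cpp
#include <regex>
#include <string>

// Returns the "LMP(<digits>)" token from a line, or "" when there is none.
std::string find_lmp(const std::string& line)
{
    // One capturing group; the literal parentheses are escaped: LMP\([0-9]+\)
    static const std::regex pattern("\\s*(LMP\\([0-9]+\\)).*", std::regex::icase);
    std::smatch m;
    return std::regex_search(line, m, pattern) ? m[1].str() : std::string();
}
```

Counting unescaped parentheses before running a new pattern is a cheap first check when the constructor is the line that fails.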
 

Last Updated 17 Jun 2002
Article Copyright 2002 by perlmunger
Everything else Copyright © CodeProject, 1999-2013