Intermediate
License: The Code Project Open License (CPOL)
Notepad RE (Regular Expressions)
By Ben Hanson
Search and replace text in Notepad RE using Regular Expressions or normal mode. The editor supports drag and drop, file change notifications, and displays the line and column numbers. Unicode support is available too.
VC6, Win2K, WinXP, MFC, Dev
This is a simple Notepad replacement. The main feature is that you can Search and Replace optionally using regular expressions. The boost::regex library is used for regex support. Note that the intention is for the boost::regex library to eventually become part of the C++ Standard Library. Replace All is improved compared to normal Notepad, as it builds a new text file in memory and replaces the entire text at once when it has finished. This is much quicker than replacing every match in the edit window as you go along.
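The Replace All approach described above can be sketched as follows. This is only an illustration, not the code from Notepad RE: it uses std::regex (the standardized interface that grew out of boost::regex) so that it compiles without Boost, and the function name replace_all is hypothetical.

```cpp
#include <iterator>
#include <regex>
#include <string>

// Build the whole replaced text in a separate buffer and return it in one
// piece. A caller would then set the edit control's text once, rather than
// updating the control after every individual match.
// (replace_all is a hypothetical name, not taken from the Notepad RE source.)
std::string replace_all(const std::string& text,
                        const std::string& pattern,
                        const std::string& replacement)
{
    const std::regex re(pattern);
    std::string out;
    out.reserve(text.size());

    // regex_replace copies unmatched text through unchanged and writes the
    // replacement for every match, all into the output iterator.
    std::regex_replace(std::back_inserter(out),
                       text.begin(), text.end(), re, replacement);
    return out;
}
```

For example, replace_all("cat hat bat", "[ch]at", "dog") yields "dog dog bat" in a single pass over the text.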
As the development of Notepad RE progresses, more sophisticated features are being added.
CEditView class - notepadre.cpp
A Regular Expression is simply some text. I think it is safe to assume that anyone who has used a modern computer will have used Find and/or Replace dialogs in more than one application that allows text processing, whether it is Notepad, a word processing program, or a web browser. At the simplest level, a regular expression is no different from the text you type into the edit field of a Find dialog. Where regular expressions differ from normal text is that they give special meaning to certain characters, allowing you to specify textual 'patterns' rather than just literal text. The special characters are the following:
'.', '|', '*', '?', '+', '(', ')', '{', '}', '[', ']', '^', '$' and '\'.
These characters are often known as 'metacharacters' in the jargon of regular expressions. If you have ever typed something like...
*.txt
... or something similar, then you are already familiar with the concept of characters having special meaning in a piece (string) of text. Wildcards -- i.e. the characters '*' and '?' -- used when navigating most computer file systems are a massively simplified version of regular expressions. As well as being able to match any character ('?') or any string ('*'), regular expressions allow you to specify ranges of characters that can match, repeating textual patterns, alternative matching patterns and even matching positions within text. Note that in the syntax of regular expressions, the wildcard character '?' becomes '.' and '*' becomes '.*'.
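To make the wildcard comparison concrete, here is a small sketch that translates a wildcard into a regular expression using exactly the mapping given above ('?' becomes '.', '*' becomes '.*'). The function names are hypothetical, and std::regex is used in place of boost::regex so the example needs no extra libraries.

```cpp
#include <regex>
#include <string>

// Translate a file-system wildcard into an equivalent regular expression.
// '?' becomes '.', '*' becomes '.*', and any other regex metacharacter is
// escaped with '\' so that it matches literally.
// (wildcard_to_regex and wildcard_match are hypothetical helper names.)
std::string wildcard_to_regex(const std::string& wildcard)
{
    static const std::string metachars = ".|+(){}[]^$\\";
    std::string out;

    for (char c : wildcard)
    {
        if (c == '?')
        {
            out += '.';
        }
        else if (c == '*')
        {
            out += ".*";
        }
        else
        {
            if (metachars.find(c) != std::string::npos)
                out += '\\';
            out += c;
        }
    }
    return out;
}

// True if the whole of name matches the wildcard.
bool wildcard_match(const std::string& name, const std::string& wildcard)
{
    return std::regex_match(name, std::regex(wildcard_to_regex(wildcard)));
}
```

With this, the wildcard *.txt becomes the regular expression .*\.txt, where the '.' before txt has to be escaped because it is itself a metacharacter.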
If you have never used regular expressions before, then once you have learned the syntax you are in for a pleasant surprise. Once you have mastered their use, you will never look back! The official reference for the boost regular expression library is here. See this Regular Expression Primer for a very basic description. The book Mastering Regular Expressions is very good for when you really want to get in-depth!
Visit Boost.org to obtain the boost regular expressions library.
These instructions are for building under Visual C++ version 6.0
Get it here.
Notepad RE supports ANSI, Unicode, Big Endian Unicode and UTF-8 file formats. Additionally, Windows, UNIX and Macintosh line endings are supported, including files with inconsistent line endings. The file handling routines are the trickiest parts of Notepad RE.
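The following is a rough sketch of the kind of work such file handling involves. It is not taken from the Notepad RE source: it assumes the format is recognised from a byte-order mark (BOM) at the start of the file, which is a common convention but only an assumption here, and all names are hypothetical. A UTF-8 file saved without a BOM would be reported as ANSI by this sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names for the four formats listed above.
enum class Encoding { Ansi, Utf16LE, Utf16BE, Utf8 };

// Guess the encoding from a byte-order mark at the start of the raw bytes.
// Assumption: files without a BOM are treated as ANSI.
Encoding detect_encoding(const std::string& bytes)
{
    auto byte_at = [&bytes](std::size_t i) {
        return static_cast<unsigned char>(bytes[i]);
    };

    if (bytes.size() >= 3 &&
        byte_at(0) == 0xEF && byte_at(1) == 0xBB && byte_at(2) == 0xBF)
        return Encoding::Utf8;
    if (bytes.size() >= 2 && byte_at(0) == 0xFF && byte_at(1) == 0xFE)
        return Encoding::Utf16LE;   // "Unicode" in Notepad terminology
    if (bytes.size() >= 2 && byte_at(0) == 0xFE && byte_at(1) == 0xFF)
        return Encoding::Utf16BE;   // "Big Endian Unicode"
    return Encoding::Ansi;
}

// Convert Windows ("\r\n"), UNIX ("\n") and Macintosh ("\r") line endings,
// even when mixed in one file, to "\r\n" as expected by a Windows edit control.
std::string normalize_line_endings(const std::string& text)
{
    std::string out;
    out.reserve(text.size());

    for (std::size_t i = 0; i < text.size(); ++i)
    {
        if (text[i] == '\r')
        {
            out += "\r\n";
            // Skip the '\n' of an existing "\r\n" pair.
            if (i + 1 < text.size() && text[i + 1] == '\n')
                ++i;
        }
        else if (text[i] == '\n')
        {
            out += "\r\n";
        }
        else
        {
            out += text[i];
        }
    }
    return out;
}
```

A real implementation would also need to remember the original line ending style so the file can be saved back in the same form.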
The regular expression syntax is now selectable under the Options menu.
I've aimed to provide default search functionality with as many possibilities and as few surprises as possible. The basic aim is to provide functionality based on vi, but with several improvements.
- [[:CLASS:]] syntax is allowed
- {x,y} syntax allowed
- \1, \2 etc. are allowed
- \ character is the escape character inside [...]
- + is supported (of course)
- ? is supported (of course)
- | is supported (of course)
- $1, $2, $3 etc. in the Replace field to use captured text
- .* matches characters on the current line, like vi. To continue a match to the next line, follow .* with \r\n
- \r and \n are treated as whitespace. For example, if you use \s+ as part of your regex, you may be surprised to find you have matched text across lines
- $ works like it does in vi, but may also be followed by \r\n if you want to match the 'newline' character
- std::tr1 interface to boost::regex
This MFC version of Notepad RE will be improved until it is as close to Windows Notepad as possible. After that, I may rewrite it as a WTL program.
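Some of the search behaviour described above (captured text via $1 and $2, backreferences such as \1, .* stopping at the end of the line, and \s+ running across lines) can be tried out with a short sketch. It uses std::regex with its default ECMAScript syntax rather than boost::regex with the syntax selected in Notepad RE, so treat it as an approximation; the helper names are hypothetical.

```cpp
#include <regex>
#include <string>

// Return the text of the first match of pattern in text, or an empty
// string if there is no match. (first_match is a hypothetical helper.)
std::string first_match(const std::string& text, const std::string& pattern)
{
    const std::regex re(pattern);
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str(0) : std::string();
}

// Swap every "name=value" pair, using $1 and $2 in the replacement text
// to refer to the two captured groups. (swap_pairs is hypothetical too.)
std::string swap_pairs(const std::string& text)
{
    const std::regex re("(\\w+)=(\\w+)");
    return std::regex_replace(text, re, "$2=$1");
}
```

Given the two-line text "one two\r\nthree", the pattern t.* matches only "two" because '.' does not match '\r' or '\n' in this syntax, whereas two\s+three and two.*\r\nthree both match across the line break. The pattern (\w)\1 uses a backreference to find a doubled character, such as "ll" in "hello".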
FALSE
OnKeyUp function!
false if subsequently all edits are undone!
CNotepadreFile::CountCharsUTF8() fixed
CRegexSyntaxDlg
PeformGrep()
CFile::modeRead | CFile::shareDenyNone
PerformGrep() wasn't counting newlines from the beginning of the file!
\b(?:)\b)
Last Updated: 16 Apr 2009. Editor: Deeksha Shenoy
Copyright 2003 by Ben Hanson. Everything else Copyright © CodeProject, 1999-2009.