Licence CPOL
First Posted 22 Jul 2003

Notepad RE (Regular Expressions)

By | 22 Mar 2011 | Article
Search and replace text in Notepad RE using Regular Expressions or normal mode. The editor supports drag and drop, file change notifications, and displays the line and column numbers. Unicode support is available too.
Screenshot - notepadre.png

Introduction

This is a simple Notepad replacement. Its main feature is Search and Replace, optionally using regular expressions. The boost::regex library provides the regex support. Note that the intention is for the boost::regex library to eventually become part of the C++ Standard Library. Replace All is improved compared to standard Notepad: it builds the new text in memory and replaces the entire contents of the edit window in one go when it has finished. This is much quicker than replacing each match in the edit window as you go along.
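The Replace All approach described above can be sketched roughly as follows. This is an illustration only, not the code from notepadreDoc.cpp: it uses std::regex in place of boost::regex so that it compiles without Boost, and the names ReplaceResult and replace_all_in_memory are hypothetical.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical result type: the rebuilt text plus the number of replacements.
struct ReplaceResult
{
    std::string text;
    std::size_t count;
};

// Build the complete replacement text in memory in a single pass.
// The caller would then set the edit window's text once, rather than
// updating the control for every individual match.
ReplaceResult replace_all_in_memory(const std::string& text,
                                    const std::regex& re,
                                    const std::string& replacement)
{
    ReplaceResult result{std::string(), 0};
    result.text.reserve(text.size());

    std::string::const_iterator last = text.cbegin();
    const std::sregex_iterator end;

    for (std::sregex_iterator it(text.cbegin(), text.cend(), re); it != end; ++it)
    {
        const std::smatch& m = *it;
        result.text.append(last, m[0].first);   // unmatched text before this match
        result.text += m.format(replacement);   // expands $1, $2, ... in the replacement
        last = m[0].second;
        ++result.count;
    }

    result.text.append(last, text.cend());      // unmatched tail after the last match
    return result;
}
```

The point of the design is that the edit control is touched once, however many matches there are.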

As the development of Notepad RE progresses, more sophisticated features are being added.

Features

  • Find and Replace using regex - notepadreDoc.cpp
  • Find and Replace in normal mode - notepadreDoc.cpp
  • GREP/Find in Files capability - FindInFilesDlg.cpp
  • Multiple Undo/Redo - notepadreView.cpp
  • Dockable Find and Replace dialogs - MainFrm.cpp
  • Find will wrap from bottom to top -- or top to bottom, depending on search direction -- if necessary - notepadreDoc.cpp
  • If the file you are editing is changed by another process, you have the option of being asked if you want to reload - notepadreDoc.cpp
  • Line and column displayed in status bar - MainFrm.cpp, called from notepadre.cpp
  • You can drop files from Explorer into the text as path/filename by clearing Options->Drop Files - MainFrm.cpp
  • You can drag and drop text to and from the edit window to and from other applications that support drag and drop - notepadreView.cpp
  • You can re-open an existing file, something that does not work in the standard CEditView class - notepadre.cpp
  • You can open a text file bigger than 1 MB - notepadreView.cpp
  • Unicode is supported - notepadreFile.cpp
  • You can open and re-save UNIX text files correctly - notepadreDoc.cpp
  • The Find/Replace dialog is written from scratch - FindReplaceDlg.cpp
  • Help file included - MainFrm.cpp
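As an illustration of the wrap-around Find listed above, here is a minimal sketch of a forward search that restarts from the top when nothing is found below the caret. It is not the code from notepadreDoc.cpp: it uses std::regex rather than boost::regex, covers only the downward direction, and FindResult and find_next_with_wrap are hypothetical names.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical result type for a single Find operation.
struct FindResult
{
    bool found;          // true if a match was located
    bool wrapped;        // true if the search had to restart from the top
    std::size_t pos;     // offset of the match in the text
    std::size_t length;  // length of the match
};

// Search forwards from 'start'; if that fails, wrap and search from the top.
FindResult find_next_with_wrap(const std::string& text,
                               const std::regex& re,
                               std::size_t start)
{
    if (start > text.size())
        start = text.size();

    std::smatch m;
    // When starting mid-text, tell the matcher that a preceding character
    // exists, so that assertions such as \b can look behind the start point.
    const std::regex_constants::match_flag_type flags =
        start > 0 ? std::regex_constants::match_prev_avail
                  : std::regex_constants::match_default;

    const std::string::const_iterator from =
        text.cbegin() + static_cast<std::string::difference_type>(start);

    if (std::regex_search(from, text.cend(), m, re, flags))
    {
        return {true, false,
                static_cast<std::size_t>(m[0].first - text.cbegin()),
                static_cast<std::size_t>(m.length(0))};
    }

    if (start > 0 && std::regex_search(text.cbegin(), text.cend(), m, re))
    {
        return {true, true,
                static_cast<std::size_t>(m[0].first - text.cbegin()),
                static_cast<std::size_t>(m.length(0))};
    }

    return {false, false, 0, 0};
}
```

An upward search would mirror this by scanning the text before the caret first and then wrapping to the bottom.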

What are Regular Expressions?

A Regular Expression is simply some text. I think it is safe to assume that anyone who has used a modern computer will have used Find and/or Replace dialogs in more than one application that allows text processing, whether it is Notepad, a word processing program, or a web browser. At the simplest level, a regular expression is no different from the text you type into the edit field of a Find dialog. Where regular expressions differ from normal text is that they give special meaning to certain characters, allowing you to specify textual 'patterns' rather than just literal text. The special characters are the following:

'.', '|', '*', '?', '+', '(', ')', '{', '}', '[', ']', '^', '$' and '\'.

These characters are often known as 'metacharacters' in the jargon of regular expressions. If you have ever typed something like...

*.txt

... or something similar, then you are already familiar with the concept of characters having special meaning in a piece (string) of text. Wildcards -- i.e. the characters '*' and '?' -- used when negotiating most computer file systems are a massively simplified version of regular expressions. As well as being able to match any character ('?') or any string ('*'), regular expressions allow you to specify ranges of characters that can match, repeating textual patterns, alternative matching patterns and even matching positions within text. Note that in the syntax of regular expressions, the wildcard character '?' becomes '.' and '*' becomes '.*'.
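To make the wildcard comparison concrete, the following sketch translates a file-system wildcard into the equivalent regular expression: '?' becomes '.', '*' becomes '.*', and any regex metacharacter is escaped so that it matches literally. The function names are hypothetical, and std::regex is used instead of boost::regex so the example builds without Boost.

```cpp
#include <regex>
#include <string>

// Translate a wildcard such as "*.txt" into a regex such as ".*\.txt".
std::string wildcard_to_regex(const std::string& wildcard)
{
    // The metacharacters listed earlier in this article.
    static const std::string metachars = ".|*?+(){}[]^$\\";
    std::string out;

    for (const char c : wildcard)
    {
        if (c == '?')
        {
            out += '.';            // any single character
        }
        else if (c == '*')
        {
            out += ".*";           // any run of characters
        }
        else
        {
            if (metachars.find(c) != std::string::npos)
                out += '\\';       // escape so the character is matched literally
            out += c;
        }
    }

    return out;
}

// True if the whole of 'name' matches the wildcard.
bool wildcard_match(const std::string& name, const std::string& wildcard)
{
    return std::regex_match(name, std::regex(wildcard_to_regex(wildcard)));
}
```

Note that the '.' in "*.txt" has to be escaped: left alone, it would match any character rather than a literal dot.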

If you have never used regular expressions before, then once you have learned the syntax you are in for a pleasant surprise. Once you have mastered their use, you will never look back! The official reference for the boost regular expression library is here. See this Regular Expression Primer for a very basic description. The book Mastering Regular Expressions is very good for when you really want to get in-depth!

Getting the Boost library

Visit Boost.org to obtain the boost regular expressions library.

Building the Boost Library

These instructions are for building under Visual C++ version 6.0:

  • Download the ZIP file
  • Unzip the contents to C:\
  • From a command prompt:
    • C:\>"C:\Program Files\Microsoft Visual Studio\VC98\Bin\vcvars32.bat"
    • Ensure the environment variable 'include' includes the path "C:\Program Files\Microsoft Visual Studio\VC98\include"
    • Ensure environment variable 'lib' includes the path "C:\Program Files\Microsoft Visual Studio\VC98\lib"
    • C:\>cd C:\boost_1_39_0\libs\regex\build
    • C:\boost_1_39_0\libs\regex\build>nmake /f vc6.mak
  • Wait until the build finishes (you might want to get a coffee..!)
  • Add C:\boost_1_39_0 to your includes (Tools, Options, Directories, Include files from the VC menu)
  • Add C:\boost_1_39_0\libs\regex\build\vc6 to your library path (Tools, Options, Directories, Library Files from the VC menu)

Getting the Microsoft HTML Help Workshop

Get it here.

Installing HTML Help

  • Download htmlhelp.exe from the link above
  • Run htmlhelp.exe, installing to the default directory C:\Program Files\HTML Help Workshop
  • Add C:\Program Files\HTML Help Workshop\include to your includes (Tools, Options, Directories, Include files from the VC menu)
  • Add C:\Program Files\HTML Help Workshop\lib to your library path (Tools, Options, Directories, Library Files from the VC menu)

Program Design

File Handling

Notepad RE supports ANSI, Unicode, Big Endian Unicode and UTF-8 file formats. Additionally, Windows, UNIX and Macintosh line endings are supported, including files with inconsistent line endings. The file handling routines are the trickiest part of Notepad RE.
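To give a flavour of what this involves, here is a minimal sketch of two common building blocks: guessing the encoding from a byte order mark (BOM), and normalising mixed line endings. Both are assumptions about how such a loader might work, not a description of notepadreFile.cpp. In particular, UTF-8 files without a BOM are not detected here, and the names Encoding, detect_encoding and to_windows_line_endings are hypothetical.

```cpp
#include <cstddef>
#include <string>

enum class Encoding { Ansi, Utf8, Utf16LE, Utf16BE };

// Guess the encoding from a leading byte order mark.
// Anything without a recognised BOM is treated as ANSI.
Encoding detect_encoding(const std::string& bytes)
{
    const auto b = [&bytes](std::size_t i)
    {
        return static_cast<unsigned char>(bytes[i]);
    };

    if (bytes.size() >= 3 && b(0) == 0xEF && b(1) == 0xBB && b(2) == 0xBF)
        return Encoding::Utf8;
    if (bytes.size() >= 2 && b(0) == 0xFF && b(1) == 0xFE)
        return Encoding::Utf16LE;   // "Unicode" in Notepad terms
    if (bytes.size() >= 2 && b(0) == 0xFE && b(1) == 0xFF)
        return Encoding::Utf16BE;   // "Big Endian Unicode"
    return Encoding::Ansi;
}

// Convert Windows (\r\n), UNIX (\n) and classic Macintosh (\r) line endings
// to \r\n, even when a single file mixes them.
std::string to_windows_line_endings(const std::string& text)
{
    std::string out;
    out.reserve(text.size());

    for (std::size_t i = 0; i < text.size(); ++i)
    {
        const char c = text[i];

        if (c == '\r')
        {
            out += "\r\n";
            if (i + 1 < text.size() && text[i + 1] == '\n')
                ++i;                // \r\n pair: consume the \n as well
        }
        else if (c == '\n')
        {
            out += "\r\n";
        }
        else
        {
            out += c;
        }
    }

    return out;
}
```

A real loader would also need to remember the original line ending style so that the file can be saved back in the same form.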

Regular Expression Syntax

The regular expression syntax is now selectable under the Options menu.

Matching, Including Over More Than One Line

I've aimed to provide default search functionality with the maximum number of possibilities and the minimum number of surprises. The basic aim is to provide functionality based on vi, but with several improvements.

  • 'Char Classes' are supported (i.e. [[:CLASS:]] syntax is allowed)
  • 'Intervals' are supported (i.e. {x,y} syntax allowed)
  • 'Back References' are supported (i.e. \1, \2 etc. are allowed)
  • 'Escape in Lists' is supported (i.e. the \ character is the escape character inside [...])
  • + is supported (of course)
  • ? is supported (of course)
  • | is supported (of course)
  • Use Perl-like variables $1, $2, $3 etc. in the Replace field to use captured text
  • .* matches characters on the current line, like vi. To continue a match to the next line, follow .* with \r\n
  • Note that characters \r and \n are treated as whitespace. For example, if you use \s+ as part of your regex, you may be surprised to find you have matched text across lines
  • $ works like it does in vi, but may also be followed by \r\n if you want to match the 'newline' character
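Several of the points above can be tried out with a small sketch. It uses std::regex with its default ECMAScript syntax as a stand-in for boost::regex, so it only approximates Notepad RE's behaviour. In particular, the vi-like handling of $ followed by \r\n is not reproduced. The helper names first_match and swap_pairs are hypothetical.

```cpp
#include <regex>
#include <string>

// Return the first match of 'pattern' in 'text', or an empty string if none.
std::string first_match(const std::string& text, const std::string& pattern)
{
    std::smatch m;
    return std::regex_search(text, m, std::regex(pattern)) ? m.str(0)
                                                           : std::string();
}

// Swap "key=value" pairs using the Perl-like variables $1 and $2
// in the replacement text.
std::string swap_pairs(const std::string& text)
{
    return std::regex_replace(text, std::regex("(\\w+)=(\\w+)"), "$2=$1");
}
```

With the text "first line\r\nsecond line", the pattern first.* stops at the end of the first line, first.*\r\nsecond continues onto the next line, and line\s+second also crosses the line break because \r and \n count as whitespace.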

References

  • "The C++ Programming Language Special Edition" by Bjarne Stroustrup
  • "Advanced Windows" by Jeffrey Richter, Microsoft Press
  • "The Essence of COM with ActiveX, a Programmer's Workbook" by David S. Platt, Prentice Hall
  • "Mastering Regular Expressions" by Jeffrey E. F. Friedl, O'Reilly
  • "Professional MFC with Visual C++ 6" by Mike Blaszczak, Wrox Press Inc

Future Work

  • Popup menu in Replace dialog for regex replace syntax
  • Investigate syntax highlighting
  • Use the std::tr1 interface to boost::regex
  • Use Microsoft Unicode routines when loading/saving
  • HEX view

This MFC version of Notepad RE will be improved until it is as close to Windows Notepad as possible. After that, I may rewrite it as a WTL program.

History

  • 23 July, 2003
    • Original version posted
  • 2 June, 2007: Version 1.1.0.1
    • Multiple Undo/Redo added
  • 4 June, 2007: Version 1.1.0.2
    • BUG FIX: Replace with empty string works again!
    • Group characters for undo
    • Undoing all changes sets modified flag to FALSE
    • Replacing a selection now treated as an atomic undo/redo
  • 10 June, 2007: Version 1.1.0.3
    • BUG FIX: Clear Undo history when toggling word wrap
  • 12 June, 2007: Version 1.1.0.4
    • BUG FIX: Forgot to add the OnKeyUp function!
  • 14 June, 2007: Version 1.1.0.5
  • 16 June, 2007: Version 1.1.0.6
  • 17 June, 2007: Version 1.1.0.7
    • Added first cut of Find in Files
  • 21 June, 2007: Version 1.1.0.8
    • If Modified flag set before toggling word wrap -- therefore flushing the undo buffer -- don't set to false if subsequently all edits are undone!
    • Various tweaks to Find in Files
  • 27 June, 2007: Version 1.1.0.9
    • BUG FIX: A sequence of replacements is no longer treated as one big transaction by Undo
  • 3 July, 2007: Updated help file
  • 4 July, 2007: Version 1.1.1.0
    • Added popup menu to Find and Replace dialogs for regex syntax
  • 6 July, 2007: Version 1.1.1.1
    • BUG FIX: A sequence of replacements is no longer treated as one big transaction by Redo
    • Finished popup menu in Find and Replace dialogs for regex syntax
  • 10 July, 2007: Version 1.1.1.2
    • Find in Files now sends output to a dockable toolbar
    • Changed tab order in Replace dialog
    • Changed 'Number' regex to be PERL mode friendly
  • 7 August, 2007: Version 1.1.1.3
    • BUG FIX: Shift-Del only creates one Undo entry now!
  • 8 August, 2007: Version 1.1.1.4
    • BUG FIX: Ctrl-C works again...
  • 10 October, 2007: Version 1.1.1.5
    • BUG FIX: Saving with word wrap enabled no longer saves too much text
    • Find in Files now runs in the background
  • 26 June, 2008: Version 1.1.1.6
    • BUG FIX: Check for Non-Windows line endings in CNotepadreFile::CountCharsUTF8() fixed
    • Help file correction (thanks har0ld)
    • Fixes to CRegexSyntaxDlg
  • 17 March, 2009: Version 1.1.1.7
    • Selected text copied to Find and Replace dialogs
  • 23 March, 2009: Version 1.1.1.8
    • Re-enabled ".LOG" support
  • 24 March, 2009: Version 1.1.1.9
    • Find in Files now supports multi-line matching
  • 26 March, 2009: Version 1.1.2.0
    • Sped up Find in Files (make sure you use \r\n for multi-line matching)
  • 27 March, 2009: Version 1.1.2.1
    • Fixed memory leak in PerformGrep()
    • Selected text copied to Find in Files dialog
    • Find in Files now opens with CFile::modeRead | CFile::shareDenyNone
  • 27 March, 2009: Version 1.1.2.2
    • PerformGrep() wasn't counting newlines from the beginning of the file!
  • 29 March, 2009: Version 1.1.2.3
    • More improvements to Find in Files (more responsive, displays progress, etc.)
  • 7 April, 2009: Version 1.1.2.4
    • Double clicking the results from Find in Files now goes to correct line even with word wrap enabled
  • 9 April, 2009: Version 1.1.2.5
    • Ensure only one line is shown per match in the Find in Files results
    • Enable checkbox for Whole Word Only for regex mode (This is a convenience feature. All that happens is that the regex is wrapped in \b(?:)\b)
  • 16 April, 2009: Version 1.1.2.6
    • Changed regex whole word only syntax depending on regex flavour. Still not perfect as some flavours do not support this feature at all, but at least most work correctly now.
  • 8 January, 2011
    • Updated zip file
  • 19 March, 2011: Version 1.1.2.8
    • Added support for loading and saving toolbar positions
  • 21 March, 2011: Version 1.1.2.9 
    • Uses SetWindowPlacement() as it is more accurate than MoveWindow()
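The Whole Word Only convenience added in version 1.1.2.5 wraps the regex in \b(?:)\b. A minimal sketch of that wrapping is shown below, on the assumption that the user's expression goes inside the non-capturing group. std::regex stands in for boost::regex, and the function names are hypothetical.

```cpp
#include <regex>
#include <string>

// Wrap a user-supplied regex so that it only matches whole words.
// The non-capturing group (?:...) keeps an alternation such as cat|dog
// together and leaves the numbering of the user's own groups unchanged.
std::string whole_word_pattern(const std::string& pattern)
{
    return "\\b(?:" + pattern + ")\\b";
}

// True if 'text' contains a whole-word match for 'pattern'.
bool contains_whole_word(const std::string& text, const std::string& pattern)
{
    return std::regex_search(text, std::regex(whole_word_pattern(pattern)));
}
```

Without the group, \bcat|dog\b would apply each \b to only one side of the alternation. As the 1.1.2.6 entry notes, not every regex flavour supports \b, so this form only suits flavours that do.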

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Ben Hanson

Software Developer (Senior)

United Kingdom


I started programming in 1983 using Sinclair BASIC, then moved on to Z80 machine code and assembler. In 1988 I programmed 68000 assembler on the Atari ST, and in 1990 I started my degree in Computing Systems, where I learnt Pascal, C and C++ as well as various academic programming languages (ML, LISP etc.).
 
I have been developing commercial software for Windows using C++ for 15 years.


Comments and Discussions

 
Question: What the heck??? This is exactly what I need! (Daniel Cohen Gindi, 9:43 15 Oct '07)
Hi!
 
Thanks for this! This notepad is what I've been looking for for about 10 years...
When I was a child I knew an old guy who was working on an advanced editor which did the same (much more than this one, but basically the same). It was for DOS, and it was like the project of his life... It was capable of finding and replacing such complex expressions that I didn't know who the heck would ever need them. But it was his personal project, and as far as I know, he never shared it, or even published it as software/shareware...
 
Your Notepad RE is going to save me soooo much headache!
 
Thanks again...
Daniel
 
-----
Daniel Cohen Gindi
danielgindi (at) gmail dot com


Article Copyright 2003 by Ben Hanson