Find and Replace with Regular Expressions






An article on using regular expressions to implement Find and Replace functionality

Introduction
This article demonstrates how to use regular expressions and the Regex class (System.Text.RegularExpressions namespace) to build the find and replace functionality found in most text editors and word processors. Features such as whole-word search and case-sensitive or case-insensitive search are much easier to implement with regular expressions than with other methods. To demonstrate further, I have included a wildcard search, in which * represents a group of characters and ? represents a single character. For example, if you enter a* and check the "Use Wildcards" checkbox, all words starting with a are selected. In addition, I have included a regular expression search for readers who prefer to write the pattern themselves.
A word of caution, though: this article is not a regular expression reference. The article and the accompanying code use regular expressions at a very basic level. If you want to learn how to write and understand regular expressions, I would recommend An Introduction to Regular Expressions by Uwe Keim or The 30 Minute Regex Tutorial by Jim Hollenhorst.
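To make the claim about whole-word and case-insensitive search concrete, here is a minimal sketch. The class and method names (SearchOptionsSketch, CountMatches) are hypothetical and not part of the accompanying project; the sketch only assumes the standard Regex API.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the article's project.
public static class SearchOptionsSketch
{
    // Counts occurrences of a literal search term.
    // wholeWord and matchCase mirror the two checkboxes described above.
    public static int CountMatches(string text, string term, bool wholeWord, bool matchCase)
    {
        // Escape the term so characters such as '.' or '+' are matched literally.
        string pattern = Regex.Escape(term);

        if (wholeWord)
        {
            // \b matches at the boundary between a word character and a non-word character.
            pattern = @"\b" + pattern + @"\b";
        }

        RegexOptions options = matchCase ? RegexOptions.None : RegexOptions.IgnoreCase;
        return Regex.Matches(text, pattern, options).Count;
    }
}
```

Both options come down to a small change in the pattern or in the RegexOptions value, which is the reason the rest of the article builds everything on Regex.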
Designing the Form
This project has only one form, named FindAndReplaceForm, and no other forms or classes. So let's start by designing the form. The form contains a multi-line TextBox, contentTextBox, in which the text is searched. The text to search for is entered in another TextBox, searchTextBox, and the replacement text is entered in replaceTextBox. Besides these TextBoxes, the form contains four CheckBoxes and three Buttons. The table below describes all the controls on the form:
Name | Type | Property to Set | Property Value | Description |
contentTextBox | TextBox | Multiline | True | |
Label1 | Label | Text | Search: | This label is not used in the code, so its default name is unchanged |
Label2 | Label | Text | Replace: | This label is not used in the code, so its default name is unchanged |
searchTextBox | TextBox | | | |
replaceTextBox | TextBox | | | |
matchWholeWordCheckBox | CheckBox | Text | Match whole word | |
matchCaseCheckBox | CheckBox | Text | Match case | |
useWildcardsCheckBox | CheckBox | Text | Use Wildcards | |
useRegulatExpressionCheckBox | CheckBox | Text | Use Regular Expressions | |
findButton | Button | Text | Find | |
replaceButton | Button | Text | Replace | |
replaceAllButton | Button | Text | Replace All | |
Once you complete the form design, it should look like the screen shot above.
Writing the Code
Now that you have an idea of the controls placed on the form, let's look at the code, starting with the class-level variables:
// Declare the Regex and Match as class-level variables
// so that "find next" can continue from the previous match
private Regex regex;
private Match match;
// Indicates whether this is the first find
// or a "find next"
private bool isFirstFind = true;
Now let's examine the functionality that is arguably the simplest to understand: Replace All. See the code below:
// Click event handler of replaceAllButton
private void replaceAllButton_Click(object sender, EventArgs e)
{
    Regex replaceRegex = GetRegExpression();
    String replacedString;
    // get the current SelectionStart
    int selectedPos = contentTextBox.SelectionStart;
    // get the replaced string
    replacedString = replaceRegex.Replace(
        contentTextBox.Text, replaceTextBox.Text);
    // Is the text changed?
    if (contentTextBox.Text != replacedString)
    {
        // then replace it
        contentTextBox.Text = replacedString;
        MessageBox.Show("Replacements are made. ", Application.ProductName,
            MessageBoxButtons.OK, MessageBoxIcon.Information);
        // restore the SelectionStart
        contentTextBox.SelectionStart = selectedPos;
    }
    else // inform user if no replacements are made
    {
        MessageBox.Show(String.Format("Cannot find '{0}'. ",
            searchTextBox.Text),
            Application.ProductName, MessageBoxButtons.OK,
            MessageBoxIcon.Information);
    }
    contentTextBox.Focus();
}
The GetRegExpression function returns an instance of the Regex class, built from the text entered by the user and the checkboxes selected on the form. Once we have this instance, we can use its Replace method to make the replacements. Then our job is done.
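One detail of Regex.Replace is worth a short illustration: the replacement string is itself a pattern in which $ introduces substitutions such as $1 or $&. The sketch below is a hedged illustration of that behavior, not the project's code; ReplaceAllSketch and its method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the article's project.
public static class ReplaceAllSketch
{
    // Replaces every match. The replacement may contain substitutions
    // such as $1 (first group) or $& (the whole match).
    public static string ReplaceAll(string text, string pattern, string replacement)
    {
        return new Regex(pattern).Replace(text, replacement);
    }

    // Replaces every match with the replacement taken literally.
    // "$$" is the escape for a single "$" in a replacement pattern.
    public static string ReplaceAllLiteral(string text, string pattern, string replacement)
    {
        return new Regex(pattern).Replace(text, replacement.Replace("$", "$$"));
    }
}
```

For example, ReplaceAll("John Smith", @"(\w+) (\w+)", "$2 $1") swaps the two words, while ReplaceAllLiteral inserts the characters $2 $1 as typed. Which behavior a find and replace dialog should offer is a design choice; the listing above passes replaceTextBox.Text straight to Replace, so substitutions are interpreted.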
Now let's examine the GetRegExpression function. This function is called from most of the methods in this article:
// This function makes and returns a Regex object
// depending on user input
private Regex GetRegExpression()
{
    Regex result;
    String regExString;
    // Get what the user entered
    regExString = searchTextBox.Text;
    if (useRegulatExpressionCheckBox.Checked)
    {
        // If regular expressions checkbox is selected,
        // our job is easy. Just use the text as the pattern
    }
    // wild cards checkbox checked
    else if (useWildcardsCheckBox.Checked)
    {
        // escape everything first, so that characters such as '.'
        // are matched literally, then expand the escaped wildcards
        regExString = Regex.Escape(regExString);
        // multiple characters wildcard (*)
        regExString = regExString.Replace(@"\*", @"\w*");
        // single character wildcard (?)
        regExString = regExString.Replace(@"\?", @"\w");
        // if wild cards selected, find whole words only
        regExString = String.Format("{0}{1}{0}", @"\b", regExString);
    }
    else
    {
        // escape regex metacharacters so the text is matched literally
        regExString = Regex.Escape(regExString);
    }
    // Is whole word check box checked?
    if (matchWholeWordCheckBox.Checked)
    {
        regExString = String.Format("{0}{1}{0}", @"\b", regExString);
    }
    // Is match case checkbox checked or not?
    if (matchCaseCheckBox.Checked)
    {
        result = new Regex(regExString);
    }
    else
    {
        result = new Regex(regExString, RegexOptions.IgnoreCase);
    }
    return result;
}
From the code listing above, it is clear that the GetRegExpression function does most of the important work.
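The same pattern-building logic can be sketched as a pure function that takes the user's choices as parameters instead of reading the form controls, which makes it easy to try outside the form. PatternBuilderSketch, SearchMode and Build are hypothetical names. Note one assumption: in wildcard mode this sketch escapes the input before expanding * and ?, so that other characters such as '.' stay literal.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the article's project.
public static class PatternBuilderSketch
{
    public enum SearchMode { Literal, Wildcards, RegularExpression }

    public static Regex Build(string input, SearchMode mode, bool wholeWord, bool matchCase)
    {
        string pattern;
        switch (mode)
        {
            case SearchMode.RegularExpression:
                // The user typed a pattern; use it unchanged.
                pattern = input;
                break;
            case SearchMode.Wildcards:
                // Assumption: escape first, then turn the escaped
                // wildcards \* and \? into \w* and \w.
                pattern = Regex.Escape(input)
                    .Replace(@"\*", @"\w*")
                    .Replace(@"\?", @"\w");
                // Wildcard searches always match whole words.
                wholeWord = true;
                break;
            default:
                // Plain text: every character is matched literally.
                pattern = Regex.Escape(input);
                break;
        }

        if (wholeWord)
        {
            pattern = @"\b" + pattern + @"\b";
        }

        return new Regex(pattern, matchCase ? RegexOptions.None : RegexOptions.IgnoreCase);
    }
}
```

With these rules, Build("a*", SearchMode.Wildcards, false, false) produces the pattern \ba\w*\b, which matches every word starting with a or A.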
This is all that we need to do to implement the Replace All functionality. Now let's examine how the Find functionality is implemented.
// Click event handler of find button
private void findButton_Click(object sender, EventArgs e)
{
    FindText();
}

// finds the text in searchTextBox in contentTextBox
private void FindText()
{
    // Is this the first time find is called?
    // Then make instances of Regex and Match
    if (isFirstFind)
    {
        regex = GetRegExpression();
        match = regex.Match(contentTextBox.Text);
        isFirstFind = false;
    }
    else
    {
        // match.NextMatch() is also ok, except in Replace
        // In replace as text is changing, it is necessary to
        // find again
        //match = match.NextMatch();
        match = regex.Match(contentTextBox.Text, match.Index + 1);
    }
    // found a match?
    if (match.Success)
    {
        // then select it
        contentTextBox.SelectionStart = match.Index;
        contentTextBox.SelectionLength = match.Length;
    }
    else // didn't find? bad luck.
    {
        MessageBox.Show(String.Format("Cannot find '{0}'. ",
            searchTextBox.Text),
            Application.ProductName, MessageBoxButtons.OK,
            MessageBoxIcon.Information);
        isFirstFind = true;
    }
}
The click event handler of findButton calls the FindText method. FindText is also called from Replace, which is why I made it a separate function instead of writing the code in the event handler itself.
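The "find next" step relies on the Regex.Match(String, Int32) overload, which starts searching at a given index. The sketch below repeats that step in a loop over a plain string, the way repeated clicks on Find would. FindNextSketch and FindAllByStepping are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the article's project.
public static class FindNextSketch
{
    // Returns the start index of each match, found one at a time by
    // restarting the search one character after the previous match.
    public static List<int> FindAllByStepping(string text, string pattern)
    {
        var positions = new List<int>();
        var regex = new Regex(pattern);

        Match match = regex.Match(text);
        while (match.Success)
        {
            positions.Add(match.Index);

            // Same step as FindText: continue from match.Index + 1.
            int next = match.Index + 1;
            if (next > text.Length)
            {
                // Match(text, startat) requires startat <= text.Length.
                break;
            }
            match = regex.Match(text, next);
        }
        return positions;
    }
}
```

Because the search restarts at match.Index + 1 rather than after the end of the match, overlapping occurrences are found too: searching "aaa" for "aa" yields positions 0 and 1, whereas stepping with NextMatch would report only position 0.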
Now the only functionality that remains to explore is Replace. Let's complete that too:
// Click event handler of replaceButton
private void replaceButton_Click(object sender, EventArgs e)
{
    // Make local Regex and Match instances
    Regex regexTemp = GetRegExpression();
    Match matchTemp = regexTemp.Match(contentTextBox.SelectedText);
    if (matchTemp.Success)
    {
        // check if it is an exact match
        if (matchTemp.Value == contentTextBox.SelectedText)
        {
            contentTextBox.SelectedText = replaceTextBox.Text;
        }
    }
    FindText();
}
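The handler above replaces the current selection and then moves on to the next match. For comparison, here is a sketch of a different technique that works on a plain string: the Regex.Replace overload that takes a maximum count and a start position replaces only the first match at or after that position. ReplaceOneSketch and ReplaceNext are hypothetical names, and this is an alternative for illustration, not what the project does.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the article's project.
public static class ReplaceOneSketch
{
    // Replaces only the first match found at or after startAt
    // and returns the whole text with that one replacement made.
    public static string ReplaceNext(string text, string pattern, string replacement, int startAt)
    {
        // count = 1 limits the operation to a single replacement.
        return new Regex(pattern).Replace(text, replacement, 1, startAt);
    }
}
```

For example, ReplaceNext("cat cat cat", "cat", "dog", 1) leaves the first word alone and changes the second one, because the search starts at index 1.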
Before wrapping up the code listing, one small task remains: what to do with the isFirstFind variable? We declared it as a private variable and check its value in FindText to see whether the user is pressing the Find button for the first time. On the first find we set it to false, so that the next find is treated as a find next, and we set it back to true when a search finds no match. Is this enough? Definitely not. The problem is how to tell that the user has finished a search, so that the next one starts from the beginning. The approach I followed is to start a new search whenever the text in searchTextBox or the state of any of the checkboxes changes. This may not be the best approach, but I hope it satisfies most users. See the code listing below:
// TextChanged event handler of searchTextBox
// Set isFirstFind to true, if text changes
private void searchTextBox_TextChanged(object sender, EventArgs e)
{
    isFirstFind = true;
}

// CheckedChanged event handler of matchWholeWordCheckBox
// Set isFirstFind to true, if check box is checked or unchecked
private void matchWholeWordCheckBox_CheckedChanged(object sender, EventArgs e)
{
    isFirstFind = true;
}

// CheckedChanged event handler of matchCaseCheckBox
// Set isFirstFind to true, if check box is checked or unchecked
private void matchCaseCheckBox_CheckedChanged(object sender, EventArgs e)
{
    isFirstFind = true;
}

// CheckedChanged event handler of useWildcardsCheckBox
// Set isFirstFind to true, if check box is checked or unchecked
private void useWildcardsCheckBox_CheckedChanged(object sender, EventArgs e)
{
    isFirstFind = true;
}

// CheckedChanged event handler of useRegulatExpressionCheckBox
// (added so that every checkbox restarts the search, as described above)
private void useRegulatExpressionCheckBox_CheckedChanged(object sender, EventArgs e)
{
    isFirstFind = true;
}
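An alternative to resetting the flag from several event handlers is to compare the search criteria on every find and restart automatically when they differ. The sketch below shows that idea on plain strings. FindSessionSketch and FindNext are hypothetical names, and this is one possible design under those assumptions, not the approach used in the project.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the article's project.
public sealed class FindSessionSketch
{
    private string lastPattern = string.Empty;
    private RegexOptions lastOptions = RegexOptions.None;
    private int nextStart = 0;

    // Returns the index of the next match, or -1 when there are no more.
    // After returning -1, the following call starts again from the beginning.
    public int FindNext(string text, string pattern, RegexOptions options)
    {
        // New criteria start a new search from the top of the text.
        if (pattern != lastPattern || options != lastOptions)
        {
            lastPattern = pattern;
            lastOptions = options;
            nextStart = 0;
        }

        // Match(text, startat) requires startat <= text.Length.
        if (nextStart > text.Length)
        {
            nextStart = 0;
            return -1;
        }

        Match match = new Regex(pattern, options).Match(text, nextStart);
        if (!match.Success)
        {
            nextStart = 0;
            return -1;
        }

        nextStart = match.Index + 1;
        return match.Index;
    }
}
```

The trade-off is that the state lives in one place and no handler can be forgotten, at the cost of constructing a Regex on each call.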
That's all about the code.
Conclusion
You can implement all of this functionality without regular expressions. However, using regular expressions results in much simpler and more maintainable code. This article explores only the features of the Regex class that are needed for find and replace, so a few important methods such as Matches and Split are not covered. And, as I mentioned earlier, this article is not meant to be a reference for regular expressions.
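For readers curious about the two methods named above, here is a minimal sketch of each. MatchesAndSplitSketch and its method names are hypothetical; only the static Regex.Matches and Regex.Split calls come from the framework.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the article's project.
public static class MatchesAndSplitSketch
{
    // Matches returns every match at once; here each run of word characters.
    public static int CountWords(string text)
    {
        return Regex.Matches(text, @"\w+").Count;
    }

    // Split cuts the input wherever the pattern matches;
    // here at commas with optional surrounding whitespace.
    public static string[] SplitOnCommas(string text)
    {
        return Regex.Split(text, @"\s*,\s*");
    }
}
```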
History
- 1st April, 2007 - First version