Posted 11 Apr 2008

# PowerShell Script for Reviewing Text Shown to Users

A script for extracting string literals from source code for review

It's embarrassing when users see text containing spelling and grammar errors. Error handling code is most likely to contain bad prose because developers are sloppy with code that isn't "supposed" to ever run. Imagine a snapshot of your dialog landing on your boss's desk. The message text reads "If you got here, your #$%@ed." How humiliating for your boss to see that you misspelled "you're".

String tables are supposed to avoid this problem, but error handling strings often get left out. After all, why go to the effort of putting a message string in a table when you know the code will never run? And while it's an exaggeration to say the code never runs, it may run so seldom that it is never seen in testing.

The PowerShell script described in this article searches through a source code tree and extracts string literals that may be visible to users. It tries to separate strings that are prose from strings that are code. The script isn't perfect; a complete solution would require a lot more work than a little script. But it does a good job of finding errors that would otherwise go undetected. You will probably find that your source code has far more typos than you thought.

Other solutions to this problem have been proposed, such as spell checkers that run in the development environment. One advantage of the approach presented here is that the strings can be examined alone. The output could be given to an editor to review, someone with no desire to open thousands of source files. Another advantage is that the strings appear without context, just as the user sees them. It is easy to forget that the user doesn't see the source code we were working on when we inserted a message box. If a message doesn't make sense in the text review report, it probably wouldn't make sense to a user either.

Although the original intention of the script was to find spelling and grammar errors, the script is also a useful tool for code reviews.
If a project has a large amount of redundant code from "clipboard inheritance," this will show up in the text review, particularly if the repetitive code contains distinct typos. The script also makes it evident if a project is constructing HTML, SQL, or JavaScript by string concatenation.

## Using the Code

The script takes one optional argument, the path of the root directory to search. If no argument is provided, the script searches the current working directory. The source code directory is searched recursively. The script contains a list of file extensions that specifies which kinds of files to search.

The script writes to the command line, so you will usually want to redirect the output to a text file.

    PS C:\> .\TextReview.ps1 C:\foo\bar > out.txt

The output lists each file name followed by the string literals that are not filtered out, each with its line number. Files containing no such strings are omitted. You may want to open the file out.txt in Microsoft Word to run its spelling and grammar check on the output.

The script is configured to search C++, C#, VB, ASP.NET, JavaScript, and XML files. You may want to modify this line to change the file extensions you want to search.

    $sourceExtensions = "\.(cs|vb|aspx|resx|cpp|rc|h|js|xml)$"
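
As a rough illustration of how an extension pattern like this could drive a recursive search, here is a minimal sketch. It is not taken from the script itself: the function name `Get-SourceFile` and its `Root` parameter are hypothetical, and the pattern is assumed to end with the `$` end-of-string anchor so that only file name endings are matched.

```powershell
# Hypothetical sketch: select files under a root directory by extension.
# The pattern mirrors the one quoted in the article; the rest is illustrative.
$sourceExtensions = '\.(cs|vb|aspx|resx|cpp|rc|h|js|xml)$'

function Get-SourceFile {
    param([string]$Root = '.')

    # Walk the tree, skip directories, and keep names matching the pattern.
    # -match is case-insensitive by default, so "Foo.CS" is also selected.
    Get-ChildItem -Path $Root -Recurse |
        Where-Object { -not $_.PSIsContainer -and $_.Name -match $sourceExtensions }
}

# List the full path of every matching file under the current directory.
Get-SourceFile | ForEach-Object { $_.FullName }
```

Adding or removing an alternative inside the parentheses changes which files are searched; for example, adding `|config` would include files ending in `.config`.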

You may also want to modify some of the regular expressions used to filter out strings that appear to be source code rather than text intended for human readers.
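
To give a sense of what such filters might look like, the sketch below extracts double-quoted literals from a few lines of text and drops those that resemble code. Every pattern and function name here is an assumption made for illustration; the filters in the actual script may differ considerably.

```powershell
# Hypothetical sketch: pull double-quoted literals out of lines of source text
# and drop the ones that look like code. All patterns and names are illustrative.

# A double-quoted literal, allowing backslash-escaped characters inside it.
$literalPattern = '"((?:[^"\\]|\\.)*)"'

# A literal matching any of these is treated as code rather than prose.
$codePatterns = @(
    '^\s*$',          # empty or whitespace only
    '^[^A-Za-z]*$',   # no letters at all, such as "0.00" or "{0}"
    '^\S+$',          # a single token with no spaces: identifiers, paths, keys
    '^\s*<[^>]+>'     # starts with a markup tag
)

function Test-LooksLikeProse {
    param([string]$Text)
    foreach ($pattern in $codePatterns) {
        if ($Text -match $pattern) { return $false }
    }
    return $true
}

function Get-ProseLiteral {
    param([string[]]$Lines)
    for ($i = 0; $i -lt $Lines.Count; $i++) {
        foreach ($m in [regex]::Matches($Lines[$i], $literalPattern)) {
            $text = $m.Groups[1].Value
            if (Test-LooksLikeProse $text) {
                # Emit the 1-based line number followed by the literal text.
                "{0}: {1}" -f ($i + 1), $text
            }
        }
    }
}

$sample = @(
    'string key = "UserName";',
    'MessageBox.Show("The file could not be opened.");',
    'string html = "<b>bold</b>";'
)
Get-ProseLiteral -Lines $sample
# prints: 2: The file could not be opened.
```

Filters like these trade recall for a shorter report: a one-word message such as "Error" would be discarded by the single-token rule, so the list is worth tuning against your own code base. The sketch also ignores C# verbatim strings and VB-style doubled quotes.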

If this is the first PowerShell script you have run on your computer, you will need to set your execution policy to allow scripts to run.
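
One way to do this is sketched below. `RemoteSigned` is only one commonly used policy and is an assumption here, not a setting the article prescribes; the `-Scope` parameter requires PowerShell 2.0 or later, and changing the policy for the whole machine instead needs an elevated prompt.

```powershell
# Show the execution policy currently in effect.
Get-ExecutionPolicy

# Allow locally written scripts to run for the current user only.
# RemoteSigned is an assumed choice; pick the policy your organization permits.
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```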

## Points of Interest

This script began as a Perl script for extracting strings from MFC code. I've since rewritten it in PowerShell and now use it mostly on ASP.NET and WinForms projects. The file extensions and pattern filters have had to evolve as the script has been used with new languages.

## History

• 11th April, 2008: Initial post

## Share

President, John D. Cook Consulting, United States
I work in the areas of applied mathematics, data analysis, and data privacy.

Check out my blog or send me a note.