Source Code Uncommentor in C#

15 Sep 2009 | CPOL | 4 min read
One of the first C# applications to remove comments across multiple C-style languages (C, C++, Java and C#)

Introduction

Are you a developer who finds it easier to read code than comments, only to discover how difficult the code is to analyse and how simple a function or procedure turns out to be? Perhaps you have wanted to re-document 20 pages (or more) of code from scratch, knowing how tedious it is to remove comments line by line. Or maybe you are pondering how to "water down" your code before transmitting it, to reduce network transfer times.

What happens to the comments in your code when you compile your program? This article offers some insight into these questions, along with a tool that strips existing comments from an ASCII source code file.

How it works

Compilers do not interpret comments; they simply skip over them. Left unmarked in the plain-text source, however, commentary would cause errors during compilation. To avoid this, most C-style languages use /* */ or // to denote comments. These tokens tell the compiler or interpreter not to "read" what comes after:

int x;    //commentary about an unknown alien x, in one line

Or

int x;    /* Tell me more about the 
    stars and the moon, in an essay */

Ever wondered what happens to those pesky comments the moment you hit the compile button? They get dumped! Well, I mean they stay in the source file and never reach the compiled output. Surprised? As mentioned earlier, comments are not for the compiler! Having said that, it may be possible to store comments in a binary's metadata section, although I don't know of a compiler that implements this, and it would likely come with a performance trade-off.

Take this Java code snippet, for example:

Java
/**
 * {@link Class#getSimpleName()} is not GWT compatible yet, so we
 * provide our own implementation.
 */
@VisibleForTesting
static String simpleName(Class<?> clazz) {
    String name = clazz.getName();
    // we want the name of the inner class all by its lonesome
    int start = name.lastIndexOf('$');
    //if this isn't an inner class, just find the start of the
    // top level class name.
    if (start == -1) {
        start = name.lastIndexOf('.');
    }
    return name.substring(start + 1);
}

When you compile the code, the compiler sees it as

Java
@VisibleForTesting
static String simpleName(Class<?> clazz) {
    String name = clazz.getName();

    int start = name.lastIndexOf('$');

    if (start == -1) {
        start = name.lastIndexOf('.');
    }

    return name.substring(start + 1);
}

as the comments are stripped on the fly. That is all the compiler needs to generate object and/or machine code! Comments are always removed before compilation proper, and the step is usually invisible to developers.

The application I present here implements this functionality. Given a text source file, it strips out the comments, leaving compilable code behind. This comes in handy when you wish to re-document your own or someone else's code without having to remove the comments manually, line by line. In the example screenshot, all single-line, multiline and even Javadoc comments are removed. Likewise, in C#, XML comments are also removed.

Algorithm Overview

The basic rule is this: when a single-line comment token (//) is found in a line of code, the program should stop reading until a newline ('\n') is encountered, then resume with the next line.

When a multiline comment token (/*) is found, the program should stop processing until a */ is found. In other words, a '\n' ends a single-line comment and a */ ends a multiline comment; either one returns the state to normal.
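To make the two rules concrete, here is a minimal sketch of them as a three-state scanner over a whole string. It is not the code from the Uncommenter class: the class name `CommentRuleSketch` and the method `Strip` are made up for illustration, it walks the text character by character rather than line by line, and it deliberately ignores string literals.

```csharp
using System.Text;

// Hypothetical helper for illustration only; not the article's Uncommenter class.
public static class CommentRuleSketch
{
    private enum State { Normal, LineComment, BlockComment }

    // Applies the two rules to a whole source string.
    // Assumption: string literals are NOT handled in this sketch.
    public static string Strip(string source)
    {
        var output = new StringBuilder();
        var state = State.Normal;

        for (int i = 0; i < source.Length; i++)
        {
            char c = source[i];
            char next = i + 1 < source.Length ? source[i + 1] : '\0';

            switch (state)
            {
                case State.Normal:
                    if (c == '/' && next == '/') { state = State.LineComment; i++; }
                    else if (c == '/' && next == '*') { state = State.BlockComment; i++; }
                    else output.Append(c);
                    break;

                case State.LineComment:
                    // A newline returns the state to normal; the newline itself is kept.
                    if (c == '\n') { state = State.Normal; output.Append(c); }
                    break;

                case State.BlockComment:
                    // "*/" returns the state to normal; nothing inside is kept.
                    if (c == '*' && next == '/') { state = State.Normal; i++; }
                    break;
            }
        }
        return output.ToString();
    }
}
```

Calling `CommentRuleSketch.Strip("int x; // note\nint y;")` keeps the code and the newline but drops the commentary.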

In this implementation, a StreamReader is employed to read the ASCII source code. Because the code is processed line by line using the ReadLine() method, detecting and handling comment delimiters becomes slightly more difficult. Many compilers written in C, such as gcc, parse the code character by character for performance and optimisation. However, what we are developing is nothing close to a full-fledged compiler, so line-by-line processing should be adequate.

I was able to keep it down to two methods: the main method doUncomment() and an internal method that handles string literals. All methods are static, so there is no need to create instances. For more information, please refer to the Uncommenter class.
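The extra difficulty of working line by line is that a multiline comment can start on one line and end on another, so some state has to survive between calls to ReadLine(). The sketch below shows one way this could be done. It assumes a hypothetical class `LineByLineSketch` that takes any `TextReader` (a StreamReader is one), it is not the actual doUncomment() implementation, and it does not handle string literals.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical sketch; the real Uncommenter class may work differently.
public static class LineByLineSketch
{
    public static string Strip(TextReader reader)
    {
        var output = new StringBuilder();
        bool inBlock = false;   // true while inside /* ... */, carried across lines
        string line;

        while ((line = reader.ReadLine()) != null)
        {
            var kept = new StringBuilder();
            int pos = 0;

            while (pos < line.Length)
            {
                if (inBlock)
                {
                    int end = line.IndexOf("*/", pos, StringComparison.Ordinal);
                    if (end < 0) break;          // comment continues on the next line
                    inBlock = false;
                    pos = end + 2;
                    continue;
                }

                int lineStart = line.IndexOf("//", pos, StringComparison.Ordinal);
                int blockStart = line.IndexOf("/*", pos, StringComparison.Ordinal);

                if (lineStart < 0 && blockStart < 0)
                {
                    kept.Append(line, pos, line.Length - pos);   // no comment left
                    break;
                }
                if (blockStart < 0 || (lineStart >= 0 && lineStart < blockStart))
                {
                    kept.Append(line, pos, lineStart - pos);     // rest of line is a comment
                    break;
                }

                kept.Append(line, pos, blockStart - pos);
                inBlock = true;
                pos = blockStart + 2;
            }

            // Every input line produces one output line, so line numbers are preserved.
            output.Append(kept.ToString()).Append('\n');
        }
        return output.ToString();
    }
}
```

Passing `new StreamReader(path)` or `new StringReader(text)` works equally well, since both derive from TextReader.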

Using the code

To use the code, insert the directive:

C#
using UcommenterCS;

Then simply call the static method to do the work. For example:

C#
Uncommenter.doUncomment("src.cpp");	//specify full path

That's it.

If, however, you wish to run it standalone "out of the box", or would simply like to try it out, I have included a compiled binary in the /bin folder. To use it, issue the following command:

UcommenterCS <source.cpp/c/cs/java/h/js>

Parsing capabilities

Comment delimiters that appear inside string literals must not be treated as comments. This application correctly ensures that they are not: it ignores comment delimiters between an opening " and a closing ".
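As an illustration of the idea, the following sketch finds the first comment delimiter on a line that is not inside a double-quoted string. The names `StringAwareSketch` and `FindCommentStart` are hypothetical and not taken from the Uncommenter class. The sketch assumes strings do not span lines and does not cover character literals or C# verbatim strings (@"...").

```csharp
// Hypothetical helper for illustration; not the article's internal method.
public static class StringAwareSketch
{
    // Returns the index of the first "//" or "/*" that lies outside a
    // "..." literal, or -1 if the line contains no such delimiter.
    public static int FindCommentStart(string line)
    {
        bool inString = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];

            if (inString)
            {
                if (c == '\\') i++;                   // skip the escaped character
                else if (c == '"') inString = false;  // closing quote
            }
            else if (c == '"')
            {
                inString = true;                      // opening quote
            }
            else if (c == '/' && i + 1 < line.Length &&
                     (line[i + 1] == '/' || line[i + 1] == '*'))
            {
                return i;                             // genuine comment delimiter
            }
        }
        return -1;
    }
}
```

For a line such as `string s = "http://x"; // real`, the `//` inside the URL is skipped and only the trailing comment is reported.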

Future functionality includes detecting and warning against unterminated block comments and string literals with the option of breaking execution should they be found.
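One possible shape for such a check is sketched below. This is speculative and not part of the current tool. The class `TerminationCheckSketch` and its `Result` enum are invented for the example, and it assumes that a string literal must close before the end of its line.

```csharp
using System;

// Speculative sketch of the planned check; all names here are hypothetical.
public static class TerminationCheckSketch
{
    public enum Result { Ok, UnterminatedBlockComment, UnterminatedString }

    public static Result Check(string source)
    {
        int i = 0;
        while (i < source.Length)
        {
            char c = source[i];
            bool hasNext = i + 1 < source.Length;

            if (c == '"')
            {
                // Assumption: a string literal must close on the same line.
                i++;
                bool closed = false;
                while (i < source.Length && source[i] != '\n')
                {
                    if (source[i] == '\\') { i += 2; continue; }   // skip escaped char
                    if (source[i] == '"') { closed = true; i++; break; }
                    i++;
                }
                if (!closed) return Result.UnterminatedString;
            }
            else if (c == '/' && hasNext && source[i + 1] == '*')
            {
                int end = source.IndexOf("*/", i + 2, StringComparison.Ordinal);
                if (end < 0) return Result.UnterminatedBlockComment;
                i = end + 2;
            }
            else if (c == '/' && hasNext && source[i + 1] == '/')
            {
                // Quotes inside a single-line comment are not strings; skip the line.
                int newline = source.IndexOf('\n', i);
                if (newline < 0) return Result.Ok;
                i = newline + 1;
            }
            else
            {
                i++;
            }
        }
        return Result.Ok;
    }
}
```

A caller could warn on any result other than `Result.Ok`, or stop processing altogether, depending on the option chosen.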

Because I'm not a compiler linguist, I am not able to think of all the possible scenarios in which the code may fail. However, if you are up for a challenge, you are welcome to attempt to break my code. If that happens, please do let me know.

History

  • 1st version - 9th July
  • 2nd update (repackaged under different class name and namespace and changed to static methods)   - 14th September

I plan to write a Windows Forms version in the not too distant future. Also in the works is a Java version.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
New Zealand
