I present to you a tool that is capable of... Counting Lines!
"Wow!" you say in amazement as you stagger back, trying to regain your balance. That's right, my friends: I am afraid there is no groundbreaking stuff here today.
Having said that, I had to create this tool because I could not easily find anything else out there that would do what I was after. So if you bear with me, you might find this code useful.
There are two parts to this article you may find interesting. The first is the DirectoryLineCounter. This is the heart of the article: a simple class that recursively counts the lines in a subset of the files under a given directory.
The second interesting part of the download is the application that uses the DirectoryLineCounter. This application allowed us to rapidly work out just how much code was contained in the various sections of our repository.
This information was useful to us in identifying where people were creating the most code in our scientific framework. We were hoping to see most of the coding effort going into the science itself, but instead we saw that the applications (GUIs) built on the science framework were where the most lines of code were being recorded.
Line count engine
We were initially using a simple linecounter (grep/script) to give us the total number of lines in our entire repository, but this didn't tell us which areas of the repository contained the most code. So we created this simple class, which recurses into the directories and reports the information back in a structured way.
The DirectoryLineCounter class has two static arrays: a list of directories to ignore (e.g. bin, debug, .cvs, .svn ...) and FileSearchPatterns, the file types to count (e.g. *.cs, *.h, *.vb ...). Having them as static fields was fine for our use because the DirectoryLineCounter was only ever run with the intention of summarizing one directory (and its subdirectories) in a single run. It would be a simple change to make them instance fields and pass them as parameters to the recursive runs.
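To make the two lists concrete, here is a minimal sketch in Python rather than the .NET code from the download. The names `IGNORED_DIRS`, `is_ignored_dir` and `is_counted_file` are hypothetical, the list contents are just the examples given above, and the case-insensitive matching is my assumption, not something the original class is known to do.

```python
from fnmatch import fnmatch

# Hypothetical module-level defaults, playing the role of the static arrays.
IGNORED_DIRS = ["bin", "debug", ".cvs", ".svn"]
FILE_SEARCH_PATTERNS = ["*.cs", "*.h", "*.vb"]


def is_ignored_dir(dir_name, ignored=IGNORED_DIRS):
    """Return True if the directory name is on the ignore list (case-insensitive)."""
    return dir_name.lower() in (name.lower() for name in ignored)


def is_counted_file(file_name, patterns=FILE_SEARCH_PATTERNS):
    """Return True if the file name matches any search pattern (case-insensitive)."""
    return any(fnmatch(file_name.lower(), pattern.lower()) for pattern in patterns)
```

Because both functions take the list as an optional parameter, a caller can override the shared defaults for a single run, which is roughly the "pass them as parameters" change mentioned above.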
Once the ignore list and FileSearchPatterns fields have been set, the DirectoryLineCounter can produce some useful results when you call its countLines() method. Once complete, the DirectoryLineCounter will contain two counts: one for the lines of code found in the directory it was pointing at (DirectoryLines), and another for the total lines found in all subdirectories. The DirectoryLineCounter also contains a list of DirectoryLineCounters that represent all the subdirectories of the initial directory.
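The recursive structure described above could be sketched roughly as follows. This is a Python illustration under my own assumptions, not the class from the download: `directory_lines` mirrors DirectoryLines, while `subdirectory_lines`, `subdirectories` and `count_file_lines` are hypothetical names, and the lists are passed to the constructor instead of being static.

```python
import os
from fnmatch import fnmatch


def count_file_lines(file_path):
    """Count lines in a file; a final line without a newline still counts."""
    with open(file_path, "rb") as handle:
        return sum(1 for _ in handle)


class DirectoryLineCounter:
    """Sketch of a counter that holds its own counts plus one child per subdirectory."""

    def __init__(self, path, patterns, ignored_dirs):
        self.path = path
        self.patterns = [p.lower() for p in patterns]
        self.ignored_dirs = {name.lower() for name in ignored_dirs}
        self.directory_lines = 0     # lines in files directly in this directory
        self.subdirectory_lines = 0  # lines in all nested subdirectories
        self.subdirectories = []     # one DirectoryLineCounter per child directory

    def count_lines(self):
        """Count matching files here, recurse into children, return the grand total."""
        with os.scandir(self.path) as entries:
            ordered = sorted(entries, key=lambda entry: entry.name)
        for entry in ordered:
            name = entry.name.lower()
            if entry.is_dir(follow_symlinks=False):
                if name in self.ignored_dirs:
                    continue
                child = DirectoryLineCounter(entry.path, self.patterns, self.ignored_dirs)
                self.subdirectory_lines += child.count_lines()
                self.subdirectories.append(child)
            elif entry.is_file() and any(fnmatch(name, p) for p in self.patterns):
                self.directory_lines += count_file_lines(entry.path)
        return self.directory_lines + self.subdirectory_lines
```

After `count_lines()` returns, the tree of counters can be walked to see which directories hold the most code, which is the structured result a plain grep total could not give us.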
The last thing that might need explaining is the FilesCompleted event. This event is fired whenever a DirectoryLineCounter has finished counting all the files in its directory, and it passes back the number of files just completed. This was useful for showing the user how far along the process was.
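For readers unfamiliar with this kind of progress event, the idea can be sketched in Python with a plain list of callbacks. Everything here is a stand-in of my own: `FilesCompletedEvent`, `subscribe`, `fire` and `count_with_progress` are hypothetical names, and the sketch counts every file instead of filtering by pattern, to keep the focus on the event.

```python
import os


class FilesCompletedEvent:
    """Minimal stand-in for a .NET-style event: a list of subscriber callbacks."""

    def __init__(self):
        self._handlers = []

    def subscribe(self, handler):
        """Register a callable that accepts the number of files just completed."""
        self._handlers.append(handler)

    def fire(self, files_completed):
        """Notify every subscriber."""
        for handler in self._handlers:
            handler(files_completed)


def count_with_progress(root, files_completed):
    """Count lines under root, firing the event once per directory visited."""
    total_lines = 0
    for dir_path, _dir_names, file_names in os.walk(root):
        for file_name in file_names:
            with open(os.path.join(dir_path, file_name), "rb") as handle:
                total_lines += sum(1 for _ in handle)
        # One notification per directory, carrying that directory's file count.
        files_completed.fire(len(file_names))
    return total_lines
```

A GUI would typically subscribe a handler that adds each reported number to a running total and updates a progress bar against the total file count, if that count is known up front.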
The DirectoryLineCounter was put into a separate project (LineCountEngine) so that it would be easy to build several applications on the same engine. I intended to write a command line application that would also use the LineCountEngine, but this is not likely to happen given the current time constraints.
As for performance, I have no idea what is good, but I can tell you it takes about 10 seconds to summarize our repository of around 650,000 lines. This is not a problem for us.
Line counter application
The line counter application simply makes use of the LineCountEngine. You tell it where to start and press go; the application then builds a directory tree from the information the LineCountEngine reports.
Thanks to Julijan Sribar for the use of his PieChart component. You can see his article here.