
ParsedCommandLine

15 Mar 2013, CPOL
An enumeration-based command line parser.

Background

I write a lot of console applications so parsing command lines is very important to me.

From Wikipedia:

Arguments
A command-line argument or parameter is an argument sent to a program being called.
In principle a program can take many command-line arguments, the meaning and importance of which depend entirely upon the program.

Command-line option
A command-line option or simply option (also known as a flag or switch) modifies the operation of a command; the effect is determined by the command's program.

In this article I'll use the terms "parameter" and "switch" for these concepts. If the application only uses parameters, then splitting them up is so easy that the system will do that for you (the args parameter to the Main method). But difficulty arises for applications that support switches.

For example, if a user types command / switchname = "the value" parameter, the system will split the four parts of the switch apart rather than keeping them together. If you try to concatenate them back together you'll still have trouble, because the original quoting and spacing have been lost. For this reason, when writing a command line parser, it's better to start fresh with the command line in an unprocessed state. I'll talk a little more about this later. Of course, you could insist that users not add whitespace where it doesn't belong, but it's easy enough to allow it, so I'd rather do so. And there's some precedent for this as well; I just tried DIR commands on Windows and OpenVMS; they both allowed whitespace after the slash and OpenVMS allowed whitespace around the equals sign.
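
To make the problem concrete, here is a small sketch of what I'd expect the args array to hold for that command, and what a naive re-join produces. The array contents are an assumption based on the usual whitespace-and-quotes splitting, and the class and member names are only for illustration.

```csharp
using System;

public static class ArgsSplitSketch
{
    // What Main's args would be expected to contain for:
    //   command / switchname = "the value" parameter
    // (an assumption based on the usual whitespace-and-quotes splitting).
    public static readonly string[] Args =
        { "/", "switchname", "=", "the value", "parameter" };

    // Gluing the pieces back together with single spaces drops the quotes,
    // so "the value" can no longer be told apart from two separate tokens.
    public static string Rejoin()
    {
        return String.Join(" ", Args);
    }
}
```

Rejoin() yields / switchname = the value parameter, which is not the line the user typed.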

There are many command line parser articles available on this site. The following is just a sample that a simple search found:

  • Commander - A Command Parser
  • C#/.NET Command Line Argument Parser Reloaded
  • C# command line parsing
  • Advanced command line parser class for .NET
  • C#/.NET Command Line Arguments Parser
  • Intelligent Command Line Parser
  • Simple Command Line Parser
  • CCmdLine - A command line parser
  • Lightweight C# Command Line Parser
  • Powerful and simple command line parsing in C#
  • Automatic Command Line Parsing in C#
  • Yet Another Command Line Parser
  • Command line parser

I haven't read all of those, and only glanced at some, but one conclusion one may draw is that although many people want a command line parser, there is no consensus on what a command line parser should and shouldn't do. There are also what might be considered "cultures" of command line parsers -- the primary differences being in how switches are named and introduced and how their values are specified.

As with most things, I prefer to roll my own rather than use code someone else wrote, particularly when the available code doesn't do exactly what I want. If nothing else, it's good exercise, and I recommend it to you as well. Unfortunately, my quest for my own command line parser has been long and frustrating because every time I began thinking about it I kept coming up with more and more features and it just became impossible. What I finally had to do was define what I needed now and not think about situations I might have in the future.

Introduction

The code I'm presenting is my implementation of a simple command line parser that I use in a few different applications. It supports a subset of what I am used to in OpenVMS.

Major features:

  • Honor quotes when tokenizing
  • Parameters should be returned in an IList<string>
  • Switches are introduced by a slash (/)
  • Case sensitivity of switch names should be optional
  • Switch names may be abbreviated as long as they remain unique
  • Values for switches are introduced by an equals sign (=)
  • The value of a switch may be a list of values within parentheses (())
  • Switches should be returned in a Dictionary
  • If a switch is repeated, only the last (right-most) appearance will be used

Features I opted to leave out:

  • Assignment of names to parameters
  • Conversion of values from string to other types
  • Range checking
  • Indicating which parameters and switches are required
  • Conversion of comma-separated values to lists

I feel that these features are better left to the application to provide and enforce.
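
As a rough illustration of what "left to the application" can look like, here is a minimal sketch of converting and range-checking a switch value, and of splitting a list value, on the application side. None of this is part of the parser; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class SwitchValueSketch
{
    // Convert a switch value to an int and range-check it. The parser only
    // ever hands back strings (or null), so null or non-numeric text falls
    // back to the supplied value.
    public static int ToIntInRange(string value, int min, int max, int fallback)
    {
        int n;
        if (value == null || !Int32.TryParse(value, out n)) return fallback;
        if (n < min || n > max)
            throw new ArgumentOutOfRangeException("value", n, "Switch value out of range");
        return n;
    }

    // Split a list value such as "1, 2,3" into its trimmed items.
    public static IList<string> ToList(string value)
    {
        List<string> items = new List<string>();
        if (String.IsNullOrEmpty(value)) return items;
        foreach (string item in value.Split(','))
        {
            items.Add(item.Trim());
        }
        return items;
    }
}
```

An application would call something like ToIntInRange on the string it reads from the Switch collection, choosing its own limits and fallback.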

Ordinarily the value of a switch will default to null, but another default value may be specified.

Additionally, I need it to work in more than the usual situation of parsing the command line from when the application is run; it needs to be able to parse commands of interactive applications (applications like ftp).

I also added the rudimentary ability for the class to enumerate the supported switches with a description and default value (if specified).

Using the Code

As is my wont, I opted to use enumerations to define the names for switches and I made the class generic on enumerations. Unfortunately this means that switch names must be valid identifiers, so they can't begin with a digit, but that shouldn't be a major limitation. The included demo program uses the following enumeration.

[PIEBALD.Attributes.InvariantCultureIgnoreCaseAttribute]
private enum FizzBuzz
{
  [System.ComponentModel.DescriptionAttribute("Display Fizz this often")
  ,System.ComponentModel.DefaultValueAttribute("3")]
  Fizz
,
  [System.ComponentModel.DescriptionAttribute("Display Buzz this often")
  ,System.ComponentModel.DefaultValueAttribute("5")]
  Buzz
}

The attributes are not required; I included them to show these optional features of the parser. In this example, the InvariantCultureIgnoreCaseAttribute instructs the parser to be case insensitive when parsing switch names.

Once you have defined the enumeration you can use it to instantiate a parser and parse a command line.

PIEBALD.Types.IParsedCommandLine<FizzBuzz> args =
new PIEBALD.Types.ParsedCommandLine<FizzBuzz>
(
  PIEBALD.Lib.LibSys.CommandLineWithoutExecutableName
,
  PIEBALD.Types.ParsedCommandLine.Options.ExpandEnvironmentVariables
) ;

There are other overloads of the constructor, and in many cases no parameters are required.

Once the string is successfully parsed, the parameters will be in the Parameter IList<string> and the switches will be in the Switch Dictionary<T,string>.

Access to the parameters and switches is then accomplished by use of indexing on the exposed collections:

args.Parameter [ 0 ]
args.Switch [ FizzBuzz.Fizz ]

I also wrote indexers that allow the following usage, but I suspect I'll receive comments about them breaking some sort of "rule", so don't use them.

args [ 0 ]
args [ FizzBuzz.Fizz ]

Every switch will be present so you don't need to check. The value of a switch will default to null if the user didn't include that switch in the command and the enumeration doesn't specify some other default.

Acknowledgement

The provided code does contain some code that was written by someone else. In ReadOnlyDictionary.cs you will find code which I found at
http://stackoverflow.com/questions/678379/is-there-a-read-only-generic-dictionary-available-in-net
All I did was put it in my own namespace and alter the formatting a little; I couldn't improve on it.

This class is then used by the following Extension Method to return a read-only interface to the Dictionary containing the Switches.

public static System.Collections.Generic.IDictionary<TKey, TValue>
AsReadOnly<TKey, TValue>
(
  this System.Collections.Generic.IDictionary<TKey, TValue> Fish
)
{
  return ( new PIEBALD.Types.ReadOnlyDictionary<TKey, TValue> ( Fish ) ) ;
}

A number of other things to cover before I get into the main code

First off, I'll simply list out my other articles and tips that this code relies upon:

  • Rive
  • A String.StartsWith that uses a StringComparer
  • Dictionary<string,T>.BestMatch

The Tips have alternatives that you may want to peruse as well.
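
Since those tips aren't reproduced in this article, here is a minimal sketch of the kind of unique-prefix lookup that switch abbreviation needs. It is not the BestMatch code from the tip; the name UniquePrefixMatch and its behavior (an exact match wins, otherwise exactly one key must start with the abbreviation) are my assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class PrefixMatchSketch
{
    // Returns the key that equals the abbreviation, or failing that the one
    // and only key that starts with it. Throws when nothing matches or when
    // more than one key starts with the abbreviation.
    public static string UniquePrefixMatch(
        IEnumerable<string> keys, string abbreviation, StringComparison comparison)
    {
        string found = null;
        int count = 0;

        foreach (string key in keys)
        {
            if (String.Equals(key, abbreviation, comparison)) return key;

            if (key.StartsWith(abbreviation, comparison))
            {
                found = key;
                count++;
            }
        }

        if (count == 1) return found;

        throw new ArgumentException(
            (count == 0 ? "No match for: " : "Ambiguous abbreviation: ") + abbreviation);
    }
}
```

With the keys Fizz and Buzz and a case-insensitive comparison, "f" resolves to Fizz; with Fizz and Fizzle, "Fi" is ambiguous and throws.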

Interfaces

I defined two interfaces: one that supports parameters only and one that supports both parameters and switches.

public interface IParsedCommandLine : System.IDisposable
{
  int Count { get ; }

  System.Collections.Generic.IList<string> Parameter { get ; }
  string this [ int Index ] { get ; }
}

public interface IParsedCommandLine<T> : IParsedCommandLine
{
  new string this [ int Index ] { get ; } /* Just a work-around */

  System.Collections.Generic.IDictionary<T,string> Switch { get ; }
  string this [ T Key ] { get ; }
}

EmptyEnum

EmptyEnum is an enumeration with no members that can be used when your application doesn't use switches:

public enum EmptyEnum {}

Options

The following options may be provided to the constructors. These options affect the values of switches and parameters.
The default is None.
If ExpandEnvironmentVariables is specified then each value will have System.Environment.ExpandEnvironmentVariables called on it (I use environment variables a lot).
If ToUpper is specified, then the values will be uppercased.

[System.FlagsAttribute()]
public enum Options
{
  None = 0
,
  ExpandEnvironmentVariables = 1
,
  ToUpper = 2
}
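
Here is a small sketch of how flags like these can be combined and tested. The enum is repeated so the block stands alone, and the Apply helper is hypothetical; it expands first and uppercases second, which is the order the constructor shown later uses.

```csharp
using System;

public static class OptionsSketch
{
    [Flags]
    public enum Options
    {
        None = 0,
        ExpandEnvironmentVariables = 1,
        ToUpper = 2
    }

    // Apply the requested options to a single value:
    // expand environment variables first, then uppercase.
    public static string Apply(string value, Options options)
    {
        if ((options & Options.ExpandEnvironmentVariables) == Options.ExpandEnvironmentVariables)
        {
            value = Environment.ExpandEnvironmentVariables(value);
        }

        if ((options & Options.ToUpper) == Options.ToUpper)
        {
            value = value.ToUpper();
        }

        return value;
    }
}
```

Because the enumeration is marked with FlagsAttribute, both options can be requested at once with Options.ExpandEnvironmentVariables | Options.ToUpper.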

CommandLineWithoutExecutableName

As mentioned earlier, the system can correctly parse simple command lines (those with no switches), but trying to parse switches from the args parameter to Main will lead to other problems unless you can be sure that the user adheres to some rules (such as no SPACEs between the tokens of a switch). So what we want to do is start with an unsullied copy of the command line. In .NET this can be retrieved with the System.Environment.CommandLine property. However, unlike the args parameter, the full command line begins with the name of the application, and in most situations we want to remove it. (Aside: classic C includes the application name in the args parameter, but this is not the case in C# and .NET.)

So what I do to remove the application name is rive the command line after the application name and keep the rest.

public static string
CommandLineWithoutExecutableName
{
  get
  {
    System.Collections.Generic.IList<string> cmd =
      System.Environment.CommandLine.Rive ( 2 , Option.HonorQuotes | Option.HonorEscapes ) ;

    return ( cmd.Count > 1 ? cmd [ 1 ] : System.String.Empty ) ;
  }
}
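
If you don't want to pull in Rive, the same idea can be sketched by hand: skip the first token, treating a quoted stretch as part of it, and keep whatever follows. This is only an approximation of the property above. The helper name StripFirstToken is hypothetical and, unlike the Rive call, it does not honor escapes.

```csharp
using System;

public static class ExecutableNameSketch
{
    // Returns the command line with its first token (normally the
    // executable name) removed. Whitespace inside double quotes does
    // not end the token. Escape sequences are NOT handled.
    public static string StripFirstToken(string commandLine)
    {
        if (String.IsNullOrEmpty(commandLine)) return String.Empty;

        int i = 0;

        // Skip any leading whitespace.
        while (i < commandLine.Length && Char.IsWhiteSpace(commandLine[i])) i++;

        bool inQuotes = false;

        // Walk to the first whitespace that is outside quotes.
        while (i < commandLine.Length)
        {
            char c = commandLine[i];

            if (c == '"')
            {
                inQuotes = !inQuotes;
            }
            else if (Char.IsWhiteSpace(c) && !inQuotes)
            {
                break;
            }

            i++;
        }

        return commandLine.Substring(i).TrimStart();
    }
}
```

In an application the input would be System.Environment.CommandLine; an executable path containing spaces survives as long as it is quoted.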

StringComparerAttribute

The provided enumeration may be decorated with a StringComparerAttribute to indicate how to compare switch names for equality. There is an abstract class and a derived class for each member of System.StringComparer.

[System.AttributeUsageAttribute(System.AttributeTargets.Enum, Inherited = true, AllowMultiple = false)]
public abstract class StringComparerAttribute : System.Attribute
{
  public System.Collections.Generic.IEqualityComparer<string> Comparer { get ; private set ; }

  protected StringComparerAttribute
  (
    System.Collections.Generic.IEqualityComparer<string> Comparer
  )
  {
    this.Comparer = Comparer ;

    return ;
  }
}

public sealed class InvariantCultureIgnoreCaseAttribute : StringComparerAttribute
{
  public InvariantCultureIgnoreCaseAttribute
  (
  )
  : base
  (
    System.StringComparer.InvariantCultureIgnoreCase
  )
  {
    return ;
  }
}
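
To see why the comparer matters, here is a minimal sketch of a Dictionary keyed by switch name and built with a caller-supplied comparer, much as the parser hands the attribute's Comparer to its own Dictionary. The class and method names are only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ComparerSketch
{
    // Build a name-to-value map whose key lookups use the given comparer.
    public static Dictionary<string, int> BuildMap(IEqualityComparer<string> comparer)
    {
        Dictionary<string, int> map = new Dictionary<string, int>(comparer);

        map["Fizz"] = 3;
        map["Buzz"] = 5;

        return map;
    }
}
```

Built with System.StringComparer.InvariantCultureIgnoreCase the map finds "FIZZ"; built with System.StringComparer.Ordinal it does not.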

SwitchDefinition

The members of the specified enumeration will be stored in a Dictionary<string,SwitchDefinition> to ease the process of matching tokens to switches, assigning default values, and providing help text if requested.

private sealed class SwitchDefinition
{
  public T      Identifier { get ; private set ; }
  public string HelpText   { get ; private set ; }
  public string Default    { get ; private set ; }

  public SwitchDefinition
  (
    T      Identifier
  ,
    string HelpText
  ,
    string Default
  )
  {
    this.Identifier = Identifier ;
    this.HelpText   = HelpText   ;
    this.Default    = Default    ;

    return ;
  }
}

This may be a good place to remind you that this class only deals with string values and to point out that the default value stored here and assigned to switches is a string. So if you provide a non-string to the DefaultValueAttribute it will have had ToString called upon it. Generally, you are better off either using only strings to specify default values, or not specifying a default value and testing for null.
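
A quick sketch of what that conversion does to non-string defaults. The helper name AsStoredDefault is hypothetical; it simply mirrors the null check and ToString call used when the switch definitions are built.

```csharp
using System;
using System.ComponentModel;

public static class DefaultValueSketch
{
    // null stays null; anything else is turned into text with ToString().
    public static string AsStoredDefault(DefaultValueAttribute attribute)
    {
        return attribute.Value == null ? null : attribute.Value.ToString();
    }
}
```

An int default of 3 is stored as the string "3" and a bool default of true as "True", so the application still has to convert the value back if it wants anything other than text.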

ParsedCommandLine<T>

Static members

The Regular Expression will be used to tokenize the command line. The Dictionary will hold the list of supported switches.

public partial class ParsedCommandLine<T> : IParsedCommandLine<T>
{
  private static readonly System.Text.RegularExpressions.Regex reg =
  new System.Text.RegularExpressions.Regex
  (
    @"(^|\G)\s*" +
    @"(?:(?:(?:/\s*(?'Name'\w+)(?:\s*(?'HasValue'=)\s*(?:(?:\((?'Value'[^)]*?)\))" +
    @"|(?:""(?'Value'[^""]*?)(?:""|$))|(?'Value'.+?(?=$|\s|/))))?)" +
    @"|(?:(?:""(?'Value'[^""]*?)(?:""|$))|(?'Value'\S+))))"
  ) ;

  private static readonly System.Collections.Generic.Dictionary<string,SwitchDefinition> map ;

  ...
}

In the Regular Expression, the slash (/) and the equals sign (=) are the characters that introduce a switch and its value. You can change those if you like, but you'd better test the results. For help in understanding and testing Regular Expressions I recommend:
Regular Expression Language - Quick Reference
RegexTester
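
One cheap way to test the pattern is to run it over a few sample command lines and look at the Name and Value groups. The sketch below repeats the article's pattern so that it stands alone; the Tokenize helper and its use of KeyValuePair are my own scaffolding, not part of the class.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TokenizeSketch
{
    // The article's pattern, repeated here so the sketch is self-contained.
    private static readonly Regex Tokenizer = new Regex
    (
        @"(^|\G)\s*" +
        @"(?:(?:(?:/\s*(?'Name'\w+)(?:\s*(?'HasValue'=)\s*(?:(?:\((?'Value'[^)]*?)\))" +
        @"|(?:""(?'Value'[^""]*?)(?:""|$))|(?'Value'.+?(?=$|\s|/))))?)" +
        @"|(?:(?:""(?'Value'[^""]*?)(?:""|$))|(?'Value'\S+))))"
    );

    // Returns one Name/Value pair per token; Name is empty for a parameter.
    public static List<KeyValuePair<string, string>> Tokenize(string commandLine)
    {
        List<KeyValuePair<string, string>> tokens =
            new List<KeyValuePair<string, string>>();

        foreach (Match m in Tokenizer.Matches(commandLine))
        {
            tokens.Add(new KeyValuePair<string, string>(
                m.Groups["Name"].Value, m.Groups["Value"].Value));
        }

        return tokens;
    }
}
```

For the input /f=2 a "b c" this gives three tokens: a switch named f with value 2, then the parameters a and b c.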

The static constructor handles validating and interpreting the provided enumeration. First we have to ensure that the provided type is an enumeration. Then we can query the attributes on the type for a StringComparerAttribute; if one is present we can pass it to the constructor of the Dictionary.

static ParsedCommandLine
(
)
{
  System.Type typ = typeof(T) ;

  if ( !typ.IsEnum )
  {
    throw ( new System.Exception ( "The generic type T must be an enumeration" ) ) ;
  }

  PIEBALD.Attributes.StringComparerAttribute[] comparers =
    (PIEBALD.Attributes.StringComparerAttribute[]) typ.GetCustomAttributes
    ( typeof(PIEBALD.Attributes.StringComparerAttribute) , false ) ;

  switch ( comparers.Length )
  {
    case 0 :
    {
      map = new System.Collections.Generic.Dictionary<string,SwitchDefinition>() ;

      break ;
    }

    case 1 :
    {
      map = new System.Collections.Generic.Dictionary<string,SwitchDefinition>
      (
        comparers [ 0 ].Comparer
      ) ;

      break ;
    }

    default:
    {
      throw ( new System.Exception
      (
        "The specified enum has more than one PIEBALD.Attributes.StringComparerAttribute"
      ) ) ;
    }
  }

The members of an enumeration are implemented as public static fields, so we can use Reflection to discover them. We can then query the fields' Attributes for default values and help text (description).

foreach
(
  System.Reflection.FieldInfo fi
in
  typ.GetFields
  (
    System.Reflection.BindingFlags.Public
  |
    System.Reflection.BindingFlags.Static
  )
)
{
  System.ComponentModel.DescriptionAttribute[] descriptions =
    (System.ComponentModel.DescriptionAttribute[]) fi.GetCustomAttributes
    ( typeof(System.ComponentModel.DescriptionAttribute) , false ) ;

  if ( descriptions.Length > 1 )
  {
    throw ( new System.Exception ( System.String.Format
    (
      "Switch {0} has more than one System.ComponentModel.DescriptionAttribute"
    ,
      fi.Name
    ) ) ) ;
  }

  System.ComponentModel.DefaultValueAttribute[] defaults =
    (System.ComponentModel.DefaultValueAttribute[]) fi.GetCustomAttributes
    ( typeof(System.ComponentModel.DefaultValueAttribute) , false ) ;

  if ( defaults.Length > 1 )
  {
    throw ( new System.Exception ( System.String.Format
    (
      "Switch {0} has more than one System.ComponentModel.DefaultValueAttribute"
    ,
      fi.Name
    ) ) ) ;
  }

We can then instantiate a SwitchDefinition and put it into the map (Dictionary) of enumeration member names to switches.

    map [ fi.Name ] = new SwitchDefinition
    (
      (T) fi.GetValue ( null )
    ,
      ( descriptions.Length == 0 || descriptions [ 0 ].Description == null )
      ? null
      : descriptions [ 0 ].Description
    ,
      ( defaults.Length == 0 || defaults [ 0 ].Value == null )
      ? null
      : defaults [ 0 ].Value.ToString()
    ) ;
  }

  return ;
}
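
If the Reflection steps in the static constructor are hard to follow in pieces, here is a condensed, stand-alone sketch of the same idea: walk the public static fields of an enumeration and read each member's DescriptionAttribute. The Sample enumeration and the Describe helper are hypothetical, and the sketch uses Attribute.GetCustomAttribute rather than the array-returning call in the real code.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Reflection;

public static class EnumReflectionSketch
{
    public enum Sample
    {
        [Description("Display Fizz this often")]
        Fizz,
        Buzz
    }

    // Maps each member name of an enumeration to its Description text
    // (null when the member has no DescriptionAttribute).
    public static Dictionary<string, string> Describe(Type enumType)
    {
        if (!enumType.IsEnum)
        {
            throw new ArgumentException("Type must be an enumeration", "enumType");
        }

        Dictionary<string, string> result = new Dictionary<string, string>();

        // Enumeration members are public static fields; the backing
        // value field is an instance field and is not returned here.
        foreach (FieldInfo fi in enumType.GetFields(BindingFlags.Public | BindingFlags.Static))
        {
            DescriptionAttribute d = (DescriptionAttribute)
                Attribute.GetCustomAttribute(fi, typeof(DescriptionAttribute));

            result[fi.Name] = d == null ? null : d.Description;
        }

        return result;
    }
}
```

Calling Describe(typeof(EnumReflectionSketch.Sample)) returns two entries, with the description text for Fizz and null for Buzz.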

SwitchHelp is an enumerator to provide some simple help information about the supported switches.

public static System.Collections.Generic.IEnumerable<string>
SwitchHelp
(
)
{
  System.Text.StringBuilder sb = new System.Text.StringBuilder() ;

  foreach ( SwitchDefinition s in map.Values )
  {
    sb.Length = 0 ;
    sb.AppendFormat ( "{0:G}" , s.Identifier ) ;

    if ( !System.String.IsNullOrEmpty ( s.HelpText ) )
    {
      sb.AppendFormat ( " : {0}" , s.HelpText ) ;
    }

    sb.AppendFormat ( " (default = {0})" , s.Default==null?"<null>":s.Default ) ;

    yield return ( sb.ToString() ) ;
  }

  yield break ;
}

Instance members

The Parameter property gives access to the read-only List of parameter values and the Switch property gives access to the read-only Dictionary of switches and their values. Switch will always contain every supported switch whether the user specified it in the command or not. The Count property will contain the number of items (parameters and switches) tokenized from the command line; this may not equal the value of Parameter.Count plus Switch.Count. It could be greater (a repeated switch is counted each time it appears) or less (switches the user didn't specify are still present in Switch). Count is really only useful for testing whether or not the user provided anything other than the command on the command line.

public int Count { get ; private set ; }

public System.Collections.Generic.IList<string>         Parameter { get ; private set ; }
public System.Collections.Generic.IDictionary<T,string> Switch    { get ; private set ; }

Here is the main constructor. There are a few overloads, but they all call this one. The parm variable will become the value of the Parameter property and the swit variable will become the Switch property.

public ParsedCommandLine
(
  string                    CommandLine
,
  ParsedCommandLine.Options Options
)
{
  System.Collections.Generic.List<string> parm =
    new System.Collections.Generic.List<string>() ;

  System.Collections.Generic.Dictionary<T,string> swit =
    new System.Collections.Generic.Dictionary<T,string>() ;

  bool exp = ( Options & ParsedCommandLine.Options.ExpandEnvironmentVariables ) ==
      ParsedCommandLine.Options.ExpandEnvironmentVariables ;

  bool upp = ( Options & ParsedCommandLine.Options.ToUpper ) ==
      ParsedCommandLine.Options.ToUpper ;

The first thing to do is to fill swit with the default values for all the switches. You could skip this step if you want Switch to contain only the switches from the command line. Or you could add only the switches that have non-null default values. I suppose I could have made more Options -- your feedback might be helpful.

foreach ( SwitchDefinition s in map.Values )
{
  swit [ s.Identifier ] = s.Default ;
}

Now use the Regular Expression to tokenize the command line. In this usage a token may be a parameter value or it may be a switch name and a value. Count will be set to the total number of tokens present.

System.Text.RegularExpressions.MatchCollection matches = reg.Matches ( CommandLine ) ;

this.Count = matches.Count ;

foreach
(
  System.Text.RegularExpressions.Match mat
in
  matches
)
{

Each token will have a Name and a Value; either or both may be an empty string. If the Name is empty, then the token is a parameter. If the Name is not empty then we need to match it to the names of the switches. BestMatch will throw an Exception if the Name doesn't match exactly one switch name. If a match is found we overwrite the current value for the switch; in this way only the last (right-most) value for a switch is returned.

  string nam = mat.Groups [ "Name"  ].Value ;
  string val = mat.Groups [ "Value" ].Value ;

  if ( exp )
  {
    val = System.Environment.ExpandEnvironmentVariables ( val ) ;
  }

  if ( upp )
  {
    val = val.ToUpper() ;
  }

  if ( nam.Length == 0 )
  {
    parm.Add ( val ) ;
  }
  else
  {
    swit [ map.BestMatch ( nam ).Identifier ] = val ;
  }
}

Finally, promote parm and swit to the read-only Parameter and Switch properties, and we're done.

  this.Parameter = parm.AsReadOnly() ;
  this.Switch    = swit.AsReadOnly() ;

  return ;
}
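
Stripped of the parsing, the switch handling in the constructor comes down to "start from the defaults, then let each switch on the command line overwrite its entry". A minimal sketch of just that pattern, using string keys and a hypothetical Resolve helper rather than the real generic class:

```csharp
using System;
using System.Collections.Generic;

public static class LastWinsSketch
{
    // Start from the defaults, then let each name/value pair taken from the
    // command line overwrite its entry, so the right-most value survives.
    public static Dictionary<string, string> Resolve
    (
        IDictionary<string, string> defaults,
        IEnumerable<KeyValuePair<string, string>> fromCommandLine
    )
    {
        Dictionary<string, string> result = new Dictionary<string, string>(defaults);

        foreach (KeyValuePair<string, string> pair in fromCommandLine)
        {
            result[pair.Key] = pair.Value;
        }

        return result;
    }
}
```

With defaults of Fizz=3 and Buzz=5 and a command line that supplies Fizz=2 followed by Fizz=5, the result holds Fizz=5 and the untouched default Buzz=5.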

These are the indexers that may be controversial. They really don't save anything and they may cause a little confusion, but I like being able to refer to the parameters as if they were in the standard string array. If you don't like them, you can remove them.

public string
this
[
  int Index
]
{
  get
  {
    return ( this.Parameter [ Index ] ) ;
  }
}

public string
this
[
  T Key
]
{
  get
  {
    return ( this.Switch [ Key ] ) ;
  }
}

I like to throw in Dispose even if the class uses only managed resources.

public void
Dispose
(
)
{
  this.Parameter = null ;
  this.Switch    = null ;

  return ;
}

ParsedCommandLine

This class is provided mainly for convenience. It contains the definition of the Options enumeration. It can be used when an application doesn't use switches.

public partial class ParsedCommandLine : ParsedCommandLine<PIEBALD.Types.EmptyEnum>

The demo

The provided demo is a simple FizzBuzz implementation that uses the number of parameters as its input. Of course it also needs to demonstrate a switch or two so I allow the user to provide different settings for Fizz and Buzz.

I showed some parts of this code earlier, but here is the whole thing.

namespace PIEBALD
{
  public static partial class PCLdemo
  {
    [PIEBALD.Attributes.InvariantCultureIgnoreCaseAttribute]
    private enum FizzBuzz
    {
      [System.ComponentModel.DescriptionAttribute("Display Fizz this often")
      ,System.ComponentModel.DefaultValueAttribute("3")]
      Fizz
    ,
      [System.ComponentModel.DescriptionAttribute("Display Buzz this often")
      ,System.ComponentModel.DefaultValueAttribute("5")]
      Buzz
    }

    [System.STAThreadAttribute()]
    public static int
    Main
    (
    )
    {
      int result = 0 ;

      try
      {
        using
        (
          PIEBALD.Types.IParsedCommandLine<FizzBuzz> args 
        =
          new PIEBALD.Types.ParsedCommandLine<FizzBuzz>
          (
            PIEBALD.Lib.LibSys.CommandLineWithoutExecutableName
          ,
            PIEBALD.Types.ParsedCommandLine.Options.ExpandEnvironmentVariables
          ) 
        )
        {
          if ( args.Count > 0 )
          {
            int f = System.Int32.Parse ( args.Switch [ FizzBuzz.Fizz ] ) ;
            int b = System.Int32.Parse ( args.Switch [ FizzBuzz.Buzz ] ) ;

            if ( args.Parameter.Count % f == 0 ) System.Console.Write ( FizzBuzz.Fizz ) ;
            if ( args.Parameter.Count % b == 0 ) System.Console.Write ( FizzBuzz.Buzz ) ;
          }
          else
          {
            System.Console.WriteLine ( "Syntax: PCLdemo [/Fizz=f] [/Buzz=b] parameter [...]" ) ;

            foreach ( string s in PIEBALD.Types.ParsedCommandLine<FizzBuzz>.SwitchHelp() )
            {
              System.Console.WriteLine ( s ) ;
            }
          }
        }
      }
      catch ( System.Exception err )
      {
        System.Console.WriteLine ( err ) ;
      }

      return ( result ) ;
    }
  }
}

All the files required are provided in the ZIP file. Compilation can easily be accomplished at the command line: csc /recurse:*.cs. If you don't know how to do this I have some more information here: On compiling at the command line

C:\Projects\CodeProject\ParsedCommandLine>csc /recurse:*.cs
Microsoft (R) Visual C# 2010 Compiler version 4.0.30319.1
Copyright (C) Microsoft Corporation. All rights reserved.

C:\Projects\CodeProject\ParsedCommandLine>PCLdemo
Syntax: PCLdemo [/Fizz=f] [/Buzz=b] parameter [...]
Fizz : Display Fizz this often (default = 3)
Buzz : Display Buzz this often (default = 5)

C:\Projects\CodeProject\ParsedCommandLine>PCLdemo a

C:\Projects\CodeProject\ParsedCommandLine>PCLdemo a b

C:\Projects\CodeProject\ParsedCommandLine>PCLdemo a b c
Fizz
C:\Projects\CodeProject\ParsedCommandLine>PCLdemo a b c d

C:\Projects\CodeProject\ParsedCommandLine>PCLdemo a b c d e
Buzz
C:\Projects\CodeProject\ParsedCommandLine>PCLdemo a b c d e f g h i j k l m n o
FizzBuzz
C:\Projects\CodeProject\ParsedCommandLine>PCLdemo a b c d /f=2 /b=4
FizzBuzz
C:\Projects\CodeProject\ParsedCommandLine>PCLdemo /b=4 a b /f=2 c d /f=5
Buzz
C:\Projects\CodeProject\ParsedCommandLine>PCLdemo /b=7
FizzBuzz
C:\Projects\CodeProject\ParsedCommandLine>
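
A couple of those results may look odd at first, in particular PCLdemo /b=7 printing FizzBuzz. The demo tests the number of parameters, and with no parameters that number is zero, which is divisible by everything. Here is the decision logic pulled out into a small stand-alone sketch; the Output helper is hypothetical and returns the text rather than writing it to the console.

```csharp
public static class DemoLogicSketch
{
    // Same tests as the demo: Fizz when the parameter count is a multiple
    // of fizz, Buzz when it is a multiple of buzz.
    public static string Output(int parameterCount, int fizz, int buzz)
    {
        string text = "";

        if (parameterCount % fizz == 0) text += "Fizz";
        if (parameterCount % buzz == 0) text += "Buzz";

        return text;
    }
}
```

Output(0, 3, 7) corresponds to PCLdemo /b=7 and gives FizzBuzz, while Output(4, 5, 4) corresponds to the run where /f=5 overrides the earlier /f=2 and gives Buzz.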

Points of Interest

Documented here: http://www.codeproject.com/Lounge.aspx?msg=4201677#xx4201677xx

History

2013-02-09 First submitted

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)