I have often found myself writing code to validate variables that are passed from one environment to another. The sources vary: command-line arguments, data structures sent over TCP/IP, user data entered through dialog box controls or HTTP web forms, and so on. The validation code usually goes like this:
if (angleOfArrival < MIN_ANGLE || angleOfArrival > MAX_ANGLE)
    throw std::range_error("Foo::validate(): Angle of Arrival value out of range");
I can quickly figure out that the angle value must be between MIN_ANGLE and
MAX_ANGLE; otherwise, an exception will be thrown specifying the cause:
Foo::validate(): Angle of Arrival value out of range
All well and good if that is all the code I have to write and I don't need to check more variables. If not, then writing code like this suddenly becomes very tedious and prone to errors. I've seen places where one has to verify an average of over 10 variables in over 100 classes. Ideally, one would like to do this as painlessly as possible without compromising data integrity.
What happens if one decides not to throw an exception, but to flag an error, display the error, and continue processing the rest of the variables? As far as I know, C++ does not have something like
On Error Resume Next as in Visual Basic. So, that would require going through all the code and making the laborious but necessary changes.
What if we wanted to show the invalid value that was received? Sorry, you don't pay me enough to go through all those 10,000+ lines of code, making those changes.
Wouldn't it be nice to be able to write:
verify( angleOfArrival, in_range(MIN_ANGLE, MAX_ANGLE), "Angle of Arrival" );
and still get more or less the same effect? This set of utilities allows one to do this and a whole lot more.
Before I go on, let's see the error message when the nice code is used:
Foo::validate(): Angle of Arrival=387, must be in range[0, 360]
The benefits of using the nice code can be summarised as follows:
- Code is compact but readable - one line with very little code - fewer places for errors.
- Code is easier to understand - almost reads like a comment line - who says C++ is cryptic?
- Error message contains more detail - aids debugging, especially when data is transmitted from different media. The message shows not only what the wrong value was, but also what the requirements on the value are. In this example, we can tell whether we should ask the sender to send correct values or whether our own requirements are wrong.
- Easily maintainable - one can easily swap between throwing an exception and displaying messages. In fact, we can totally customise what to do on error, without touching the
verify statement above.
How'd they do that?
Before we can write the nice code, we need to declare a
typedef for the
validator<> template class, as follows:
typedef validator<> MyValidator;
The best place for this declaration is either in a common header file or at the top of the .cpp file. You don't need to supply any types for the
validator<> template. The default template parameter type will make the
validator throw an exception on failure.
You may also want to say
using namespace mkn::validation to simplify the use in a .cpp file. Then, at the top of each function or method that does validation, you must declare:
I prefer to call this
validator<> instance variable
verify but you can call it anything you like. Unfortunately, the
__FUNCTION__ macro is not standardized across compilers, so you have to provide a value that indicates the context of where your variable is instantiated.
If we decide not to throw an exception but print the error to the standard output, we just need to change the typedef to:
typedef validator< archives_to < std::ostream > > MyValidator;
If we decide to do something completely different, e.g., highlight the affected text box, we can, with little extra effort, do that without touching the nice code.
A brief look at the insides
To exploit the full power of these utilities, you need to understand a little bit about how they work together. There are three key components that make up the validation of a variable:
- A Constraint - what the value must meet in order to validate
- Failure Handler - what to do when validation fails
- Validator - how and when to do the validation
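To make the division of labour concrete, here is a minimal sketch of how the three parts could fit together. It is not the library's implementation: the namespace sketch, the class in_range_constraint, the handler throws_range_error, and the describe() member are all hypothetical names chosen for illustration, and the real validator<> very likely differs in its details.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

namespace sketch {  // hypothetical namespace, not the library's own

// Constraint: tests a value and can describe its requirement in words.
template <typename T>
class in_range_constraint {
public:
    in_range_constraint(T lo, T hi) : lo_(lo), hi_(hi) { assert(!(hi < lo)); }
    bool operator()(const T& value) const { return !(value < lo_) && !(hi_ < value); }
    std::string describe() const {
        std::ostringstream os;
        os << "must be in range[" << lo_ << ", " << hi_ << "]";
        return os.str();
    }
private:
    T lo_, hi_;
};

// Maker function: deduces T so the call site reads like English.
template <typename T>
in_range_constraint<T> in_range(T lo, T hi) { return in_range_constraint<T>(lo, hi); }

// Failure handler: decides what happens when a constraint is not met.
struct throws_range_error {
    void operator()(const std::string& message) const { throw std::range_error(message); }
};

// Validator: holds the context, applies the constraint, delegates failures.
template <typename FailureHandler = throws_range_error>
class validator {
public:
    explicit validator(const char* context) : context_(context) {}

    template <typename T, typename Constraint>
    bool operator()(const T& value, const Constraint& constraint, const char* id) const {
        if (constraint(value)) return true;
        std::ostringstream os;  // the message is only built on failure
        os << context_ << ": " << id << "=" << value << ", " << constraint.describe();
        FailureHandler()(os.str());
        return false;  // reached only if the handler does not throw
    }
private:
    const char* context_;  // stored as a pointer, no allocation
};

}  // namespace sketch
```

With these definitions, declaring sketch::validator<> verify("Foo::validate()"); and then calling verify(angleOfArrival, sketch::in_range(0, 360), "Angle of Arrival"); throws a std::range_error for an out-of-range value, with a message in the same format as the one shown earlier.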
A constraint, in this context, refers to the condition/constraint that a variable's value must meet in order for it to successfully validate. In our example, the constraint used is
range. All pre-defined constraints reside in the
mkn::constraints namespace, and are constructible using one or more maker functions that make the code more English-like, in other words, self-commenting. You may also want to say
using namespace mkn::constraints at the top of the .cpp file for ease of use. Here is a list of the pre-defined constraint maker functions in the
in_range() - checks if a value is in a range - an alternative way of using this is
is_any_of() - checks if a value is any of a list of values in the parameter list
is_one_of() - checks if a value is in a std compliant container
Standard binary function derived constraints
These are constraints derived from template classes in the
<functional> header. They are analogous to
std::binder2nd<std::binary_function_name>. The maker functions for these are:
Constraints can also be combined with each other using || or && operators. Using our example, if we were to allow a value that signifies that the angle is not applicable (e.g.,
ANGLE_NA=777), we would rewrite the
verify statement as:
verify( angleOfArrival, equals(ANGLE_NA) || in_range(MIN_ANGLE, MAX_ANGLE),
"Angle of Arrival" );
The error message on failure becomes:
Foo::validate(): Angle of Arrival=387, must be equal to 777 or must be in range[0, 360]
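One way such a combination could be built is with a composite constraint that holds both operands and joins their descriptions. The sketch below assumes hypothetical names (namespace sketch, constraint_tag, or_constraint, describe()) and is not taken from the library's source.

```cpp
#include <sstream>
#include <string>
#include <type_traits>

namespace sketch {  // hypothetical names throughout

struct constraint_tag {};  // marks the types that operator|| may combine

template <typename T>
class equals_constraint : public constraint_tag {
public:
    explicit equals_constraint(const T& expected) : expected_(expected) {}
    bool operator()(const T& v) const { return v == expected_; }
    std::string describe() const {
        std::ostringstream os;
        os << "must be equal to " << expected_;
        return os.str();
    }
private:
    T expected_;
};

template <typename T>
class in_range_constraint : public constraint_tag {
public:
    in_range_constraint(const T& lo, const T& hi) : lo_(lo), hi_(hi) {}
    bool operator()(const T& v) const { return !(v < lo_) && !(hi_ < v); }
    std::string describe() const {
        std::ostringstream os;
        os << "must be in range[" << lo_ << ", " << hi_ << "]";
        return os.str();
    }
private:
    T lo_, hi_;
};

// Composite: passes if either operand passes; descriptions are joined with "or".
template <typename L, typename R>
class or_constraint : public constraint_tag {
public:
    or_constraint(const L& l, const R& r) : left_(l), right_(r) {}
    template <typename T>
    bool operator()(const T& v) const { return left_(v) || right_(v); }
    std::string describe() const { return left_.describe() + " or " + right_.describe(); }
private:
    L left_;
    R right_;
};

template <typename T>
equals_constraint<T> equals(const T& expected) { return equals_constraint<T>(expected); }

template <typename T>
in_range_constraint<T> in_range(const T& lo, const T& hi) { return in_range_constraint<T>(lo, hi); }

// Only participates in overload resolution when both sides are constraints.
template <typename L, typename R>
typename std::enable_if<std::is_base_of<constraint_tag, L>::value &&
                        std::is_base_of<constraint_tag, R>::value,
                        or_constraint<L, R> >::type
operator||(const L& l, const R& r) { return or_constraint<L, R>(l, r); }

}  // namespace sketch
```

Note that an overloaded operator|| evaluates both operands when the expression is built, so the usual short-circuit behaviour applies only later, inside or_constraint::operator(), when the value is actually tested.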
You can also combine binary function derived constraints with a logical binary operator and other values that you want to compare against. So, instead of writing:
if (!(clientID == 30 || clientID == 43 || clientID == 50 || clientID == 57))
throw runtime_error("Foo::validate(): Invalid client ID");
you could write:
verify( clientID, equals(30) || 43 || 50 || 57, "clientID" );
and the resultant error message would be something like:
Foo::validate(): clientID=27, must be equal to 30 or 43 or 50 or 57
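The chained form above needs an operator|| that accepts a constraint on the left and a plain value on the right. A minimal sketch of that idea follows; it stores the alternatives in a std::vector for brevity, which is my own simplification and probably not how the library does it. The names sketch, equals_any, and describe() are hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

namespace sketch {  // hypothetical names throughout

// Accepts a value if it equals any of the stored alternatives.
template <typename T>
class equals_any {
public:
    explicit equals_any(const T& first) : values_(1, first) {}

    void add(const T& v) { values_.push_back(v); }

    bool operator()(const T& v) const {
        for (std::size_t i = 0; i < values_.size(); ++i)
            if (values_[i] == v) return true;
        return false;
    }

    std::string describe() const {
        std::ostringstream os;
        os << "must be equal to ";
        for (std::size_t i = 0; i < values_.size(); ++i) {
            if (i != 0) os << " or ";
            os << values_[i];
        }
        return os.str();
    }
private:
    std::vector<T> values_;
};

template <typename T>
equals_any<T> equals(const T& first) { return equals_any<T>(first); }

// constraint || raw value: returns a copy with one more alternative appended.
template <typename T>
equals_any<T> operator||(equals_any<T> lhs, const T& rhs) {
    lhs.add(rhs);
    return lhs;
}

}  // namespace sketch
```

Because || is left-associative, sketch::equals(30) || 43 || 50 || 57 is evaluated as ((equals(30) || 43) || 50) || 57, so each step appends one more alternative to the constraint.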
Negating a constraint
Any constraint can be negated using
operator!(). So, you can write something like:
verify( clientID, !equals(27), "clientID" );
And the error message would be:
Foo::validate(): clientID=27, must not be equal to 27
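Negation can be sketched as a wrapper that inverts the test and asks the wrapped constraint to flip its wording. The names below (sketch, not_constraint, and the negated parameter of describe()) are assumptions made for illustration, not the library's actual design.

```cpp
#include <sstream>
#include <string>

namespace sketch {  // hypothetical names throughout

template <typename T>
class equals_constraint {
public:
    explicit equals_constraint(const T& expected) : expected_(expected) {}
    bool operator()(const T& v) const { return v == expected_; }
    // The 'negated' flag lets a wrapper flip the wording of the requirement.
    std::string describe(bool negated = false) const {
        std::ostringstream os;
        os << (negated ? "must not be equal to " : "must be equal to ") << expected_;
        return os.str();
    }
private:
    T expected_;
};

// Wrapper: passes exactly when the wrapped constraint fails.
template <typename C>
class not_constraint {
public:
    explicit not_constraint(const C& inner) : inner_(inner) {}
    template <typename T>
    bool operator()(const T& v) const { return !inner_(v); }
    std::string describe(bool negated = false) const { return inner_.describe(!negated); }
private:
    C inner_;
};

template <typename T>
equals_constraint<T> equals(const T& expected) { return equals_constraint<T>(expected); }

// operator! on a constraint builds the negating wrapper.
template <typename T>
not_constraint<equals_constraint<T> > operator!(const equals_constraint<T>& c) {
    return not_constraint<equals_constraint<T> >(c);
}

}  // namespace sketch
```

In this sketch, !sketch::equals(27) accepts every value except 27 and describes itself as "must not be equal to 27".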
A failure handler is a functor that accepts the failed value, a constraint, a value identifier, and an index of the failed value. It gets this information from the validator instance, and acts on it as defined in the functor's operator(). Pre-defined failure handlers and their actions are:
throws<Exception> - Throws an exception instance of
Exception with error details. If no template parameter is specified, it throws the exception defined in the failed constraint or
archives_to<Stream> - Inserts (prints) the error details on an instance of
reports_to<Reporter> - Similar to
archives_to, but uses a functor operator on
Reporter instead of the
<<() operators. The
Reporter must be a functor that accepts a
Unfortunately, you cannot use maker functions for types, but I have tried to make them easier to use, with intuitive names.
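The point of making the failure handler a template parameter is that the checking code stays the same while the reaction changes. The sketch below illustrates this with two simplified handlers. It deliberately departs from the description above in two ways, both assumptions of mine: the handler receives a ready-made message string rather than the value, constraint, identifier, and index, and the validator has a single hard-coded in_range() check. The namespace sketch and the way the stream is bound are also hypothetical.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

namespace sketch {  // hypothetical names; simplified handler signature

// Handler 1: throw an exception carrying the error details.
template <typename Exception = std::runtime_error>
struct throws {
    void operator()(const std::string& details) const { throw Exception(details); }
};

// Handler 2: write the error details to a stream and carry on.
template <typename Stream>
class archives_to {
public:
    explicit archives_to(Stream& stream) : stream_(&stream) {}
    void operator()(const std::string& details) const { *stream_ << details << '\n'; }
private:
    Stream* stream_;  // not owned; must outlive the handler
};

// Validator: the check is identical whichever handler is plugged in.
template <typename FailureHandler>
class validator {
public:
    validator(const char* context, const FailureHandler& handler)
        : context_(context), handler_(handler) {}

    // Returns true if lo <= value <= hi; otherwise invokes the failure handler.
    bool in_range(int value, int lo, int hi, const char* id) {
        if (lo <= value && value <= hi) return true;
        std::ostringstream os;
        os << context_ << ": " << id << "=" << value
           << ", must be in range[" << lo << ", " << hi << "]";
        handler_(os.str());
        return false;  // reached only if the handler does not throw
    }
private:
    const char* context_;
    FailureHandler handler_;
};

}  // namespace sketch
```

Switching from sketch::validator<sketch::throws<> > to sketch::validator<sketch::archives_to<std::ostream> > changes only the typedef and the construction of the validator; the calls that perform the checks are untouched.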
validator<FailureHandler, ValueId=const char*> is the main class that ties constraints and failure handlers together. Each validator is tied to a
FailureHandler by definition, and can use one or more constraints during its lifetime.
There are four ways of using a
- The normal way - the examples given above use the instance this way.
- Bound to a variable - e.g.:
verify( size, "size" ) == sizeof(Foo);
- Validating conditionally - e.g.:
verify.when( angle, !equals(ANGLE_NA)).validate(in_range(MIN_ANGLE, MAX_ANGLE),
- Validate constrained variables - e.g.:
range_constrained< int, 0, 100 > trackID;
verify( trackID );
I will talk more about constrained variables in the next article update.
What does it cost (performance)?
I haven't done any performance tests, but VC++ 7.1 release build inlines most of the constraints code away, and the bulk of the code generated depends primarily on the failure handler used. The context specifier uses a
const char* , and is stored as such, hence there is no memory allocation penalty. Heap allocated strings for error messages are only constructed once an error has occurred, and therefore should not impact performance when there are no errors.
Plans for the next article update (maybe Part II)
- Add string based constraints.
- Talk more about customizing the failure handlers.
- Details on constrained variables.
- Update performance details.
I would like to thank Werner Erasmus for his critique during the construction of these utilities.