Variable Argument Functions

abc876

May 21, 2003

This article explains how to use variable argument functions effectively and what is going on behind the scenes.

Download source files - 49 Kb

Introduction

Functions are used everywhere in C++ and are the basis of modular design. Normally, every function has a fixed number of arguments or parameters, specified in the function declaration along with the data type of each one. Every function returns either a single value or nothing (void); constructors and destructors have no return type at all. A typical C++ function looks like this:

ReturnType function( Datatype var1, Datatype var2)

Have you ever encountered functions with a variable number of arguments? The most commonly used one is printf. If you have some UNIX background, you must have seen functions like

 int execl(const char *path,const char *arg,...);
 int execlp(const char *file,const char *arg,...);

The exec calls provide the means of transforming the calling process into a new process. These functions can take a variable number of arguments, and the last argument must be NULL. For example,

execl("/bin/echo", "echo", "Ghulam", "Ishaq",
        "Khan", "Institute", NULL);

I am passing 7 arguments to this function including the last NULL. Now see this function call.

execl("/bin/echo", "echo", "Code", "Project", NULL);

Now I am passing 5 arguments to the same function, so it clearly takes a variable number of arguments. These functions are declared in unistd.h (a Unix header file). Equivalents are also available on Windows as _execl and _execlp, declared in process.h.

How to define functions with variable number of arguments and how to access these arguments at runtime?

How to define and use Variable argument functions?

First of all, a function with a variable number of arguments should have at least one fixed argument. If this placeholder argument is not provided, there is no way to access the other arguments. The last item in the parameter list should be an ellipsis (...), which means "and maybe more arguments".

For example a function which adds N integers will have a declaration of

int Add( int first,...);

Note: the Microsoft C++ specification requires a comma between the last named argument and the ellipsis. Hence, according to Microsoft C++, int add( int x,...) is legal but int add( int x ...) is not.

Now how to access these variable numbers of arguments?

This support is provided by the standard header stdarg.h (cstdarg in C++). It contains the macros used to access the variable arguments.

A variable argument function also needs some way of knowing where its arguments end. The language does not enforce this, but a common convention is a terminating argument, such as the NULL in the execl calls above. Let's see how to access these arguments.

 #include <stdarg.h>

 int add(int x, ...)
 {
     va_list list;
     // Initialize the va_list variable 'list' (a char* here)
     // with the address of the first unknown variable argument
     // by calling the va_start() macro.
     va_start(list, x);
     int result = x; // the fixed argument is part of the sum too

     for (;;)
     {
         // In the loop, retrieve each argument.
         // The second argument to va_arg is the data type
         // of the expected argument.
         int p = va_arg(list, int);
         if (p == 0)  // 0 is our terminating argument
             break;
         result += p;
     }
     va_end(list); // clean up; must be called before returning
     return result;
 }

First, we define a va_list variable and initialize it with a call to va_start(). va_start() is a macro that takes the name of the va_list variable and the name of the last formal function argument. The va_arg() macro is then used to obtain the unknown variable arguments one at a time. In each call, the user has to specify the type of the expected argument. va_arg() assumes that an actual argument of that type has been passed, but there is no way of ensuring that. Since we are adding integers here, the type is int in every call. After each value is retrieved, it is compared with 0, which we have chosen as the terminating argument. When the argument is 0, we exit the loop and return the result. Before returning, however, we must call the va_end() macro, passing it the name of the va_list variable. The reason is that va_start() may modify the stack in such a way that a return cannot be done successfully; va_end() undoes any such modifications.
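A sentinel such as 0 cannot itself be part of the data. One common alternative is to let the fixed argument carry the number of values that follow. The sketch below illustrates that idea; the name sum_n and its count parameter are hypothetical and do not come from the article's source files.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical variant of add(): 'count' says how many ints follow,
// so no terminating value is needed and 0 can be ordinary data.
int sum_n(int count, ...)
{
    assert(count >= 0);

    va_list args;
    va_start(args, count);       // 'count' is the last named parameter

    int total = 0;
    for (int i = 0; i < count; ++i)
        total += va_arg(args, int);  // every extra argument must be an int

    va_end(args);
    return total;
}
```

A call such as sum_n(3, 10, 0, 5) adds exactly three values. The caller is still responsible for passing a count that matches the real number of arguments, because the compiler cannot check it.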

Is that all there is to variable argument functions? Well, it's not that simple. There are some other things you should keep in mind when using these functions.

Variable Promotions

If an argument has not been declared, the compiler doesn't have the information it needs to perform standard type checking and type conversion for it. Since the ellipsis bypasses type safety, which is one of the major goals of C++, using it is sometimes considered bad programming practice. At times, however, you may need a variable argument function, particularly when dealing with old C-style functions. The conversions applied to arguments passed through an ellipsis differ from those applied when the formal and actual argument types are known:

If the actual argument is of type float, it is promoted to type double when the function call is made.
Any signed or unsigned char, short, enumerated type, or bit field is converted to either a signed or an unsigned int using integral promotion.
Any argument of class type is passed by value as a data structure; the copy is created by binary copying instead of by invoking the class’s copy constructor (if one exists).

So, if an argument is of type float, you should expect to retrieve it as a double, and if it is a char or short, you should expect a signed or unsigned int. For example, this code will give you wrong results.

float add(float x,...) 
{
  va_list list;
  va_start(list,x);
  float result=0;

  for(;;)
  {// I am passing float as expected type
   // in va_arg, but actually float has been 
   //promoted to double.
   float p=va_arg(list,float);
   if(p==0)
      break;
   result+=p;
  }
  va_end(list);
  return result;
}

The reason is that float and double have different sizes. The compiler passes the values as double, but you are specifying the type as float. When va_arg advances "list" to the next argument by adding the size of a float, it ends up pointing at the wrong data. The correct way is to specify the type as double. You can then cast the result to float if you want.

 float add(float x, ...)
 {
     va_list list;
     va_start(list, x);
     float result = x; // the fixed argument is part of the sum too

     for (;;)
     {
         // Note: I am passing double as the expected
         // data type of the argument, although the actual
         // input was of type float. That is due to
         // variable promotion.
         float p = (float) va_arg(list, double);
         if (p == 0)
             break;
         result += p;
     }
     va_end(list);
     return result;
 }

In a similar manner, you should cater for the other promotions.
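As an illustration of the other promotions, here is a minimal sketch of a function whose caller passes a char, a short and a float, in that order. The function name read_promoted and its unused marker parameter are hypothetical; the point is only which types are given to va_arg.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical demo: the caller passes a char, a short and a float.
// After the default promotions they arrive as int, int and double.
double read_promoted(int marker, ...)
{
    (void)marker;                     // only needed as the fixed argument

    va_list args;
    va_start(args, marker);

    int    c = va_arg(args, int);     // char  arrives as int
    int    s = va_arg(args, int);     // short arrives as int
    double f = va_arg(args, double);  // float arrives as double

    va_end(args);
    return c + s + f;
}
```

Asking va_arg for char, short or float here would be the same mistake as in the float example above.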

What is going on behind the scenes?

Now let's see what is actually happening behind the scenes and how the compiler retrieves these values.

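va_list is the type used to walk the list of variable arguments. In the Microsoft x86 headers discussed here it is a plain typedef, defined in stdio.h (and stdarg.h) as shown below. Other compilers and architectures may define it and the macros that follow differently.

typedef char* va_list;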

va_start is a macro that initializes the va_list: it takes the address of its second argument v (the last fixed argument of the function) and adds the size of v to it, so the result points to the first unknown argument. va_start is defined as

#define va_start(ap,v) (ap = (va_list)&v + _INTSIZEOF(v))

If 'v' is declared with the register storage class, the macro's behavior is undefined. va_start must be used before va_arg is used for the first time.

va_arg is a macro that retrieves the value of the next unknown variable argument. Here is its definition.

#define va_arg(ap,t) (*(t *)((ap += _INTSIZEOF(t)) - _INTSIZEOF(t)))

This macro is a bit tricky. (ap += _INTSIZEOF(t)) increments the pointer so that it points to the argument after the one of type t. Let's call the result of this operation X. The macro now becomes

#define va_arg(ap,t) (*(t *)(X - _INTSIZEOF(t)))

(X - _INTSIZEOF(t)) yields the value ap had before the increment ap += _INTSIZEOF(t). Let's call it Y. The macro then becomes

#define va_arg(ap,t) ( *(t *)Y )

Casting Y to t* and reading the value at that location gives us the current argument.

The next call to this macro returns the following argument, because ap has already been incremented.
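The pointer walk can be imitated in portable code. The sketch below is only a model of the idea: it assumes _INTSIZEOF rounds a size up to a multiple of sizeof(int) (as in the Microsoft x86 headers), and it packs values into a local buffer instead of reading a real call stack. The names int_size_of, next_arg and demo_walk are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in for _INTSIZEOF: sizeof(T) rounded up to a multiple of sizeof(int).
template <typename T>
constexpr std::size_t int_size_of()
{
    return (sizeof(T) + sizeof(int) - 1) & ~(sizeof(int) - 1);
}

// Stand-in for va_arg: read a T at the cursor, then advance the cursor.
template <typename T>
T next_arg(const char*& ap)
{
    T value;
    std::memcpy(&value, ap, sizeof(T));
    ap += int_size_of<T>();
    return value;
}

// Pack an int and a double back to back, then walk over them
// the same way the macros walk over the arguments.
double demo_walk()
{
    alignas(double) char frame[int_size_of<int>() + int_size_of<double>()] = {};

    int first = 7;
    double second = 2.5;
    std::memcpy(frame, &first, sizeof first);
    std::memcpy(frame + int_size_of<int>(), &second, sizeof second);

    const char* ap = frame;            // plays the role of va_start
    int a = next_arg<int>(ap);         // like va_arg(list, int)
    double b = next_arg<double>(ap);   // like va_arg(list, double)
    return a + b;
}
```

Reading with the wrong type in this model advances the cursor by the wrong amount, which is exactly what goes wrong in the float example earlier.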

va_end() macro simply sets the pointer to NULL as shown below.

#define va_end(ap) ( ap = (va_list)0)

Another Problem: What if last known parameter is a reference type?

If the last known parameter is a reference type, we run into a problem that can't be solved with these macros. Let me explain in more detail.

We calculate the address of the first variable argument from the last known parameter: we simply add the size of the last known parameter to its address. In C++, however, applying the "address of" operator to a reference yields a pointer to the object being referred to, not to the parameter itself. The va_start macro takes the address of the last named parameter to locate the subsequent parameters. When the last named parameter is a reference, the macro no longer points into the current call stack but at whatever follows the referenced object, which could be a previous stack frame or a global memory object.

#define va_start(ap,v) (ap = (va_list)&v + _INTSIZEOF(v))

To solve this problem, use the following macros in your code instead of the predefined ones.

#ifdef va_start
#undef va_start

#ifdef _WIN32
#define va_start(ap,v) { int var = _INTSIZEOF(v); \
    __asm lea eax,v __asm add eax,var \
    __asm mov ap,eax \
    }
#else
#define va_start(ap,v) { int var = _INTSIZEOF(v); \
    __asm lea ax,v __asm add ax,var \
    __asm mov ap,ax \
    }
#endif
#endif

Here we are using inline assembly. First we load the effective address of v into a register with the lea instruction (by loading the actual address of the parameter itself, we sidestep the problem), then add the size of its data type and store the result in the argument pointer ap. Note that the __asm syntax is Microsoft-specific and is available only when targeting x86. An example of using this macro is attached to this article (varargref_src.zip).
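Where inline assembly is not an option, one portable way around the problem is to make sure the last named parameter is not a reference, for example by placing a plain value parameter after it. The following is a minimal sketch under that assumption; the function append_all and its count parameter are hypothetical and are not part of the attached source.

```cpp
#include <cassert>
#include <cstdarg>
#include <string>

// Hypothetical helper: appends 'count' C strings to 'out'.
// 'out' is a reference, but the last named parameter is the plain int
// 'count', so va_start is never applied to a reference.
void append_all(std::string& out, int count, ...)
{
    assert(count >= 0);

    va_list args;
    va_start(args, count);
    for (int i = 0; i < count; ++i)
        out += va_arg(args, const char*);  // each extra argument must be a C string
    va_end(args);
}
```

With this signature, a call such as append_all(s, 2, "b", "c") appends both strings to s without any custom va_start.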

Using variable argument functions is not considered very good programming practice. According to Bjarne Stroustrup (The C++ Programming Language, Second Edition):

“A well-designed program needs at most a few functions for which the argument types are not completely specified. Overloaded functions and functions using default arguments can be used to take care of type checking in most cases when one would otherwise consider leaving argument types unspecified. Only when both the number of arguments and the type of arguments vary is the ellipsis necessary.”
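When only the number of arguments varies and the type is fixed, as in the add() example, a type-checked alternative is possible in C++11 and later. The sketch below uses std::initializer_list, which is not one of the techniques named in the quotation; the function name add_all is hypothetical.

```cpp
#include <cassert>
#include <initializer_list>

// Hypothetical type-safe counterpart of add(): the compiler checks that
// every element is convertible to int, and no terminator is required.
int add_all(std::initializer_list<int> values)
{
    int total = 0;
    for (int v : values)
        total += v;
    return total;
}
```

A call looks like add_all({1, 2, 3}); passing a string or a pointer in the list is rejected at compile time instead of misbehaving at run time.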

Well, that’s all about variable argument functions. I hope you enjoyed reading this article :)