Languages that support polymorphism seem to require functions to model polymorphic requirements. This suffices in most cases, but there are cases where only the data is polymorphic: it takes a different form in a different context. While modeling data polymorphism is possible through the use of templates (to be discussed in a later post), this article makes a case for the language to allow interfaces to have abstract data members, whose types are unknown in the base class and become meaningful only in the derived context.
Interfaces (and Abstract Base Classes) are an expressive notation for mandating the implementation of required functions in derived classes. Why not utilize the same notation to accommodate data as well? In a nutshell, the following is possible:
virtual void pureVirtualFunction()=0;
But the following is not:
What does this imply? Could there be cases where such a feature would help?
Some months back, I encountered a requirement which I thought really called out for more direct support for data polymorphism in the language. In this article I'll explain the problem in a generalized way, leaving out some domain-specific details. Please do let me know if more details are needed to understand the problem and why the limitations mentioned are significant.
An algorithm needs to be implemented on the transmissions of an existing client-server system to improve its performance. Broadly, the algorithm transparently intercepts the packets, applies some transforms based on the packet and relays them.
In order to implement the algorithm, components should be deployed at both the server and the client. The algorithm depends on the functionality in a packet and a factor which is determined by whether the component is at the server or at the client.
Design with Behavioural Polymorphism
Packet is a concrete class which provides a buffer-storage area and some domain-specific functionality.
virtual double toDouble();
virtual int toInteger();
The algorithm depends on Packet functions to transform the packets with a factor that is based on Packet::toDouble() and on Packet::toInteger(). Both conversions are to be done slightly differently from the implementations in the Packet class, and also differently on the client and server sides.
We have the inheritance:
- ServerPacket: public Packet, and we reimplement toDouble for the server.
- ClientPacket: public Packet, and we reimplement toDouble for the client.
Algorithm components only need to apply some transforms on Packets. We generalize to the abstraction AlgorithmComponent.
The component implementing the algorithm only differs slightly on the server and client sides. But they do differ and so we have the specializations,
ServerComponent: public AlgorithmComponent
ClientComponent: public AlgorithmComponent
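The design so far can be sketched roughly as follows. Only the class names, the inheritance relationships and toDouble()/toInteger() come from the description above; the buffer type, the conversion bodies, the const qualifiers and transformFactor() are hypothetical placeholders chosen to keep the sketch self-contained.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Concrete base class: buffer storage plus default conversions.
class Packet {
public:
    explicit Packet(std::vector<std::uint8_t> bytes) : buffer_(std::move(bytes)) {}
    virtual ~Packet() = default;

    // Placeholder conversions; the real ones are domain-specific.
    virtual double toDouble() const { return static_cast<double>(buffer_.size()); }
    virtual int toInteger() const { return static_cast<int>(buffer_.size()); }

protected:
    std::vector<std::uint8_t> buffer_;
};

class ServerPacket : public Packet {
public:
    using Packet::Packet;
    // Server-side variant of the conversion (placeholder arithmetic).
    double toDouble() const override { return Packet::toDouble() * 2.0; }
};

class ClientPacket : public Packet {
public:
    using Packet::Packet;
    // Client-side variant of the conversion (placeholder arithmetic).
    double toDouble() const override { return Packet::toDouble() * 0.5; }
};

// The algorithm only sees a Packet; virtual dispatch picks the conversion.
class AlgorithmComponent {
public:
    virtual ~AlgorithmComponent() = default;
    double transformFactor(const Packet& p) const {
        return p.toDouble() + p.toInteger();
    }
};

class ServerComponent : public AlgorithmComponent {};
class ClientComponent : public AlgorithmComponent {};
```

Note that every call to transformFactor() in this sketch re-runs both conversions, which is where the trouble described next comes from.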
It worked! But...
It turned out that both the required conversion functions were very expensive, leading to erratic performance. All that was really required was a one-time computation of the conversion, which could be done in the constructors of the specialized Packets. Saving the results in a member variable would solve the performance issues. The ClientComponent and the ServerComponent could then use this saved value directly. But it is not possible for the generalized AlgorithmComponent to access a data member of the generalized Packet class through a Packet reference and get the implementations of the derived classes.
Consider the code snippet below:
class Derived: public Base
cout << b.i ;
This means that the abstraction of AlgorithmComponent would collapse, breaking the class hierarchy.
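A self-contained sketch of the behaviour the fragment above points at, assuming Base and Derived each declare their own member i. The values and the helpers readMember() and readVirtual() are illustrative and not taken from the original snippet.

```cpp
struct Base {
    int i = 1;  // illustrative value
    virtual ~Base() = default;
    virtual int get() const { return i; }
};

struct Derived : public Base {
    int i = 2;  // a separate member that hides Base::i; it does not override it
    int get() const override { return i; }
};

// Data members are resolved by the static type of the expression,
// so this always reads Base::i.
int readMember(const Base& b) { return b.i; }

// Virtual functions are resolved by the dynamic type of the object,
// so this reads Derived::i when b refers to a Derived.
int readVirtual(const Base& b) { return b.get(); }
```

Given a Derived object d, readMember(d) yields 1 while readVirtual(d) yields 2: only the function call is polymorphic, the member access is not.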
And a Trivial(?) Violation of a Design Constraint
AlgorithmComponent necessarily has to be either server-side or client-side. It was just a design abstraction that accurately summarized the operation of the algorithm. AlgorithmComponent should not be instantiable; it is an abstract base class by nature.
Implementing this particular design constraint is possible if we make the functions which depend on Packet pure virtual. Technically, for AlgorithmComponent to be abstract, we need to make only one of them pure virtual. But that would not accurately model the dependency on Packet for the other functions.
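A minimal sketch of that constraint, assuming hypothetical member functions transform() and relay() and a stand-in Packet with a single factor field (none of these names appear in the original design):

```cpp
#include <type_traits>

struct Packet { int factor = 1; };  // stand-in for the Packet described earlier

class AlgorithmComponent {
public:
    virtual ~AlgorithmComponent() = default;

    // Pure virtual: this alone makes AlgorithmComponent abstract.
    virtual void transform(Packet& p) const = 0;

    // Fully specified and just as dependent on Packet, but nothing in the
    // declaration marks that dependency as the part left open.
    void relay(Packet& p) const { transform(p); }
};

class ServerComponent : public AlgorithmComponent {
public:
    void transform(Packet& p) const override { p.factor *= 2; }
};

static_assert(std::is_abstract<AlgorithmComponent>::value, "cannot be instantiated");
static_assert(!std::is_abstract<ServerComponent>::value, "concrete specialization");
```

The class is abstract as required, but only because one function was singled out; the declaration of relay() says nothing about the Packet it relies on.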
Functions that depend on Packet can be completely specified, and we could avoid code duplication by coding them in the base class itself. AlgorithmComponent is abstract only because of its dependency on Packet, so Packet is really what should be pure virtual.
What we should have had was an AlgorithmComponent with completely specified functions but with an unspecified Packet that had to be compulsorily supplied by the derived class.
What if, in the discussed example, ServerComponent depended on a double/class Foo and ClientComponent depended on an int/class Bar? Shouldn't abstract base classes be allowed to have undefined data members? Or equivalently, why not allow data members in interfaces?
Workarounds to these do exist, for example:
- with templates (the discussion stopped just short of it, don't you think?), but it would be a compile-time solution. This will be discussed in a later article.
- with composition instead of inheritance (which is how we finally chose to implement it), but this lacks the simplicity and elegance of the design explained above and brings code duplication with it.
Would it not be simpler and cleaner to express such a design if the language just allowed the data also to be truly polymorphic?
- Thomas Jay Cubb