Posted 23 Apr 2015


Comparing covariance/contravariance rules in C#, Java and Scala

Different programming languages support variance (covariance/contravariance) in different ways. The goal of this article is to compare all supported types of variance in C#, Java and Scala in one place, and to reason about why some architectural decisions have been made by language designers.

Introduction

I assume that readers are familiar with the basic concepts of covariance and contravariance. Still, I can't omit the definitions here, even though variance is not easy to grasp without examples. Don't worry: there are plenty of easy-to-grasp examples ahead.

A complex type F(T) is covariant on a type-parameter T, if the fact that A is a subtype of B implies that F(A) is a subtype of F(B).

A complex type F(T) is contravariant on a type-parameter T, if the fact that A is a subtype of B implies that F(B) is a subtype of F(A).

A complex type F(T) is invariant on a type-parameter T, if it is neither covariant nor contravariant on T.

Getting ahead of myself: the complex type F here can be an array, a generic type and more.

Table of contents

Here is a brief summary of the points discussed in the article, with notes on whether each language supports a particular variance feature:

 

(+ means supported, - means not supported)

Arrays covariance
  • C#: + (unsafe at runtime)
  • Java: + (unsafe at runtime)
  • Scala: - (arrays are invariant by design). Though, there is support for Java's "covariant" arrays, of course.

Arrays contravariance
  • C#: -
  • Java: -
  • Scala: -

Generics variance (covariance/contravariance)
  • C#: + Defined by a generic type creator (definition-site). Restricted to generic interfaces and generic delegates.
  • Java: + Defined by clients of a generic type using wildcards (use-site).
  • Scala: + Defined by a generic type creator (definition-site). Also, there are existential types that cover Java's wildcards functionality.

Overriding: return type covariance
  • C#: -
  • Java: +
  • Scala: +

Overriding: parameter type contravariance
  • C#: -
  • Java: -
  • Scala: -

When reading the article, you can dive directly into the language you are interested in, but I recommend reading it all: that way you will grasp the general concepts better.

Comparing covariance/contravariance rules

Arrays covariance

C#

Let’s consider the following example:

C#
Cat[] cats = new Cat[] { new Cat(), new Cat() };
Animal[] animals = cats; //*
animals[0] = new Dog(); //**runtime(not compile-time) error here.

This code compiles without errors. It means that arrays are covariant in C#, because we can use a Cat[] array where an Animal[] array is expected (see the * line).

But the last line (**) obviously goes against common sense. And indeed, the code fails at execution time with ArrayTypeMismatchException. So, formally, C# supports array covariance, but it is not safe and not fully enforced by the compiler. Support for this kind of covariance was added mainly because Java supported it. At that time it was important for C# to stay very close to Java in order to spread the new language across the Java community. The paths of the two languages have since diverged considerably, but support for the "broken" array covariance is built deep into the CLR and will probably never be changed.

Java

The same code behaves similarly in Java, except that we get Java's ArrayStoreException at runtime.
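For reference, here is a minimal Java sketch of the same scenario. The class names (Animal, Cat, Dog, ArrayCovarianceDemo) are assumptions that mirror the C# example above; they are nested only to keep the snippet in one file.

```java
class ArrayCovarianceDemo {
    // Hypothetical hierarchy mirroring the C# example.
    static class Animal {}
    static class Cat extends Animal {}
    static class Dog extends Animal {}

    // Returns true if the JVM rejects storing a Dog into an array created as Cat[].
    static boolean storeIsRejected() {
        Cat[] cats = { new Cat(), new Cat() };
        Animal[] animals = cats;        // compiles: arrays are covariant in Java
        try {
            animals[0] = new Dog();     // compiles too, but the store is checked at runtime
            return false;
        } catch (ArrayStoreException e) {
            return true;                // the runtime element type of the array is Cat
        }
    }

    public static void main(String[] args) {
        System.out.println(storeIsRejected()); // prints "true"
    }
}
```

Note that the failure happens on the write, not on the assignment of cats to animals: the compiler accepts both lines, and only the runtime store check protects the array.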

Scala

Internally, Java's arrays are represented not as a single type but as nine different ones: one for arrays of references and eight more, one for each primitive type (int, short, float, etc.). For Scala's language designers it was a real challenge to support interoperation with Java and, at the same time, to incorporate arrays into Scala's rich collections hierarchy. As a result, Scala's arrays are represented by the generic Array[T], which is mapped directly to Java's arrays T[]. They have the same representation in bytecode, which is why you can pass arrays between Java and Scala in either direction.

We'll discuss generics variance later, but for now let's try to understand why Scala's architects decided to make Array[T] invariant. Consider the following code in Scala:

Scala
val cats: Array[Cat] = Array[Cat](new Cat(), new Cat())
val animals: Array[Animal] = cats //*compile-time error here
animals.update(0, new Dog())

Thanks to the Scala compiler, we get a compile-time error here (line *). Otherwise it would be possible to break type safety just as we could in Java (and C#). Interestingly, not all of Scala's collections behave this way. Let's change Array to List in our example:

Scala
val cats: List[Cat] = List[Cat](new Cat(), new Cat())
val animals: List[Animal] = cats //OK
val newAnimals = animals.updated(0, new Dog()) //creates a new list, cats is untouched

This code compiles in Scala and is also absolutely safe at execution time. Why is this possible with List and impossible with Array? The answer is mutability. Arrays in Scala are mutable, so type safety can't be guaranteed for the same reason it can't be guaranteed for arrays in Java and C#, which are mutable as well. Scala's List, on the other hand, is immutable (Scala has mutable lists too, but we are talking about the immutable one here). So, when we update an element of the list, we actually create a new List containing the elements of the old one plus the newly updated element. Here the new list is a List[Animal], so it can legitimately hold a Dog, while the original List[Cat] stays untouched. In other words, an immutable List can never be updated in place, so no inconsistency can arise at runtime.

A brief summary of array covariance:

Only read-only (immutable) arrays can be truly covariant. But arrays are not immutable: when we update something in an array we don't get a new array, we just update the target array in place. That's why it's impossible to make arrays truly covariant while keeping them safe at runtime. Language designers face a hard choice here, and it's a matter of dispute which is better: to support "broken" covariance for arrays (as C# and Java do) or to make the deliberate decision to keep them invariant (Scala).

Arrays contravariance

C#/Java/Scala

None of the three languages supports array contravariance. Even though covariance (unsafe at runtime) is supported for arrays in C# and Java, it would have been impractical to allow contravariance for arrays. Let's try to find out why.

Let's imagine for a minute that the following code works (in reality it does not):

C#
Animal[] animals = new Animal[] { new Cat(), new Cat() };
Dog[] dogs = animals; //compile-time error here, but let's imagine that it works
dogs[0] = new Dog();

The actual element type of the array is Animal (it can contain both cats and dogs), so, from the data-changing point of view, there is nothing terrible about updating one element to contain a Dog instead of a Cat, even via the dogs variable. But how can we read an element from the Animal array via the dogs variable if we have no compile-time guarantee that there are no Cats in our array? Maybe language designers could have implemented some workaround, e.g. failing at runtime when an incompatible read operation is performed, but that kind of array would be essentially useless. So, our conclusion is:

Only write-only arrays can be contravariant.

Generics variance

C#

Generics covariance in C#

Let's consider the following code: 

C#
interface IAnimalFarm<out T> where T: Animal
{
   T ProduceAnimal();
}
                     
class CatFarm : IAnimalFarm<Cat>
{
  public Cat ProduceAnimal()
  {
      return new Cat();
  }
}

Now we are ready to try out the example with generics covariance at work:

C#
IAnimalFarm<Cat> catFarm = new CatFarm();
IAnimalFarm<Animal> animalFarm = catFarm; //* OK, because covariant
Animal animal = animalFarm.ProduceAnimal();

This code (pay attention to the line marked with *) compiles without problems. It means that the compiler guarantees that it's safe to work with CatFarm via the animalFarm variable. And indeed, what could go wrong if we call ProduceAnimal, which has the Animal return type, when the actual type of the returned object is Cat? Nothing, because, thanks to assignment compatibility, it's OK to assign a value of a more specific type (Cat) to a variable of a less specific type (Animal).

Basically, in order to be covariant on a generic type parameter, a type should use that parameter only in output positions. In our example, it means that to be covariant, IAnimalFarm should use the generic type parameter T only as the output of its methods.

Why do we have such a restriction?

Consider the following hierarchy, where the generic type parameter T appears in both output and input positions of the methods of the IAnimalFarm interface:

C#
interface IAnimalFarm<T> where T : Animal
{
  T ProduceAnimal();
  void FeedAnimal(T animal);
}

class AnimalFarm : IAnimalFarm<Animal>
{
  public Animal ProduceAnimal()
  {
     return new Animal();
  }
              
  public void FeedAnimal(Animal animal)
  {
     //feed animal
  }
}

class CatFarm : IAnimalFarm<Cat>
{
  public Cat ProduceAnimal()
  {
     return new Cat();
  }
               
  public void FeedAnimal(Cat animal)
  {
     //feed cat
  }
}

Imagine that covariance were supported for IAnimalFarm. It would mean that the following code is legal:

C#
IAnimalFarm<Cat> catFarm = new CatFarm();
IAnimalFarm<Animal> animalFarm = catFarm; //* compile-time error, but imagine it works
animalFarm.FeedAnimal(new Dog());

We are working with CatFarm via the animalFarm variable (*) here. It seems OK. But then we try to feed a Dog object via the animalFarm variable (where the underlying object is a CatFarm). So, basically, we are trying to feed a dog on a cat farm, and the dog would not be happy. Each line in this sample looks reasonable, but together they produce unsafe behavior.

As you can see, the reason for the compile-time restriction on the position of a generic type parameter (outputs only) is clear: to provide runtime safety. As you remember, for arrays it was decided to support covariance even though an array element can appear in both input and output positions of the array's operations, and the price was runtime safety. With generics, you get compile-time safety, but pay for it with a certain inflexibility.

Generics contravariance in C#

Let's consider the following code:

C#
interface IAnimalFarm<in T> where T : Animal
{
   void FeedAnimal(T animal);
}

class AnimalFarm : IAnimalFarm<Animal>
{
   public void FeedAnimal(Animal animal)
   {
      //feed animal
   }
}

And the following code with contravariance at work:

C#
IAnimalFarm<Animal> animalFarm = new AnimalFarm();
IAnimalFarm<Cat> catFarm = animalFarm; //OK, because contravariant
catFarm.FeedAnimal(new Cat());

This code compiles without problems. It means that the compiler guarantees that it's safe to work with AnimalFarm via the catFarm variable. And indeed, nothing can go wrong if we call FeedAnimal passing a Cat object as the argument. AnimalFarm's FeedAnimal expects an Animal object, but it's OK to pass a more specific object (Cat) to it, thanks, again, to assignment compatibility.

In order to be contravariant on a generic type parameter, a type should use that parameter only in input positions. In our example, it means that to be contravariant, IAnimalFarm should use the generic type parameter T only as an input of its methods.

Why such a restriction?

Consider again the following hierarchy, where the generic type parameter T appears in both output and input positions of the methods of the IAnimalFarm interface:

C#
interface IAnimalFarm<T> where T : Animal
{
  T ProduceAnimal();
  void FeedAnimal(T animal);
}

class AnimalFarm : IAnimalFarm<Animal>
{
  public Animal ProduceAnimal()
  {
     return new Animal();
  }
              
  public void FeedAnimal(Animal animal)
  {
     //feed animal
  }
}

class CatFarm : IAnimalFarm<Cat>
{
  public Cat ProduceAnimal()
  {
     return new Cat();
  }
              
  public void FeedAnimal(Cat animal)
  {
     //feed cat
  }
}

Imagine that contravariance were supported for IAnimalFarm. It would mean that the following code is legal:

C#
IAnimalFarm<Animal> animalFarm = new AnimalFarm();
IAnimalFarm<Cat> catFarm = animalFarm; //* compile-time error, but imagine it works
Cat animal = catFarm.ProduceAnimal();

We are working with AnimalFarm via the catFarm variable and then trying to produce a Cat. But the underlying object is of type AnimalFarm, and an animal farm can produce only some abstract Animal, by no means a concrete Cat. Again, each line is reasonable, but together they produce unsafe behavior.

Important notes on a few C# variance limitations
  • Generic type variance is restricted to generic interfaces and generic delegates.

  • Variance applies only when the generic type arguments are reference types.

Let us think a bit about why these limitations exist.

So, why are generic classes invariant in C#? As you now understand, a class must use T only in output positions to be covariant, and only in input positions to be contravariant. The point is that this is hard to guarantee for classes: for example, a class covariant on T cannot have writable fields of type T, because you can write to those fields. Covariance would work great for truly immutable classes, but there is no comprehensive support for immutability in C# at the moment. Honestly, though, I have a feeling that we may expect better support for it in the future.

Why are value types not supported in generic variance? The short answer is that variance only works when the CLR does not need to convert the values of generic type parameters. Conversions are divided into representation-preserving and representation-changing ones. An example of a representation-preserving conversion is a cast on a reference: you do not change the original object (the one the reference points to) when you cast; you just verify that the object is compatible with the target type and get a new reference. Examples of representation-changing conversions are user-defined conversions, the conversion from int to double, boxing and unboxing. For the CLR all references look the same: a reference is just the address of the real object in memory (32 or 64 bits depending on the machine). That's why it can use IAnimalFarm<Cat> in place of IAnimalFarm<Animal> without any change in data representation. You can't say the same about value-type conversions such as boxing and unboxing, which is why variance does not work, for example, between IEnumerable<int> and IEnumerable<object>. In other words, the easiest way to guarantee that variant conversions are representation-preserving is to allow them only for reference types.

A quick note about Java/Scala (getting ahead of myself): generics in Java and Scala are a purely compile-time construct. Because of type erasure, no information about the type arguments of a generic instance is preserved at runtime. All generic type arguments are handled as Object references, and values of primitive types are normally boxed before they can be used as generic arguments. That's why there are no problems with value-type data representation in Java/Scala: every reference looks the same. It's one of the few advantages of type erasure (JVM) compared with reified generics (CLR).
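The effect of type erasure is easy to observe in Java. The sketch below (the class name ErasureDemo is made up for illustration) checks that two lists with different type arguments share one runtime class, and that a primitive int enters a generic collection only as a boxed Integer.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // True if List<Integer> and List<String> instances have the same runtime class.
    static boolean sameRuntimeClass() {
        List<Integer> numbers = new ArrayList<>();
        List<String> words = new ArrayList<>();
        return numbers.getClass() == words.getClass();
    }

    // True if an int added to a generic list is stored as an Integer reference.
    static boolean primitiveIsBoxed() {
        List<Integer> numbers = new ArrayList<>(); // List<int> would not compile
        numbers.add(42);                           // autoboxing happens here
        Object stored = numbers.get(0);
        return stored instanceof Integer;
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());  // prints "true"
        System.out.println(primitiveIsBoxed());  // prints "true"
    }
}
```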

Java

Java has a different solution to the variance problem in generics. As you have just seen, in C# the creator of a generic type is responsible for making it invariant, covariant or contravariant. This approach is known as definition-site variance annotations. In Java, on the other hand, the client of a generic type decides whether to treat it as invariant, covariant or contravariant. This is known as use-site variance annotations.

Consider the following code:

Java
interface AnimalFarm<T>
{
   T produceAnimal();
}

class CatFarm implements AnimalFarm<Cat>
{
   public Cat produceAnimal()
   {
       return new Cat();
   }
}

Now let's try to use it covariantly:

Java
AnimalFarm<Cat> catFarm = new CatFarm();
AnimalFarm<Animal> animalFarm = catFarm; //* compile-time error
Animal animal = animalFarm.produceAnimal();

The compiler does not allow it (the line marked with *), because generic types are invariant by default in Java. However, we can "force" covariance for a generic type via wildcards. The next example works:

Java
AnimalFarm<Cat> catFarm = new CatFarm();
AnimalFarm<? extends Animal> animalFarm = catFarm; //OK
Animal animal = animalFarm.produceAnimal();

The nice thing about specifying variance on the client side is that even if you have a problematic generic type (with the generic type parameter appearing in both input and output positions), you can still work with it in a covariant or contravariant way. The bad thing about this approach is that variance is not incorporated into the design by the creator of the generic type. Instead, every client of the generic type has to think hard about how to use it properly.

The wildcard <? extends Animal> means that animalFarm can hold an object of any AnimalFarm<T> type whose generic type parameter (T) is Animal or a subtype of Animal. Obviously, the Cat type parameter satisfies this condition.

Consider the following example:

Java
interface AnimalFarm<T>
{
   T produceAnimal();
   void feedAnimal(T animal);
}

class AnimalFarmDefault implements AnimalFarm<Animal>
{
   public Animal produceAnimal()
   {
       return new Animal();
   }

   public void feedAnimal(Animal animal)
   {
       //feed animal
   }
}

class CatFarm implements AnimalFarm<Cat>
{
   public Cat produceAnimal()
   {
       return new Cat();
   }

   public void feedAnimal(Cat animal)
   {
       //feed cat
   }
}

As you remember, in C# a similar generic type would be invariant, because the generic type parameter appears in both input and output positions of the methods.

In Java by default it's invariant too, but using wildcards a client of a generic type can specify how to treat it.

You can treat it as covariant:

Java
AnimalFarm<Cat> catFarm = new CatFarm();
AnimalFarm<? extends Animal> animalFarm = catFarm; //OK
Animal animal = animalFarm.produceAnimal();

Or as contravariant:

Java
AnimalFarm<Animal> animalFarm = new AnimalFarmDefault();
AnimalFarm<? super Cat> catFarm = animalFarm; //OK
catFarm.feedAnimal(new Cat());

The wildcard <? super Cat> means that catFarm can hold an object of any AnimalFarm<T> type whose generic type parameter (T) is Cat or a supertype of Cat. Surely, the Animal type parameter satisfies this condition.

The idea here is that when you treat a generic type as covariant, you can effectively use only the methods where the generic type parameter appears in output positions (a method that takes T accepts nothing but null). And when you treat a generic type as contravariant, you can effectively use only the methods where the generic type parameter appears in input positions (a method that returns T gives you back just an Object).
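The restriction can be seen in a small sketch built on the AnimalFarm interface above. The wrapper class WildcardAccessDemo, the helper methods useCovariantly and useContravariantly, and the fedCount field are assumptions added for illustration; the hierarchy is nested only to keep the snippet in one file.

```java
class WildcardAccessDemo {
    static class Animal {}
    static class Cat extends Animal {}

    interface AnimalFarm<T> {
        T produceAnimal();
        void feedAnimal(T animal);
    }

    static class CatFarm implements AnimalFarm<Cat> {
        int fedCount = 0;  // hypothetical counter, only to make feeding observable
        public Cat produceAnimal() { return new Cat(); }
        public void feedAnimal(Cat animal) { fedCount++; }
    }

    // Covariant view: producing works, feeding is rejected by the compiler.
    static Animal useCovariantly(AnimalFarm<? extends Animal> farm) {
        // farm.feedAnimal(new Cat()); // does not compile: T is some unknown subtype of Animal
        return farm.produceAnimal();
    }

    // Contravariant view: feeding works, producing yields only Object.
    static Object useContravariantly(AnimalFarm<? super Cat> farm) {
        farm.feedAnimal(new Cat());
        // Cat cat = farm.produceAnimal(); // does not compile: the result is only known to be an Object
        return farm.produceAnimal();
    }

    public static void main(String[] args) {
        CatFarm farm = new CatFarm();
        System.out.println(useCovariantly(farm) instanceof Cat);
        useContravariantly(farm);
        System.out.println(farm.fedCount);
    }
}
```

This is the same producer/consumer split that C# expresses with out and in, except that here each method signature chooses the view it needs.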

By the way, wildcard variance is not restricted to interfaces: you can use generic classes in a variant manner as well.
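As a sketch of this point, the snippet below applies both kinds of wildcard to java.util.ArrayList, which is a generic class rather than an interface. The helper copyAll and the wrapper class ClassWildcardDemo are hypothetical names introduced for the example.

```java
import java.util.ArrayList;

class ClassWildcardDemo {
    static class Animal {}
    static class Cat extends Animal {}

    // Reads from a covariant view of one ArrayList and writes to a contravariant view of another.
    static <T> void copyAll(ArrayList<? extends T> source, ArrayList<? super T> target) {
        for (T item : source) {
            target.add(item);
        }
    }

    static int demo() {
        ArrayList<Cat> cats = new ArrayList<>();
        cats.add(new Cat());
        cats.add(new Cat());
        ArrayList<Animal> animals = new ArrayList<>();
        copyAll(cats, animals);  // compiles: cats only produce, animals only consume
        return animals.size();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "2"
    }
}
```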

Scala

Consider the following example:

Scala
trait AnimalFarm[T]
{
  def produceAnimal(): T
}

class CatFarm extends AnimalFarm[Cat]{
  def produceAnimal(): Cat = new Cat()
}

As you would expect, the example below does not compile, because generic types are invariant by default in Scala:

Scala
val catFarm:AnimalFarm[Cat] = new CatFarm()
val animalFarm: AnimalFarm[Animal] = catFarm //Compile-time error
val animal: Animal = animalFarm.produceAnimal()

But you can make it covariant as follows:

Scala
trait AnimalFarm[+T]
{
  def produceAnimal(): T
}

class CatFarm extends AnimalFarm[Cat]{
  def produceAnimal(): Cat = new Cat()
}

As you can see, the only difference is the "+" sign, which indicates that the trait is covariant with respect to the type parameter T. Now you can use covariance:

Scala
val catFarm:AnimalFarm[Cat] = new CatFarm()
val animalFarm: AnimalFarm[Animal] = catFarm //OK
val animal: Animal = animalFarm.produceAnimal()

You can see that this is similar to the approach used in C#: you specify that a type is covariant at the definition site, not at the use site as in Java.

Similarly, you can make a trait (or class) contravariant using the "-" sign:

Scala
trait AnimalFarm[-T]
{
  def feedAnimal(animal: T): Unit
}

class AnimalFarmDefault extends AnimalFarm[Animal]
{
  def feedAnimal(animal: Animal): Unit = {
    //feed animal
  }
}

And use it as contravariant:

Scala
val animalFarm:AnimalFarm[Animal] = new AnimalFarmDefault()
val catFarm: AnimalFarm[Cat] = animalFarm //OK
catFarm.feedAnimal(new Cat())

All of this looks like variance in C#, except for two important points. First, we are not restricted to traits: we can apply the same rules to Scala's classes. Second, Scala provides more flexibility for managing generic constraints via lower and upper bounds.

Remember the "problematic" example we discussed in C#, where the generic type parameter appeared in both input and output positions. Let's reproduce a similar situation in Scala:

Scala
trait AnimalFarm[T]
{
  def produceAnimal(): T
  def feedAnimal(animal: T): Unit
}

class AnimalFarmDefault extends AnimalFarm[Animal]{
  def produceAnimal(): Animal = new Animal()
  def feedAnimal(animal: Animal): Unit = {
    //feed animal
  }
}

class CatFarm extends AnimalFarm[Cat]{
  def produceAnimal(): Cat = new Cat()
  def feedAnimal(animal: Cat): Unit = {
    //feed animal
  }
}

The AnimalFarm trait is invariant, just as a similar interface would be invariant in C#. And it can't be made, for example, covariant by simply adding a "+" sign before the type parameter. If we want to make the trait covariant, we still need to deal with the fact that the type parameter also appears in the input position of the feedAnimal method. In C# we would have been forced to give up on making the interface covariant.

But in Scala we can do the following:

Scala
trait AnimalFarm[+T]
{
  def produceAnimal(): T
  def feedAnimal[S >: T](animal: S): Unit
}

class AnimalFarmDefault extends AnimalFarm[Animal]{
  def produceAnimal(): Animal = new Animal()
  def feedAnimal[S >: Animal](animal: S): Unit = {
    //feed animal
  }
}

class CatFarm extends AnimalFarm[Cat]{
  def produceAnimal(): Cat = new Cat()
  def feedAnimal[S >: Cat](animal: S): Unit = {
    //feed animal
  }
}

And use it as covariant:

Scala
val catFarm:AnimalFarm[Cat] = new CatFarm()
val animalFarm: AnimalFarm[Animal] = catFarm //OK
val animal: Animal = animalFarm.produceAnimal()
animalFarm.feedAnimal(new Dog) //still OK!

The cool thing here is that you can even call feedAnimal on a covariant type without compromising type safety. Let's explore how it works using the following method as an example:

Scala
def feedAnimal[S >: Cat](animal: S): Unit = {
  //feed animal
}

The lower bound (e.g. [S >: Cat]) means that you can pass to the method an object of any type S that is a supertype of Cat; the relationship is reflexive, so Cat itself qualifies. If you pass a Cat object, then the common supertype of Cat and Cat is Cat itself, so S becomes Cat. If you pass an Animal object, then the common supertype of Animal and Cat is Animal, so S becomes Animal. If you pass a Dog object, then the common supertype of Dog and Cat is Animal again, so S becomes Animal. Thanks to this inference mechanism, the compiler can guarantee that type safety will never be compromised: inside the method the argument is known only as some supertype of Cat, so the farm can never mistake a Dog for a Cat.

A brief summary of generics variance:

As you have seen, C# and Scala take a similar approach to variance, specifying it at the definition site, though there are essential differences in constraint rules and other details. Java takes another approach, where variance is specified at the use site via wildcards.

Strictly speaking, Scala also has use-site variance via existential types, which cover the functionality of Java's wildcards and help with a few other interoperability problems. Even so, it's not Scala's idiomatic way of solving the variance challenge. For Scala's architects, the method of choice is definition-site variance. Still, you can find examples of variance with existential types in the attached archive and on GitHub.

Overriding: return type covariance

A language supports return type covariance if you can override a method from the base class (that returns a less-specific type) with a method in the derived class (that returns a more specific type).

C#

Consider the following example (does not compile in C#):

C#
class AnimalFarm
{
   public virtual Animal ProduceAnimal()
   {
      return new Animal();
   }
}

class CatFarm: AnimalFarm
{
    public override Cat ProduceAnimal() //compile-time error
    {
        return new Cat();
    }
}

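As you can see, return type covariance is not supported in C# at the time of writing. Moreover, it is not supported by the CLR itself, which made it look very unlikely that we would ever see this feature in C#. (Later versions changed this: C# 9.0 added covariant return types for overriding methods.)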

Java

Surprisingly (or not), a similar example works in Java:

Java
class AnimalFarm
{
   public Animal produceAnimal()
   {
       return new Animal();
   }
}

class CatFarm extends AnimalFarm
{
   @Override //OK
   public Cat produceAnimal()
    {
        return new Cat();
    }
}

Java has supported return type covariance since Java 5.0.
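The practical benefit shows up on the caller's side. In the sketch below, which reuses the hierarchy above, the wrapper class CovariantReturnDemo and the purr method are assumptions added only to show that no cast is needed.

```java
class CovariantReturnDemo {
    static class Animal {}
    static class Cat extends Animal {
        String purr() { return "purr"; }  // hypothetical Cat-only method
    }

    static class AnimalFarm {
        public Animal produceAnimal() { return new Animal(); }
    }

    static class CatFarm extends AnimalFarm {
        @Override
        public Cat produceAnimal() { return new Cat(); }  // narrower return type is allowed
    }

    // Through a CatFarm reference the static return type is Cat, so no cast is needed.
    static String viaDerivedReference() {
        CatFarm farm = new CatFarm();
        return farm.produceAnimal().purr();
    }

    // Through an AnimalFarm reference the static type is Animal, but the override still runs.
    static boolean viaBaseReference() {
        AnimalFarm farm = new CatFarm();
        Animal animal = farm.produceAnimal();
        return animal instanceof Cat;
    }

    public static void main(String[] args) {
        System.out.println(viaDerivedReference()); // prints "purr"
        System.out.println(viaBaseReference());    // prints "true"
    }
}
```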

Scala

In Scala it works as well:

Scala
class AnimalFarm
{
  def produceAnimal(): Animal = new Animal()
}

class CatFarm extends AnimalFarm
{
  override def produceAnimal(): Cat = new Cat() //OK
}

Not surprisingly, Scala supports return type covariance, because Scala is a JVM-based language compatible with Java.

Overriding: parameter type contravariance

A language supports parameter type contravariance if you can override a method from the base class (that has a parameter of more-specific type), with a method in the derived class (that has a parameter of less-specific type).

None of the three languages supports parameter type contravariance:

C#

C#
class AnimalFarm
{
   public virtual void FeedAnimal(Cat animal)
   {                    
   }
}

class CatFarm : AnimalFarm
{
   public override void FeedAnimal(Animal animal) //compile-time error
   {                    
   }
}

Java

Java
class AnimalFarm
{
   public void feedAnimal(Cat animal)
   {
   }
}

class CatFarm extends AnimalFarm
{
   @Override //compile-time error
   public void feedAnimal(Animal animal)
   {
   }
}

Scala

Scala
class AnimalFarm
{
  def feedAnimal(animal: Cat)={
  }
}

class CatFarm extends AnimalFarm
{
  override def feedAnimal(animal: Animal)= { //compile-time error
  }
}

But why is there no support? What's wrong with parameter type contravariance? It seems that no harm could be done by supporting it, and at first glance that is true. The problem is that adding it to a language would create a number of controversial situations: for example, how would you distinguish overloading from overriding? A return type can be made covariant because the return type is not considered during overload resolution, so there is no ambiguity. But method parameters are part of the method signature and are considered during overload resolution, so there would be potential ambiguity between overloading and overriding. There are other potential problems, but this is the most apparent one.
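The overloading issue can be made concrete in Java. If the @Override annotation is dropped from the earlier Java example, the code compiles, but the subclass method becomes a separate overload rather than an override. The sketch below illustrates this; the wrapper class OverloadVsOverrideDemo and the returned strings are assumptions added so the chosen method is visible.

```java
class OverloadVsOverrideDemo {
    static class Animal {}
    static class Cat extends Animal {}

    static class AnimalFarm {
        public String feedAnimal(Cat animal) { return "AnimalFarm.feedAnimal(Cat)"; }
    }

    static class CatFarm extends AnimalFarm {
        // No @Override: with a wider parameter type this is a new overload, not an override.
        public String feedAnimal(Animal animal) { return "CatFarm.feedAnimal(Animal)"; }
    }

    // Only feedAnimal(Cat) is visible through the base type, and it was never overridden.
    static String throughBaseReference() {
        AnimalFarm farm = new CatFarm();
        return farm.feedAnimal(new Cat());
    }

    // Both overloads are visible; the most specific one for a Cat argument is feedAnimal(Cat).
    static String throughDerivedReference() {
        CatFarm farm = new CatFarm();
        return farm.feedAnimal(new Cat());
    }

    // The overload is chosen by the static type of the argument, which is Animal here.
    static String withAnimalTypedArgument() {
        CatFarm farm = new CatFarm();
        Animal animal = new Cat();
        return farm.feedAnimal(animal);
    }

    public static void main(String[] args) {
        System.out.println(throughBaseReference());
        System.out.println(throughDerivedReference());
        System.out.println(withAnimalTypedArgument());
    }
}
```

If parameter type contravariance were supported, the language would have to decide whether feedAnimal(Animal) replaces feedAnimal(Cat) or lives next to it, and existing code that relies on the overload reading would change meaning.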

Conclusion

As you have seen, variance is an interesting and (sometimes) complicated thing. Language designers need to make a lot of compromises along the way: to enrich the language to meet modern requirements and, at the same time, to deal with big legacy codebases. And it's very interesting to compare the approaches that different languages use to solve similar problems. You can find all the examples in the attached archive and on GitHub. Thanks for your attention. Till next time!

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Igor Alekseev
Technical Lead at UBS
Russian Federation
