covariance | Mind-driven development

Hey guys. Sorry that I was not posting bigger stuff for a long time. I’ve had some personal time-eaters during the last months, but I hope it will get better soon. Today’s topic during this article is covariance and contravariance in general and especially in C#. What is it? Who needs it? How is it supported today? How will it be supported in C# 4.0? Answers have to be found…

So, covariance and contravariance are features of a programming language’s type system (not limited to C# at all). More precisely (and theoretically), they are characteristics of binary operators, which convert one type to another type. Generally if an operator changes the type in some kind, it’s called variant. If it retains the type (the type is fix), then it’s called invariant. Two types of variant operators are contravariant and covariant operators. In fact, there are other types of variance (consider dynamic typing, duck typing etc.), thus covariant/contravariant operators are just two examples for that. If an operator orders types in the way from general to more specific for any type, then it is called to be covariant. If it orders types reversely from specific to more general, then it’s contravariant.

So far the theoretical introduction. It’s pretty much saying nothing about what both terms mean in your practical coding life, isn’t it? Well… in programming languages there are two features where co- and contravariance can make sense: return type covariance and parameter type contravariance.

Return type covariance

Let’s consider the interface IClonable for a moment. It’s declared as:

public interface ICloneable
{
    object Clone();
}

If we want a class Foo to implement this interface, then it has to apply the exact same signature:

public class Foo : ICloneable
{
    public object Clone()
    {
        Foo clone = new Foo();
        // cloning implementation
        // ...

        return clone;
    }
}

Wouldn’t it be nice to have a method public Foo Clone() instead? That’s what return type covariance means. With this, the return type would be narrowed by more specific implementations or derivations. It would give type-safety to all callers of the method. Yes, this would be nice, indeed. But that’s a feature, which is currently not implemented in the C# language. The more general return type in the interface or a base class can not be narrowed on implementing or deriving classes. The same problem arises in a number of cases and on generics, as we will see later on.

Parameter type contravariance

If we got a method on an interface or base class which takes a parameter X, then contravariance is all about wanting a more general parameter in the implementation method of that interface or in a derived class. C# doesn’t support that, yet. For example, the following will not work:

class Fruit { }
class Apple : Fruit { }
class Cherry : Fruit { }
class FlavorComparer : IComparer<Fruit> { ... }

public void SortApples()
{
    List<Apple> apples = GetSomeApples();
    IComparer<Fruit> flavorComparer = new FlavorComparer();
    apples.Sort(flavorComparer);
}

FlavorComparer compares two fruits based on their flavor. Apples are fruits, so why shouldn’t they be sorted with use of the FlavorComparer? This would be nice, but Sort() needs a IComparer<Apple> and not the more general argument IComparer<Fruit>. It should be no problem for using the more general comparer, but C# doesn’t support this at the moment.

Covariance on arrays

Since the first C# (1.0), covariance on arrays is supported. Thus, one could write:

Fruit[] fruits = new Apple[10];
fruits[0] = new Cherry();

An array of a reference type can be viewed as an array of its base type or any implemented interface. That’s no problem for the compiler, it just works. But at runtime, that example would produce an ArrayTypeMismatchException and that’s what one would expect in this example. Creating apples and setting one to a cherry is no good idea.

Well, but why does it compile? That’s a historical problem. In the first version of C#, it has been designed to attract as many developers as possible and since this type of covariance worked in Java, the C# guys applied it to their language as well. Not that good idea, but that’s another story…

Covariance and contravariance on generics

Generics in C# 1.0 to 3.0 are invariant (fix). If we map the array example to generics, then the following code will not work:

List<Fruit> fruits = new List<Apple>();
fruits.Add(new Cherry());

In this case, there’s no exception thrown at runtime, but already the compiler comes up with an error message at the first line. Invariance on generics is a serious limitation of current C# versions, since it doesn’t allow many intuitively correct cases.

For example, if you have a method List<Fruit> GetFruits() on the Fruit class, which gets overridden by the Apple class, then you can’t return an instance of List<Apple>, which should be no problem intuitively. The following code on Apple is not working:

public List<Fruit> GetFruits()
{
    List<Apple> fruits = GetApples();
    return fruits;
}

The same problem arises, if we want to have a list of fruits, which should be a Union of apples and cherries, which are both fruits…

List<Fruit> fruits = new List<Fruit>();
List<Cherry> cherries = GetCherries();
List<Apple> apples = GetApples();

fruits = fruits.Union(cherries).Union(apples);

This or similar code doesn’t work as well.

C# 4.0

Having all that in mind, C# 1.0-3.0 is some kind of nerving limited in terms of covariance and contravariance. It has covariance on arrays of reference types and C# 2.0 introduced covariance and contravariance on delegates (which obviates some really annoying workarounds), but at the whole there should be better language support of those type-system features.

But a knight in shining armour is approaching: C# 4.0 arises on the horizon! But does it hold what it looks like? In fact, with C# 4.0 you have the in and out keywords, which you can define on type arguments of interfaces and delegates.

With the out keyword you can define a type parameter to be covariant. This means, that implementations of methods which return an object of this type can also return a more specific type. The standard example in this case is IEnumerable<out T>:

public interface IEnumerable<out T> : IEnumerable
{
    IEnumerator<T> GetEnumerator();
}

public interface IEnumerator<out T> : IDisposable, IEnumerator
{
    T Current { get; }
}

This allows you to write e.g.:

IEnumerable<object> list = new List<string>();

and

IEnumerable<Fruit> fruits = new List<Apple>();

It allows you to have a base type as type parameter of an IEnumerable instance and a more specific type if you assign something to it. But wait: there’s a limitation on that… out T means: no method in this interface is allowed to use T as parameter! The word „out“ makes this clear and it limits the use of those covariance to just a few interfaces. For example, this cannot be set on the ICollection<T> interface (and furthermore on IList<T>, because of IList<T> : ICollection<T>). ICollection<T> includes a method Add(T item) and since no T is allowed to be used when out T is declared, that wouldn’t work. Now, most of my own generic interfaces and many .NET interfaces as well use type parameters as in- and output, so this new type of covariance seems to be not this influencing. For example, the following code would not work:

IList<object> list = new List<string>();

But why? There’s an important aspect on that and that is the same as on the array example in the paragraph „Covariance on arrays“. Having the Add() method, one could add an instance of type object to the list and that has not to be allowed:

IList<object> list = new List<string>();
list.Add(new object());  // mustn't be allowed

Sure, that’s a big limitation, but it makes sense and is absolutely necessary!

The same comes true for the in keyword. One can define an interface with type parameter in T. With that, T can only be used as type parameter and not as return value. But this brings contravariance on the type parameters to life. Having the FlavorComparer example from above in mind? That will work, if IComparer<in T> is declared by the .NET framework. The limitation of T to be set as method parameter only is necessary due to the same aspect as for the out keyword. If IList<in T> should have been declared and one defines a variable IList<string> list and calls list.Add(new object()), then it would make no sense to iterate through all list elements as string, because the inserted object is no string and would cause problems.

So in the end, the knight’s armour is not that shiny. C# 4.0 comes with some nice covariance and contravariance features, but the out and in parameters are fairly limited (and using them together is (luckily) not allowed). Many people are annoyed about that, but I don’t share their opinion. It’s a good thing to have covariance and contravariance on type parameters separated from each other! Else this would produce heavy problems as we have seen in the examples above and in the array covariance section. My opinion is, that the language designers have done a good job here. I don’t want to have some kind of ArrayTypeMismatchException on generics and thus in/out on type parameters are a good thing.

Links:

Schlagwort: covariance

C# covariance and contravariance

Return type covariance

Parameter type contravariance

Covariance on arrays

Covariance and contravariance on generics

C# 4.0