Using Java Generics to express variance of Collections and Functions

Post about contra-variant and co-variant types in Java with Generics illustrated with Generic Collections and Functions

Dávid Csákvári

7 mins

I've recently came across Richard Warburton's presentation called Generics and Java's Evolution, where he explains how to use generic types in different scenarios and introduces some concepts that hopefully will be available in Java 10.

It's a great talk, and it inspired me to spend some time playing with generic types in Java, going down the rabbit hole of the co-variant and contra-variant parameterized types.

Lacking imagination, I use the classic Pet example for subtyping in the code snippets for this post. Dog and Cat are subtypes of Pet, so when an expression of Pet type is required, a Dog or a Cat can be used as well. Variance refers to subtyping of complex types created from simple types, such as:

is an array of Dogs a subtype of an array of Pets?
what about Collections?
how do similar functions relate to each other, that, for example take a Cat and a Pet as an argument?

Arrays

Because early versions of Java did not include generics, for practical reasons arrays were made to be co-variant.

This means if we can substitute A for B then we can substitute A[] for B[] as well.

This is good, because we can pass various types of arrays to the same method, like passing an array of Dogs to a method that expect an array of Pets of any kind.

Dog[] dogs = { new Dog() };
Pet[] pets = dogs;

If we don't add new elements to the pets array, there is no problem, but writing to it can be problematic in this case.

The code below compiles fine, but when executed it throws an ArrayStoreException because we are trying to add a Cat to an array of Dogs (in a line, where the type information says they are just Pets).

Dog[] dogs = { new Dog() };
Pet[] pets = dogs;
pets[0] = new Cat(); // ArrayStoreException

Throwing an Exception in this case is really good, compared to the other alternative, to let the problem show it's ugly face when we would read the original dogs array, and expect all elements to be a Dog.

for (int i = 0; i < dogs.length; i++) {
	// This would be a problem, as dogs[0] is not a Dog, but a Cat.
	Dog dog = dogs[i];
	System.out.println(dog.bark());
}

To summarize, arrays are co-variant, so writing to them can cause problems, and the Java type system can't prevent it from happening.

Generics

Generic types appeared in Java 5, and one of their most popular use-case is Generic Collections. Unlike simple arrays, the different parameterizations of generic types in Java are invariant by default. For example, a List<Pet> is a totally different thing than List<Dog>, they can not be substituted.

So, the code below does not compile:

List<Dog> dogs = new ArrayList<Dog>();
List<Pet> pet = dogs; // This line does not compile

The variance of the type parameter can be specified when using a generic type. (This is called use-site variance. Other languages, such as Scala support declaration-site variance that allows the creator of the generic type to specify the variance of it's type arguments.)

Co-variance

With '? extends' the type parameter can be marked to be co-variant. This can make the substitution work for Lists just as it would with simple arrays.

List<Dog> dogs = new ArrayList<Dog>();
List<? extends Pet> covariantList = dogs;

The difference is that it's co-variant nature is encoded in it's type.

It indicates that like in the case with arrays, we can read Pets from the List without a problem, because we know for sure that what we get is at least a Pet.

Pet pet = covariantList.get(0); // OK

But we don't know what other conditions must be fulfilled. The parameter indicates that the type of the List is unknown, and it's upper bound is Pet. We can't write any kind of Pet to this List to satisfy it's static type constraints, as this unknown type might be Dog, Cat, but it could be something more exotic, that we might not even heard of. Octopus, Shark, or a HairySpider.

covariantList.add(new Pet()); // This does not compile.
covariantList.add(new Dog()); // This does not compile.

Compared to simple arrays, using this co-variant List turns the runtime error encountered before to a compile time one.

Contra-variance

Similarly to co-variance, with '? super' we can declare a contra-variance.

List<Pet> pets = new ArrayList<Pet>();
List<? super Pet> contravariantList = pets;

A contravariantList is a list of something that is a superclass of Pet. As with covariance, the question mark represents that the exact type of something is unknown, Pet is it's lower bound is in this case. This means the type might be Pet, or any of it's ancestor types like Creatures, but it simply can be Object as well. So while we can read elements from the list, but the typechecker can not guarantee anything about their types.

Pet pet = contravariantList.get(0); // This does not compile.

But we can write new Pet entries to it, because we know that the unknown type guaranteed to be substitutable with Pet.

contravariantList.add(new Pet()); // OK
contravariantList.add(new Dog()); // OK
contravariantList.add(new Cat()); // OK

To summarize, it is safe to read from a co-variant List, but we can't write to it, while it is safe to write a contra-variant List, but reading items from it can be problematic.

The asymmetry is caused by the same type parameter (E) on the input and output types of the methods. It drives the minimum type requirements of the object to be passed into it as an argument, and the minimum guarantees about the object's type it returns.

E get(int index); // I give you at least an E, but you can expect less.
boolean add(E e); // I need at least an E, but you can give more.

Java 8 lambdas and method references

Consider the following short example that illustrates co-variant types from the previous section:

static void example() {
	Consumer<Pet> petConsumer = pet -> System.out.println(pet);
	Consumer<Object> objectConsumer = System.out::println;

	breed(petConsumer); // OK
	breed(objectConsumer);  // Compile error
}

static void breed(Consumer<Pet> consumer) {
	consumer.accept(new Pet());
	consumer.accept(new Dog());
}

Of course, because of breed's argument, Consumer<Pet> is invariant with the supplied Consumer<Object> parameter, there is a compile error.

Because the System.out.println method takes an Object, and returns nothing, at first I thought that the type of the System.out::println expression is equivalent to Consumer<Object>. However, if we pass it directly to the breed method, it compiles fine.

	breed(System.out::println); // OK

This is because Java threats the substitution of lambda expressions and method references differently from simple parameterized generic types. See chapters 15.13.2 Type of a Method Reference and 15.27.3 Type of a Lambda Expression from the The Java Language Specification Java SE 8 Edition, they have a lot in common.

For example, consider the snippet below:

public static EmailMessage transform (Message m) {return new EmailMessage();}

public static void methodReferenceExamples () {
	Function<Message, Message> a = CovarianceExample::transform;
	Function<Message, EmailMessage> b = CovarianceExample::transform;
	Function<EmailMessage, Message> c = CovarianceExample::transform;
	Function<EmailMessage, EmailMessage> d = CovarianceExample::transform;
}

Although for some of the assignments the referred method requires different types for the argument or the return type than the parameterization of corresponding Functional Interface indicate, all assignment compile fine. In every case the argument type parameter of the Functional Interface requires a Message, or a more specific EmailMessage, which is fine, because the transform method expect at least a Message. There is no harm giving something more specific to it.

The return types are aligned well too, the transform method provides an EmailMessage which is exactly what's needed, or even more specific.

Of course, providing less, or requiring more would result in a compile error.

public static Message transform (EmailMessage m) {return new Message();}

public static void methodReferenceExamples () {
	Function<Message, EmailMessage> iAmTooDemanding = CovarianceExample::transform; // Compile error. 
	Function<Message, Message> iAmNotGivingYouEnough = CovarianceExample::transform; // Compile error.
}

Java 8 Lambda expressions have similar behaviour if we don't specify the argument types, and let the compiler infer them for us.

	breed(s -> System.out.println(s)); // OK

So, a method or a function defined by a lambda expression is co-variant in its return value and contra-variant in its arguments. This convenience aids functional programming in Java very well.

Similar rule applies for method overloading, but the argument types in that case must match exactly, possibly to aid method resolution.