𝐖ell-𝚻yped 𝐑eflections

Motivation

Java is an opinionated programming language, in that it forces users to embrace Object Oriented Programming. Even after the introduction of lambdas in Java 8, it still suffers from a lack of abstractions commonly found in functional programming languages such as Scala and OCaml.

One of these abstractions is Algebraic Data Types – in particular tagged unions (also called sum types), which allow the definition of types that support several possible representations. Tagged unions are similar to Java enums, the only difference being that each of the cases can embed its own data.

Algebraic data types are usually associated with pattern-matching, another feature that programmers used to the functional style miss in Java. However, in this post I will only focus on describing a nice and simple trick that allows us to encode tagged unions in Java (working with versions of Java as old as Java 6 – perhaps even older), supporting exhaustiveness checks and meta cases, but without support for pattern matching.¹

Note

This post is not intended as a serious proposal, and is merely having fun exploring non-standard ways to leverage Java’s type system.

The functional way

In Scala, a good use case for algebraic data types would be the definition of a Command type that can be only one Move, Attack, Pause and Stop (for example, in the implementation of a video game):

	sealed trait Command
	object Command {
		case class Move(dest: Coord) extends Command
		case class Attack(target: Entity) extends Command
		case class Pause(time: Time) extends Command
		case class Stop extends Command
	}

sealed trait simply defines an interface that will only be implemented by classes defined in the same module, and object Common is just a way to package Command-related classes together.

In Scala, a case class is a normal Java class where Scala automatically provides implementations for toString, hashCode, equals, copy , apply (used for construction without new) and unapply (used for extraction in pattern-matching), based on their argument members.

As we can see, each alternative class of type Command contains its own data, irrelevant to the other classes. (Some data could also be shared in the Command trait/class.)

Then, we can separately define a function for executing commands, as follows:

	def execute(cmd: Command) = cmd match {
		case Move(dest) => ... dest ...
		case    ...     =>   ...
	}

The object-oriented way

In Java, as per the official method advertised on Oracle’s website, this would typically be done via open inheritance and overriding methods:

	public interface Command {

		public abstract void execute();

		public final static class Move implements Command {
			public final Coord dest;
			public Move(Coord dest) { this.dest = dest };
			@Override
			public void execute() { ... dest ... }
		}

		...
	}

Although this approach may be appropriate in many cases, there are also many cases when it is not, a tension that Phil Wadler refers to as the expression problem. We will see in the next section what is the precise problem in our example.

Problems

In our context, the set of existing commands is seldom goind to be modified, so the class hierarchy is pretty much closed, and it is not designed to be extended (which is explicit in the sealed trait Scala example).

The object-oriented approach suffers from a lack of flexibility with respect to usage of the Command hierarchy. Indeed, we may use commands in many different contexts that commands themselves don’t care about, or don’t even know.

Think of the case of a multiplayer game where clients create commands which are serialized and sent to the server to be executed. In this extreme case, commands created on the client cannot possibly implement the execute method, which requires access to resources only available on the server. In order work, the object-oriented approach would have to rely on more advanced design patterns like the visitor design pattern, which is significantly more clumsy to implement and inconveient to use than algebraic data types, as can be seen below:

	public interface Command {

	    public static interface Visitor {
	        void visit(Move c);
	        void visit(Attack c);
	        void visit(Pause c);
	        void visit(Stop c);
	    }

	    public abstract void accept(Visitor v);

	    public final static class Move implements Command {
	        public final Coord dest;
	        public Move(Coord dest) { this.dest = dest; };

	        @Override
	        public void accept(Visitor v) { v.visit(this); }
	    }

	    public final static class Attack implements Command {
	        public final Entity target;
	        public Attack(Entity target) { this.target = target; };

	        @Override
	        public void accept(Visitor v) { v.visit(this); }
	    }

	    public final static class Pause implements Command {
	        public final Time duration;
	        public Pause(Time duration) { this.duration = duration; };

	        @Override
	        public void accept(Visitor v) { v.visit(this); }
	    }

	    public final static class Stop implements Command {
	        @Override
	        public void accept(Visitor v) { v.visit(this); }
	    }
	}

Which is used this way:

	    public static void execute(Command cmd) {
	        cmd.accept(new Visitor() {
	            @Override
	            public void visit(Move c)
	            {  ... c.dest ... }

	            @Override
	            public void visit(Attack c)
	            {  ... c.target ... }

	            @Override
	            public void visit(Pause c)
	            { ... }

	            @Override
	            public void visit(Stop c)
	            { ... }
	        });
	    }

The trick in Java

The idea to encode tagged unions in Java comes from noticing that there is a strong similarity between exception handling and pattern matching (in fact, Scala and OCaml, for example, use their respective pattern matching syntaxes for catching exception).

In Java, any object extending Throwable can be thrown and caught, not only exceptions. So what if we made our type alternatives Throwable, and threw Command objects just for the purpose of instance-matching them?

	public abstract class Command extends Throwable {

	    public final static class Move extends Command {
	        public final Coord dest;
	        public Move(Coord dest) { this.dest = dest; };
	    }

	    public final static class Attack extends Command {
	        public final Entity target;
	        public Attack(Entity target) { this.target = target; };
	    }

	    public final static class Pause extends Command {
	        public final Time duration;
	        public Pause(Time duration) { this.duration = duration; };
	    }

	    public final static class Stop extends Command { }
	}

The definitions are more concise than the version above that uses visitors, but more importantly, we can now write the execute method with much less boilerplate:

	public void execute(Command cmd) {
		try { throw cmd; }                     // match the command
		catch (Move c)    { ... c.dest ... }   // matches and use Move commands
		catch (Attack c)  { ... c.target ... } // matches and use Attack commands
		catch (Command c) { ... }              // default case; will match the remaining commands
	}

Nicer syntax

In order to make the code more legible and less surprising, it is also possible to define a match method with no other purpose than throwing the object itself:

	public abstract class Command extends Throwable {
	    public abstract void match() throws Move, Attack, Pause, Stop;

	    public final static class Move extends Command {
	        public final Coord dest;
	        public Move(Coord dest) { this.dest = dest; };
	        @Override
	        public void match() throws Move { throw this; }
	    }

	    ...
	}

	public void execute(Command cmd) {
		try { cmd.match(); }
		catch ( Move c ) { ... c.dest ... }
		catch (  ...   ) { ... }
	}

Exhaustiveness

This approach isn’t less type-safe than the visitor pattern, because thanks to its mechanism of checked exceptions,² Java ensures exhaustiveness of our catch clauses. In the example above, if we omitted the default case catch (Command c), we would get compilation errors enjoining us to handle the Pause and Stop cases.

Meta cases

Using Java 7, it is even possible to merge case, much like it is done in functional pattern matching:

	public static boolean isAction(Command cmd) {
		try { cmd.match(); }
		catch (Move | Attack c) { return true; }
		catch (c) { } return false; // <- needs to be outside so the function has return statement
	}

Alternatively, we can still use inheritance hierarchies to define meta-cases (ie: cases covering different sub-cases). Modifying class Command:

	public static abstract class Action extends Command {
		public abstract void match() throws Move, Attack;

		public final boolean isViolent()
		{ try { cmd.match(); } catch (Attack c) { return true;  }
		                       catch (Move c) { } return false; }

	}
	public final static class Move extends Action { ... }
	public final static class Attack extends Action { ... }

	public final static class Pause extends Command { ... }
	...

A usage example:

	void warnOnViolentAction(Command cmd) {
		try { cmd.match(); }
		catch (Action c) { if (c.isViolent) System.out.println("Careful!"); }
		catch (Command c) { System.out.println("We're safe... not even an action!"); }
	}

We also get uselessness-checks for free, for example in:

	try { ((Action) new Move(new Coord())).match(); } catch (...) { ... }
	// ^ Java will ensure we don't try to catch anything else than an Action

Limitations

Genericity

Genericity is a cornerstone of typed functional languages, where the type system plays a central role in ensuring safety of the programs.

Unfortunately, Java has an ad-hoc restriction that prevents class Throwable from being extended by any generic class. It means that there is no type-safe way to define Option<T> in Java (an Option is a tagged union that is either None – it contains nothing; or Some(v) – it contains one value v of type T).

Note that the implementation problem motivating this restriction is more complicated than it may seem, because of type erasure happening when Java code is compiled: without any additional information, the JVM would be unable to differentiate between catching an Option<Integer> or an Option<Boolean>, which would result in unsafety.

In fact, this is exactly one of the biggest problems in the current implementation of Scala, where one can only get around it using cumbersome type system hacks.

Workaround: throwing tokens (witnesses)

It is possible to mitigate the restriction by giving up some type safety: instead of throwing the Option<T> directly, we can throw a special “token” class identifying the current alternative. However, these tokens should not be kept around after the scope exits, or it will result in usafety. Similarly, there is a risk that tokens be mixed in nested pattern matchings.

Here is how we could define Option<T> (assuming Java 8 for the map function):

	package common;
	import java.util.function.Function;

	public abstract class Option<T> {
	    private Option(){} // prevents others from extending Option with more alternatives

	    public static class NoneToken extends Throwable { private NoneToken(){} }
	    public static class SomeToken extends Throwable { private SomeToken(){} }

	    public void match() throws NoneToken, SomeToken
	    { if (this instanceof None) throw new NoneToken(); else throw new SomeToken(); }

	    public T get(SomeToken t) { return ((Some<T>)this).value; }

	    public<U> Option<U> map(Function<T, U> f) {
	        try { match(); }
	        catch (SomeToken t) { return some(f.apply(get(t))); }
	        catch (NoneToken t) { }
	        return none;
	    }

	    public static class Some<T> extends Option<T> {
	        public final T value;
	        public Some(T value) { this.value = value; }
	    }
	    public static class None extends Option { }

	    public final static None none = new None();
	    public static<T> Some<T> some(T value) { return new Some(value); }
	}

Notice that this workaround is essentially a visitor pattern in disguise, where the dynamic dispatch part is implemented by exception handling instead of virtual dispatch, and the token serves the purpose of statically selecting the right visitor method overload (here, there is only get – but there could be more).

Following is a usage example:

	import common.Option;
	import static common.Option.*;
	import static java.lang.System.out;

	public class Main {
	    public static void main(String[] args) {
	        Option<Integer> opti = some(42);

	        foo(opti);
	        foo(some(666));
	        foo(none);

	        foo(opti.map(x -> x+1));
	        foo(opti.map(x -> x.toString().length()));
	    }

	    static<T> void foo(Option<T> opt) {
	        try { opt.match(); }
	        catch (SomeToken t) { out.println(opt.get(t)); }
	        catch (NoneToken t) { out.println("???"); }
	    }
	}

Performance

Although I have not tested the performances of this pattern, I expect that it should be significantly slower than the object-oriented pattern based on inheritance. An important optimization would be to avoid reconstructing Throwable objects, so that they need not store a new stack trace every time they are created (one of the reasons why exceptions are slow). This is trivial in Option<T>, where the Throwables are tokens, but seems more difficult to achieve for cases like Command.

Therefore, this pattern is obviously not appropriate in the hot paths and tight loops of a program. But in the case of a Command type for a multiplayer video game, where the hot paths are likely not in the processing of the few commands received from clients, it would seem perfectly reasonable.

Conclusion

Because of its expectedly poor performances, and because of restrictions with genericity, this (anti?)pattern is mostly a toy idea. However, for cases like Command, I still find it well-suited, and it almost gives a rudimentary functional feel to pre-lambda Java code.

Of course, because it is totally non-standard, it is not intended to be used in any serious project without careful documentation and the approval of other team members.

Since the introduction of lambda-expressions in Java 8, it has become possible to encode some forms of pattern matching, such as in this blog post ↩
who whould have believed anyone would ever thank checked expressions? ↩

Programming (8)

Well-Typed Reflections

Recent Posts

Links

Algebraic Data Types in Java 6