Motivation
Java is an opinionated programming language, in that it forces users to embrace Object Oriented Programming. Even after the introduction of lambdas in Java 8, it still suffers from a lack of abstractions commonly found in functional programming languages such as Scala and OCaml.
One of these abstractions is Algebraic Data Types â in particular tagged unions (also called sum types), which allow the definition of types that support several possible representations. Tagged unions are similar to Java enums, the only difference being that each of the cases can embed its own data.
Algebraic data types are usually associated with pattern-matching, another feature that programmers used to the functional style miss in Java. However, in this post I will only focus on describing a nice and simple trick that allows us to encode tagged unions in Java (working with versions of Java as old as Java 6 â perhaps even older), supporting exhaustiveness checks and meta cases, but without support for pattern matching.1
Note
This post is not intended as a serious proposal, and is merely having fun exploring non-standard ways to leverage Javaâs type system.
The functional way
In Scala, a good use case for algebraic data types would be the definition of a Command
type that can be only one Move
, Attack
, Pause
and Stop
(for example, in the implementation of a video game):
sealed trait Command
object Command {
case class Move(dest: Coord) extends Command
case class Attack(target: Entity) extends Command
case class Pause(time: Time) extends Command
case class Stop extends Command
}
sealed trait
simply defines an interface that will only be implemented by classes defined in the same module, and object Common
is just a way to package Command
-related classes together.
In Scala, a case class
is a normal Java class where Scala automatically provides implementations for toString
, hashCode
, equals
, copy
, apply
(used for construction without new
) and unapply
(used for extraction in pattern-matching), based on their argument members.
As we can see, each alternative class of type Command
contains its own data, irrelevant to the other classes. (Some data could also be shared in the Command
trait/class.)
Then, we can separately define a function for executing commands, as follows:
def execute(cmd: Command) = cmd match {
case Move(dest) => ... dest ...
case ... => ...
}
The object-oriented way
In Java, as per the official method advertised on Oracleâs website, this would typically be done via open inheritance and overriding methods:
public interface Command {
public abstract void execute();
public final static class Move implements Command {
public final Coord dest;
public Move(Coord dest) { this.dest = dest };
@Override
public void execute() { ... dest ... }
}
...
}
Although this approach may be appropriate in many cases, there are also many cases when it is not, a tension that Phil Wadler refers to as the expression problem. We will see in the next section what is the precise problem in our example.
Problems
In our context, the set of existing commands is seldom goind to be modified, so the class hierarchy is pretty much closed, and it is not designed to be extended (which is explicit in the sealed trait
Scala example).
The object-oriented approach suffers from a lack of flexibility with respect to usage of the Command
hierarchy. Indeed, we may use commands in many different contexts that commands themselves donât care about, or donât even know.
Think of the case of a multiplayer game where clients create commands which are serialized and sent to the server to be executed. In this extreme case, commands created on the client cannot possibly implement the execute
method, which requires access to resources only available on the server. In order work, the object-oriented approach would have to rely on more advanced design patterns like the visitor design pattern, which is significantly more clumsy to implement and inconveient to use than algebraic data types, as can be seen below:
public interface Command {
public static interface Visitor {
void visit(Move c);
void visit(Attack c);
void visit(Pause c);
void visit(Stop c);
}
public abstract void accept(Visitor v);
public final static class Move implements Command {
public final Coord dest;
public Move(Coord dest) { this.dest = dest; };
@Override
public void accept(Visitor v) { v.visit(this); }
}
public final static class Attack implements Command {
public final Entity target;
public Attack(Entity target) { this.target = target; };
@Override
public void accept(Visitor v) { v.visit(this); }
}
public final static class Pause implements Command {
public final Time duration;
public Pause(Time duration) { this.duration = duration; };
@Override
public void accept(Visitor v) { v.visit(this); }
}
public final static class Stop implements Command {
@Override
public void accept(Visitor v) { v.visit(this); }
}
}
Which is used this way:
public static void execute(Command cmd) {
cmd.accept(new Visitor() {
@Override
public void visit(Move c)
{ ... c.dest ... }
@Override
public void visit(Attack c)
{ ... c.target ... }
@Override
public void visit(Pause c)
{ ... }
@Override
public void visit(Stop c)
{ ... }
});
}
The trick in Java
The idea to encode tagged unions in Java comes from noticing that there is a strong similarity between exception handling and pattern matching (in fact, Scala and OCaml, for example, use their respective pattern matching syntaxes for catching exception).
In Java, any object extending Throwable
can be thrown and caught, not only exceptions. So what if we made our type alternatives Throwable
, and threw Command
objects just for the purpose of instance-matching them?
public abstract class Command extends Throwable {
public final static class Move extends Command {
public final Coord dest;
public Move(Coord dest) { this.dest = dest; };
}
public final static class Attack extends Command {
public final Entity target;
public Attack(Entity target) { this.target = target; };
}
public final static class Pause extends Command {
public final Time duration;
public Pause(Time duration) { this.duration = duration; };
}
public final static class Stop extends Command { }
}
The definitions are more concise than the version above that uses visitors, but more importantly, we can now write the execute
method with much less boilerplate:
public void execute(Command cmd) {
try { throw cmd; } // match the command
catch (Move c) { ... c.dest ... } // matches and use Move commands
catch (Attack c) { ... c.target ... } // matches and use Attack commands
catch (Command c) { ... } // default case; will match the remaining commands
}
Nicer syntax
In order to make the code more legible and less surprising, it is also possible to define a match
method with no other purpose than throwing the object itself:
public abstract class Command extends Throwable {
public abstract void match() throws Move, Attack, Pause, Stop;
public final static class Move extends Command {
public final Coord dest;
public Move(Coord dest) { this.dest = dest; };
@Override
public void match() throws Move { throw this; }
}
...
}
public void execute(Command cmd) {
try { cmd.match(); }
catch ( Move c ) { ... c.dest ... }
catch ( ... ) { ... }
}
Exhaustiveness
This approach isnât less type-safe than the visitor pattern, because thanks to its mechanism of checked exceptions,2 Java ensures exhaustiveness of our catch clauses. In the example above, if we omitted the default case catch (Command c)
, we would get compilation errors enjoining us to handle the Pause
and Stop
cases.
Meta cases
Using Java 7, it is even possible to merge case, much like it is done in functional pattern matching:
public static boolean isAction(Command cmd) {
try { cmd.match(); }
catch (Move | Attack c) { return true; }
catch (c) { } return false; // <- needs to be outside so the function has return statement
}
Alternatively, we can still use inheritance hierarchies to define meta-cases (ie: cases covering different sub-cases). Modifying class Command
:
public static abstract class Action extends Command {
public abstract void match() throws Move, Attack;
public final boolean isViolent()
{ try { cmd.match(); } catch (Attack c) { return true; }
catch (Move c) { } return false; }
}
public final static class Move extends Action { ... }
public final static class Attack extends Action { ... }
public final static class Pause extends Command { ... }
...
A usage example:
void warnOnViolentAction(Command cmd) {
try { cmd.match(); }
catch (Action c) { if (c.isViolent) System.out.println("Careful!"); }
catch (Command c) { System.out.println("We're safe... not even an action!"); }
}
We also get uselessness-checks for free, for example in:
try { ((Action) new Move(new Coord())).match(); } catch (...) { ... }
// ^ Java will ensure we don't try to catch anything else than an Action
Limitations
Genericity
Genericity is a cornerstone of typed functional languages, where the type system plays a central role in ensuring safety of the programs.
Unfortunately, Java has an ad-hoc restriction that prevents class Throwable
from being extended by any generic class. It means that there is no type-safe way to define Option<T>
in Java (an Option is a tagged union that is either None
â it contains nothing; or Some(v)
â it contains one value v
of type T
).
Note that the implementation problem motivating this restriction is more complicated than it may seem, because of type erasure happening when Java code is compiled: without any additional information, the JVM would be unable to differentiate between catching an Option<Integer>
or an Option<Boolean>
, which would result in unsafety.
In fact, this is exactly one of the biggest problems in the current implementation of Scala, where one can only get around it using cumbersome type system hacks.
Workaround: throwing tokens (witnesses)
It is possible to mitigate the restriction by giving up some type safety: instead of throwing the Option<T>
directly, we can throw a special âtokenâ class identifying the current alternative. However, these tokens should not be kept around after the scope exits, or it will result in usafety. Similarly, there is a risk that tokens be mixed in nested pattern matchings.
Here is how we could define Option<T>
(assuming Java 8 for the map
function):
package common;
import java.util.function.Function;
public abstract class Option<T> {
private Option(){} // prevents others from extending Option with more alternatives
public static class NoneToken extends Throwable { private NoneToken(){} }
public static class SomeToken extends Throwable { private SomeToken(){} }
public void match() throws NoneToken, SomeToken
{ if (this instanceof None) throw new NoneToken(); else throw new SomeToken(); }
public T get(SomeToken t) { return ((Some<T>)this).value; }
public<U> Option<U> map(Function<T, U> f) {
try { match(); }
catch (SomeToken t) { return some(f.apply(get(t))); }
catch (NoneToken t) { }
return none;
}
public static class Some<T> extends Option<T> {
public final T value;
public Some(T value) { this.value = value; }
}
public static class None extends Option { }
public final static None none = new None();
public static<T> Some<T> some(T value) { return new Some(value); }
}
Notice that this workaround is essentially a visitor pattern in disguise, where the dynamic dispatch part is implemented by exception handling instead of virtual dispatch, and the token serves the purpose of statically selecting the right visitor method overload (here, there is only get
â but there could be more).
Following is a usage example:
import common.Option;
import static common.Option.*;
import static java.lang.System.out;
public class Main {
public static void main(String[] args) {
Option<Integer> opti = some(42);
foo(opti);
foo(some(666));
foo(none);
foo(opti.map(x -> x+1));
foo(opti.map(x -> x.toString().length()));
}
static<T> void foo(Option<T> opt) {
try { opt.match(); }
catch (SomeToken t) { out.println(opt.get(t)); }
catch (NoneToken t) { out.println("???"); }
}
}
Performance
Although I have not tested the performances of this pattern, I expect that it should be significantly slower than the object-oriented pattern based on inheritance. An important optimization would be to avoid reconstructing Throwable
objects, so that they need not store a new stack trace every time they are created (one of the reasons why exceptions are slow). This is trivial in Option<T>
, where the Throwable
s are tokens, but seems more difficult to achieve for cases like Command
.
Therefore, this pattern is obviously not appropriate in the hot paths and tight loops of a program. But in the case of a Command
type for a multiplayer video game, where the hot paths are likely not in the processing of the few commands received from clients, it would seem perfectly reasonable.
Conclusion
Because of its expectedly poor performances, and because of restrictions with genericity, this (anti?)pattern is mostly a toy idea. However, for cases like Command
, I still find it well-suited, and it almost gives a rudimentary functional feel to pre-lambda Java code.
Of course, because it is totally non-standard, it is not intended to be used in any serious project without careful documentation and the approval of other team members.
-
Since the introduction of lambda-expressions in Java 8, it has become possible to encode some forms of pattern matching, such as in this blog post ↩
-
who whould have believed anyone would ever thank checked expressions? ↩