March 12, 2014

# What is Expression Problem and Why Should We Care?

Expression problem is about open extension:

The Expression Problem is a new name for an old problem. The goal is to define a datatype by cases, where one can add new cases to the datatype and new functions over the datatype, without recompiling existing code, and while retaining static type safety (e.g., no casts).

Let’s break it down:

• Define a datatype by cases: A data type might have one case, such as an integer, but for now we are only interested in data types with multiple cases. Each case is a specialized instance of the same data type. For example Animal can be our data type and Cat & Dog can be its cases. A data type would be only useful if we could do something with it, so it is implied that there are functions over this data type.
• Add new cases: When a new case is added it needs to be compatible with all the existing functions defined over the data type.
• Add new functions: When a new function is added it must be applicable to all existing cases.
• Without recompiling: If we were allowed to go back and change existing code it wouldn’t be a challenging problem.
• Retain static type safety: I will interpret this as no monkeypatching. There seems to be some ambiguity in the original definition anyway[1].

## Shape Language

Let’s define a small language to demonstrate the expression problem. I will be using Clojure in the examples. Don’t worry if you don’t know Clojure very well. I will intentionally avoid using advanced language features, it should be fairly easy to read if you know a little bit about LISP notation. Our language deals with shapes:

```(defn square [x] (* x x))

(area (make-circle 1)) ; => 3.141592...

(area (make-square 2)) ; => 4
```

Our data type, le’t call it shape, have 2 cases, it can be a circle or a square. And we have an area function defined over shape. The question is; how would you implement make-circle, make-square & area and keep the property of open extension?

## Procedural Data Abstractions

In an object oriented language like Python or Java, our data type would become an interface and each case would be classes implementing this interface. Each class would then contain their versions of each function defined over the data type.

```(defn make-circle [radius]
(fn [observation]
(case observation

(defn make-square [edge]
(fn [observation]
(case observation
:edge edge
:area (square edge))))

(defn area [shape] (shape :area))
```

Here make-circle & make-square are functions that return functions. In other words shapes are represented as functions. This shouldn’t be important to the calling code, they should be indifferent to the internal representations of the data type. That’s the whole point of having an abstraction.

area accepts a shape, which is a function, and calls it with the parameter :area, within the implementation of shape this parameter is matched to find and invoke the actual implementation of area function for that case.

```(area (make-circle 1)) ; => 3.141592...

(area (make-square 2)) ; => 4
```

We can add a new case like this:

```(defn make-rectangle [a b]
(fn [observation]
(case observation
:a a
:b b
:area (* a b))))

(area (make-rectangle 0.5 2)) ; => 4.0
```

Calling area with a new case is no problem because the actual implementation is closed inside the function returned by shape constructors. All area needs to do is to call this closure with the :area parameter.

```(defn perimeter [shape] (shape :perimeter))

(perimeter (make-circle 1)) ; => NullPointerException

(perimeter (make-square 2)) ; => NullPointerException

(perimeter (make-rectangle 0.5 2)) ; => NullPointerException
```

Defining a new function over our data type is not possible with this pattern. In order to make it work we must go back and change each constructor, this would break without recompiling requirement.

Another drawback of this approach is it’s not easy to define functions that deal with more than one instance of the data type, of potentially different cases[2].

## Abstract Data Types

The language we have defined is in fact an abstract data type with two constructors and a function. In the previous section we have implemented this data type using procedural abstraction. A more straightforward approach would be to have the constructors close over the data only and move the computations into the observations:

```(defn make-circle [radius] {:type :circle :radius radius})

(defn make-square [edge] {:type :square :edge edge})

(defn area [shape]
(case (:type shape)
:circle (* Math/PI (square (:radius shape)))
:square (square (:edge shape))))
```

Here make-circle & make-square return associative arrays that contain only the data and under a special key :type what kind of shape is being created. This representation is not important for the callers, the language works exactly like the procedural version:

```(area (make-circle 1)) ; => 3.141592...

(area (make-square 2)) ; => 4
```

Bad news is; if we add a new constructor, it won’t be compatible with existing functions:

```(defn make-rectangle [a b] {:type :rectangle :a a :b b})

(area (make-rectangle 0.5 2)) ; => IllegalArgumentException: No matching clause: :rectangle
```

But this design lets us to define new functions:

```(defn perimeter [shape]
(case (:type shape)
:circle (* 2 Math/PI (:radius shape))
:square (* 4 (:edge shape))))

(perimeter (make-circle 1)) ; => 6.283185...

(perimeter (make-square 2)) ; => 8
```

In case it’s not obvious the reason we can extend cases/functions is not because the actual computations are contained within cases/functions. We can extend cases/functions because the other dimension (functions/cases) has a consistent interface. To put it differently the reason why we can’t add new cases/functions is because functions/cases are tightly coupled with actual operations. We’ll come back to this in the next post.

## Augment, Supplement & Enhance

Does your programming environment provide tools to overcome the limitations highlighted above? If it does, how smooth is it to add extensions? The significance of this is much more than getting a nerdy pleasure from a cool langulage feature. It’s about scaling the capabilities of the code with respect to the code size. It’s also about efficient code re-use.

It may not be immediately apparent from the particular data type I have chosen for the examples; but adding a case or a function can have an exponential effect on the capabilities of the data type. In some cases one case can be converted to another, conversion from a square to a rectangle for example. Or functions can have arities greater than one. Perhaps an intersection function.

What if you are using 3rd party code that you are not allowed to modify? Without open extension (in either dimension) you can only wrap existing functionality. Sometimes it is enough, but it can become a hassle. What if you are developing a library and want to provide the best value for its consumers? You can’t anticipate every possibility. You can’t know exactly which specific cases the caller will use. Even if you did, there would be conflicts. Your best bet is to provide a set of common cases and functions that does the heavy lifting. And leave the rest for the consumer to extend.

Even if all the code is in one repository you might still want to group extensions separately. If particular cases/functions are meaningful in certain parts of the codebase, it might make sense not adding them to the generic set of cases/functions. This is not a rule but something to consider.

I hope this post has managed to arouse some interest in the expression problem. You can run the code samples in a Clojure REPL. If you have Leiningen installed, just type lein repl in a shell. Feel free to shoot me an for any questions or comments.

In the next post we will build a system that allows adding new cases as well as new functions over the data type.