The Y combinator is a central concept in lambda calculus, which is the formal foundation of functional languages. Y allows one to define recursive functions without using self-referential definitions.

Most articles I’ve seen dedicated to explaining the Y combinator start by showing you *the Y combinator*, which in itself is fairly inscrutable, and then trying to explain why it works.

I think this is backwards. In this article, we’ll go the other way around: I’ll start by describing, in *simple* terms, ** the essence of the Y combinator** — or how to make recursion work without recursive definitions, and from there I will derive the usual Y combinator.

Since we’ll be using Python, I’m actually going to describe a slight variation of the Y combinator, called the Z combinator. That’s because Y proper only work in languages with non-strict evaluation (like Haskell, which uses lazy evaluation).

#### Background

I gave a lecture on recursion and term equivalence in the lambda calculus for the Foundations of Software course at EPFL. This article recounts what I came up with to explain the Z combinator in simple, intuitive terms. It is inspired by the excellent “The Why of Y” writeup by Richard P. Gabriel, available here. I make essentially the same argument here, but slightly simplified and in Python (instead of Scheme).

### Recursive definitions in Python

Here is a typical recursive definition, the factorial function:

```
def fact(x):
if x == 0: return 1
else: return fact(x - 1) * x
```

To see how it works, consider that `fact(3)`

computes `fact(2) * 3`

which is `fact(1) * 2 * 3 == fact(0) * 1 * 2 * 3 == 1 * 1 * 2 * 3 == 6`

.

For brevity, we’ll write it in a style that’s a bit more functional:

```
def fact(x):
return 1 if x == 0 else fact(x - 1) * x
```

And for regularity with the rest of this article, let us rewrite that definition using Python’s syntax for function literals (lambda expressions):

```
fact = (lambda x:
1 if x == 0 else fact(x - 1) * x)
```

which works similarly as the `def`

above:

```
print(fact(3)) # prints 6
```

#### Problem

Lambda calculus, which is the core of functional languages, only supports function literals and function application. So the definition above is not a valid lambda term — indeed, it involves a recursive binding: the definition of `fact`

refers to itself!

Note that non-recursive (immutable) local variable bindings are fine, on the other hand: a program of the form `x = value; ... x ...`

, where `value`

does not refer to `x`

, can always be reformulated as `(lambda x: ... x ...)(value)`

, a proper lambda term.

### Recursion without direct self reference

Our goal is to rewrite a given recursive function into a “plain” function literal — that is, without using a recursive binding. But how is that possible? After all, for the function to be able to call itself recursively, it will *need* to refer to itself one way or the other…

The trick is to make the self reference necessary for recursion *indirect*. Indeed, it out turns to be possible for a non-recursive function `f`

to refer to itself simply by passing it… `f`

(itself) in argument!

For instance, consider the function `f`

below, which prints `"in f"`

before calling its argument function `g`

, passing it a dummy lambda:

```
def f(g):
print("in f")
g(lambda x: None)
```

The expression `f(lambda h: print "in g")`

simply prints `"in f"`

as expected and then calls the lambda we passed as `g`

, which in turn prints `"in g"`

.

Now, let us consider what `f(f)`

does. Following the same logic as above, this code prints `"in f"`

*twice*… in effect, **we have made f call itself** “under the guise” of a parameter

`g`

.#### Making recursion happen

Now, consider the following variation, which calls its parameter function `g`

with `g`

itself as the argument, and returns the result wrapped in a parameterless lambda (to prevent potential infinite loops):

```
def f(g):
print("in f")
return lambda: g(g)
```

Calling `f(f)`

prints `"in f"`

and then returns a parameterless lambda which, when applied… calls `f(f)`

again! Therefore, calling `f(f)()`

prints `"in f"`

twice, and calling `f(f)()()()`

prints `"in f"`

four times.

Essentially, the result of `f(f)`

corresponds to the following explicitly-recursive function:

```
def ff():
print("in f")
return lambda: ff()
```

So we’ve seen how to make a non-recursive function behave as if it was recursive, by having it take a parameter that represents its self reference (to be filled in later) and applying that parameter to itself…

*That is the simple essence of the Y combinator!*

*That is the simple essence of the Y combinator!*

There is not much more than this “trick” to the essence of the Y (or Z) combinator. I argue that the rest is really just plumbing, to abstract the pattern above and make it nicer to use. This plumbing eventually yields the well-known (but inscrutable) `Z`

combinator, as we shall see next.

But first, let us exercise the trick we’ve just devised to redefine our recursive `fact`

function… without using a recursive binding.

### The factorial function in lambda calculus

Armed with this insight, we can rewrite `fact`

using a non-recursive function as follows.

First, we define the structure of the recursive definition:

```
fact_base = (lambda rec, x:
1 if x == 0 else rec(rec, x - 1) * x)
```

This non-recursive `fact_base`

definition takes two arguments: in addition to the input number `x`

, it takes a `rec`

function representing a future recursive reference to `fact_base`

itself.

Next, we “tie the recursive knot” by passing `fact_base`

to itself, and abstract that detail into a self-contained `fact`

function:

```
fact = lambda x: fact_base(fact_base, x)
```

For instance, consider what the expression `fact(3)`

expands to. The call is equivalent to `fact_base(fact_base, 3)`

, which expands to `1 if 3 == 0 else fact_base(fact_base, 3 - 1) * 3`

, which reduces to `fact_base(fact_base, 2) * 3`

, etc. until it reduces to `1 * 1 * 2 * 3 == 6`

.

#### A simplification: currying

Functions with several parameters can be encoded in the lambda calculus via currying, whereby a function taking two parameters is turned into a function that takes the first parameter and returns a function that takes the second parameter.

It is instructive to see how currying simplifies the definition of `fact`

:

```
# assuming the curried:
fact_base = (lambda rec: lambda x:
1 if x == 0 else rec(rec)(x - 1) * x)
# we now just write:
fact = fact_base(fact_base)
```

Mission accomplished! This is a valid lambda term, which indirectly implements recursion to compute the factorial of a number.

Are we done? Well, not quite. It is still pretty inconvenient to write recursive functions in this style, so we’ll see next how to make it nicer.

### Abstracting the outer self application

Writing recursive functions in the style we saw above gets old very quickly. We’d really like to abstract the boilerplate away into easy-to-use helper functions (combinators).

First, let us define a function that performs the outer self application (i.e., `fact_base(fact_base)`

) for us, and let us call that function `mkrec`

, as it will be used to *make* *rec*-ursive functions:

```
mkrec = lambda f: f(f)
fact = mkrec(lambda rec: lambda x:
1 if x == 0 else rec(rec)(x - 1) * x)
```

The definition of `fact`

becomes more straightforward (it can now be defined in one go), but recursive calls still look annoyingly noisy: they have to repeat the `rec`

argument, as in `rec(rec)(x - 1)`

.

### Abstracting the inner self application

We’d prefer to simply write `rec(x - 1)`

instead of `rec(rec)(x - 1)`

in the definition of `fact`

itself. Let us make a function for that, and call it `mkrec_nice`

.

All `mkrec_nice`

has to do is to take a function that uses its `rec`

parameter in the desired style (as in `rec(x - 1)`

) and adapt it to use the “official” style under the hood (as in `rec(rec)(x - 1)`

):

```
mkrec_nice = (lambda g:
mkrec(lambda rec:
g(lambda y: rec(rec)(y))))
fact = mkrec_nice(lambda rec: lambda x:
1 if x == 0 else rec(x - 1) * x)
```

To understand why this works, consider what the call to `mkrec_nice`

expands into:

```
# given:
g = (lambda rec_nice: lambda x:
1 if x == 0 else rec_nice(x - 1) * x)
# we have:
fact = mkrec_nice(g)
# i.e.,
fact = mkrec(lambda rec:
g(lambda y: rec(rec)(y)))
# inlining the call to g, where we substitute mkrec_nice with `lambda y: rec(rec)(y)`:
fact = mkrec(lambda rec:
(lambda x:
1 if x == 0 else (lambda y: rec(rec)(y))(x - 1) * x))
# reducing the corresponding applied lambda, this gives us the original:
fact = mkrec(lambda rec: lambda x:
1 if x == 0 else rec(rec)(x - 1) * x)
```

Now, we’re done! We have created a `mkrec_nice`

combinator that lets us define recursive functions easily, with minimal boilerplate.

### The Z combinator

Plot twist: `mkrec_nice`

is actually “the `Z`

combinator”! But to view it in its traditional form, we first need to perform some obfuscation…

We’ll start by inlining the definition of `mkrec`

inside that of `mkrec_nice`

, which yields:

```
mkrec_nice = (lambda g:
(lambda f: f(f))((lambda rec: g(lambda y: rec(rec)(y)))))
```

Then, we’ll reduce this inlined function — which amounts to substituting `f`

in `f(f)`

with `(lambda rec: g(lambda y: rec(rec)(y)))`

, giving us:

```
mkrec_nice = (lambda g:
( lambda rec: g(lambda y: rec(rec)(y)) )
( lambda rec: g(lambda y: rec(rec)(y)) ))
```

This version of `mkrec_nice`

correspond directly to the traditional inscrutable presentation of `Z`

, which is written in lambda calculus as:

Z = λg. (λr. g (λy. r r y)) (λr. g (λy. r r y))

### The real Y combinator

The real Y combinator is actually a little simpler than the Z combinator that we say above. This is because in non-strict evaluation strategies such as lazy evaluation, not only functions can be recursive, but also plain values. So we can define a combinator that directly creates recursive values, without having to explicitly handle function calls and argument-passing.

More specifically, in Python syntax, we’d use the following definition instead of `mkrec_nice`

:

```
mkrec_lazy = (lambda g:
mkrec(lambda rec:
g(rec(rec))))
```

When unrolled, this corresponds to:

```
mkrec_nice = (lambda g:
( lambda rec: g(rec(rec)) )
( lambda rec: g(rec(rec)) ))
```

which is the traditional Y combinator, or in lambda calculus:

Y = λg. (λr. g (r r)) (λr. g (r r))

However, such a definition cannot actually be used to define `fact`

in Python: it causes an infinite loop because Python evaluates function call arguments too eagerly.