Lucas Sta Maria

I'm a final year Computer Science and Mathematics student at Northeastern University interested in programming languages and developer tooling.

(λx.x x) (λx.x x)

Recursion as Structural Decomposition

In most introductory computer science courses, recursion is taught as its own contained module. It's almost treated as an afterthought. We're given recursion as a tool, but not shown when to use it. Then, we forget about it; we're unlikely to encounter it in real-world problems, right? That was my experience and perspective in my introductory high school computer science class (APCS).

I've come to learn, rather, that recursion is an incredibly useful, elegant, and perhaps niche tool that enables clearer programs. Following the functional philosophy, it can break a problem down into several scenarios, then solve each of them. And, it can be more legible and understandable to introductory developers.

A Line of People

Suppose you're in a line for food, or a concert, or something. You're just in a line of people. If you're very early, you'd be at the front of the line. However, if you're in the unlucky majority, you're after someone, a person. In a simplified sense, this is true for every single person in the line. They're either at the front, or they're after somebody. Let's represent this in code:

/// This represents a possible person in a line. They are either at the front of
/// the line, or after somebody else in the line.
enum PersonInLine {
    Front,
    After(Box<PersonInLine>)
}

Let's say the line is absurdly long and we're impatient. We want to know how many people are in front of us so we can grumble and complain. So, what do we do? We ask the person in front of us how many people are in front of them, and in turn, they'll ask how many people are in front of them, etc. until we get to the person in front.

Now, the person in front is in front; there's nobody else before them. So, they know the answer is zero. They relay that back to the person behind them. The person behind them (hopefully) knows that there's one more than what the person in front said, so they relay back "one". The next person in line relays back one more than that, and so on, until it gets back to you. Then, you know how many people are in front of you.

With that explanation, we went through every single person in the line (very quickly, and skipping a few). Let's simplify this. If you're at the front of the line, the number of people in front of you is zero. If you're not, it's one more than whatever the person in front of you said.

What we just did is structural decomposition. A PersonInLine, by our definition, is either in Front, or After someone else. So we consider each case and determine how many people in front there are for that specific case. Let's write a function to solve this.

/// Count the number of people in front of the given person.
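fn people_in_front_of(person: PersonInLine) -> u32 {
    // Bring the variants into scope so the match arms can name them directly
    use PersonInLine::{After, Front};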
    match person {
        // There is no person in front of the person at the front
        Front => 0,
        // The person before our current person gives us a number, but we
        // have to also count the person before, so we add 1
        After(person_in_front) => people_in_front_of(*person_in_front) + 1
        //                                           ^^^^^^^^^^^^^^^^
        //                         It's a pointer, so we dereference it
    }
}
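To try the function out, here is a minimal, self-contained sketch. The helper `line_with_people_ahead` is a hypothetical name introduced only for this illustration; it builds a line by repeatedly placing a new person after the last one.

```rust
/// A person in a line: either at the front, or after somebody else.
enum PersonInLine {
    Front,
    After(Box<PersonInLine>),
}

/// Count the number of people in front of the given person.
fn people_in_front_of(person: PersonInLine) -> u32 {
    match person {
        PersonInLine::Front => 0,
        PersonInLine::After(person_in_front) => people_in_front_of(*person_in_front) + 1,
    }
}

/// Hypothetical helper: build a line with `n` people ahead of the last person.
fn line_with_people_ahead(n: u32) -> PersonInLine {
    let mut person = PersonInLine::Front;
    for _ in 0..n {
        // Each new person stands after whoever was last so far.
        person = PersonInLine::After(Box::new(person));
    }
    person
}
```

With these definitions, `people_in_front_of(line_with_people_ahead(3))` evaluates to 3, and the person at the front gets 0.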

Recursion felt natural for this problem; it was simple to break down our possible cases and solve it for each case. Our PersonInLine enum was recursive data, and we showed that recursion naturally follows recursive data.

That's why, when taught recursion, we're told what a base case is: it's a terminating case for our recursive function. In the above example, that was Front. When creating a recursive function, we have to break down the data to consider all possible cases and handle them accordingly.

The enum above is actually a weird example of a Lisp list (or a functional language list), just without any elements stored in it. If we rename its parts, we see that we actually wrote a function to get the length of a list:

enum List { // PersonInLine
    Empty, // Front
    Cons(Box<List>) // After(Box<PersonInLine>)
}
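A list is usually more useful when each cell carries a value. As a rough sketch of that idea (the generic parameter `T` and the helper `from_vec` are assumptions added for illustration, not part of the definition above), the same structural recursion still gives us the length:

```rust
/// A list that also carries a value of type `T` in each cell.
enum List<T> {
    Empty,
    Cons(T, Box<List<T>>),
}

/// Compute the length of a list by following its structure.
fn length<T>(list: &List<T>) -> u32 {
    match list {
        // The empty list has no elements.
        List::Empty => 0,
        // One element here, plus however many are in the rest of the list.
        List::Cons(_, rest) => 1 + length(rest),
    }
}

/// Hypothetical helper: build a `List` from a `Vec`, preserving order.
fn from_vec<T>(items: Vec<T>) -> List<T> {
    let mut list = List::Empty;
    // Walk backwards so the first item ends up at the head of the list.
    for item in items.into_iter().rev() {
        list = List::Cons(item, Box::new(list));
    }
    list
}
```

Here `length` borrows the list instead of consuming it, which is a design choice for this sketch: the list is still usable after we measure it.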

Family Trees

A family tree is also an intuitive way of demonstrating recursion. From a traditional, overly-simplified perspective, each person in the family tree has two parents. Those two parents we may know, or their names may be lost to time. If we do know that parent, then that parent has parents. And if we know a parent's parent, then the parent's parent also has parents. Intuitively, we begin to see how we can continue traversing up the family tree. This is an example of recursively-specified data. We could represent that definition in Rust as follows:

/// A Person has a name, a mother who is a parent and a father who is a parent.
struct Person {
    name: String,
    mother: Parent,
    father: Parent
}

/// The Parent may be lost to time (Unknown) or remembered in the family tree
/// (Known).
enum Parent {
    Unknown,
    // We Box<Person> so that Rust's compiler knows the size of Parent. This is
    // a small quirk of Rust when dealing with intertwined data: when types
    // reference one another.
    Known(Box<Person>)
}

A simple problem, given this definition of a family tree, would be to count the number of known members from a given person. Let's break this down. A Person has a name, mother, and father. When we want to count the number of known members in a family tree, is the name particularly relevant? No, so we discard that. The mother and father surely matter, so we'll keep those.

Both the mother and father are parents, which are either Unknown or Known. If they're Unknown, then what do we do? In this scenario, I don't believe they're part of the family tree; they're not a person. So, we'll count them as zero. If they are Known, then we have a person (the parent). And with that known parent, a person, we need to count their members of the family tree. But wait, we've done something similar to this before! In the previous paragraph, we broke down a Person and then broke down a Parent. We've finished decomposing a Parent, so all that's left to do for Person is add all the components together: the number of members in the family tree of the mother and then the father, and finally the Person itself (1).

Here's that idea in code:

/// Count the number of known people in a person's family tree.
fn num_known_people(person: Person) -> u32 {
    // The number of known people of the parents, plus the person themselves
    1 + num_known_in_parent(person.mother) + num_known_in_parent(person.father)
}

/// Count the number of known people in a parent's family tree.
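fn num_known_in_parent(parent: Parent) -> u32 {
    // Bring the variants into scope so the match arms can name them directly
    use Parent::{Known, Unknown};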
    match parent {
        // The parent is unknown, so we don't count them.
        Unknown => 0,
        // The parent is known and is a person. We've already written a function
        // to count the number of known people from a person, so let's use
        // that.
        Known(person) => num_known_people(*person),
    }
}

If you recall our data definitions from above, Person referred to Parent, and vice versa. Person and Parent are mutually-recursive with one another. But that's not all: the functions we wrote are also mutually-recursive with one another. Similarly to how in the "line" problem the function followed the definition, num_known_people follows the definition of Person, and num_known_in_parent follows the definition of Parent. To reiterate, recursion naturally follows recursive data.
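To see the two functions bounce between each other, here is a small self-contained sketch. The helpers `orphan` and `example_tree`, and the names in the tree, are hypothetical and exist only for this illustration.

```rust
/// A Person has a name, a mother and a father.
#[allow(dead_code)] // `name` is stored but not read by the counting functions
struct Person {
    name: String,
    mother: Parent,
    father: Parent,
}

/// A Parent is either lost to time or a known Person.
enum Parent {
    Unknown,
    Known(Box<Person>),
}

/// Count the number of known people in a person's family tree.
fn num_known_people(person: Person) -> u32 {
    1 + num_known_in_parent(person.mother) + num_known_in_parent(person.father)
}

/// Count the number of known people in a parent's family tree.
fn num_known_in_parent(parent: Parent) -> u32 {
    match parent {
        Parent::Unknown => 0,
        Parent::Known(person) => num_known_people(*person),
    }
}

/// Hypothetical helper: a person whose parents are both unknown.
fn orphan(name: &str) -> Person {
    Person { name: name.to_string(), mother: Parent::Unknown, father: Parent::Unknown }
}

/// Hypothetical example: a child, both parents, and one known grandmother.
fn example_tree() -> Person {
    let mother = Person {
        name: "Mother".to_string(),
        mother: Parent::Known(Box::new(orphan("Grandmother"))),
        father: Parent::Unknown,
    };
    Person {
        name: "Child".to_string(),
        mother: Parent::Known(Box::new(mother)),
        father: Parent::Known(Box::new(orphan("Father"))),
    }
}
```

For this tree, `num_known_people(example_tree())` counts the child, the mother, the grandmother, and the father, giving 4.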

Factorial and Fibonacci

When taught recursion, students typically learn how to compute the factorial of a number and the Fibonacci sequence. It's not intuitive how numbers are recursive until we realize that natural numbers can be defined recursively.

Usually it's debated whether natural numbers start with zero or one. In this scenario, we'll say they start with zero. This is how the proof assistant programming languages Lean and Coq define natural numbers.

// A Nat (Natural Number) is a u32 that is one of
// - 0
// - Nat + 1
// ^^^^^^^^^
// Recursive data

The natural number is represented by a u32 and is one of two cases: zero, or a natural that succeeds another natural. We can write a function to compute the factorial of a natural number as follows:

/// Calculate the factorial of a given number
fn fact(n: u32) -> u32 {
    match n {
        // The base case, 0! = 1
        0 => 1,
        // "_" catches all remaining cases
        _ => n * fact(n - 1)
    }
}

There are two possible values that a natural number can have: 0 or a successor of a natural. We decompose the structure we're given into zero and successor. We know that \(0! = 1\), so the base case is trivial.

The successor case, though, let's examine in more detail. Notice how in our defined structure of a natural number, Nat can be Nat + 1. It refers to itself. Similarly, our fact function's second branch calls fact again, recurring on itself. Here's a deeper annotation of the fact function.

// A Nat (Natural Number) is a u32 that is one of
// - 0
// - Nat + 1

/// Calculate the factorial of a given number
fn fact(n: u32) -> u32 {
    match n {
        // The zero branch does not recur on itself
        0 => 1,
        // The second branch recurs on itself
        _ => n * fact(n - 1)
        //            ^^^^^
        // Here we subtract one to get the predecessor of our current Nat, then
        // recur on it.
    }
}
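To make the recursive shape of a natural number explicit, we could also write Nat as an actual enum instead of a comment over u32. This is only a sketch: the `Nat` type, `from_u32`, and `to_u32` are names assumed for illustration, and a real program would just use u32.

```rust
/// A natural number defined recursively: zero, or the successor of a natural.
enum Nat {
    Zero,
    Succ(Box<Nat>),
}

/// Convert a `u32` into the recursive representation.
fn from_u32(n: u32) -> Nat {
    match n {
        0 => Nat::Zero,
        _ => Nat::Succ(Box::new(from_u32(n - 1))),
    }
}

/// Convert the recursive representation back into a `u32`.
fn to_u32(n: &Nat) -> u32 {
    match n {
        Nat::Zero => 0,
        Nat::Succ(pred) => 1 + to_u32(pred),
    }
}

/// Factorial following the structure of `Nat`: one branch per variant.
fn fact(n: &Nat) -> u32 {
    match n {
        // 0! = 1
        Nat::Zero => 1,
        // (pred + 1)! = (pred + 1) * pred!
        Nat::Succ(pred) => to_u32(n) * fact(pred),
    }
}
```

Written this way, the match in `fact` has exactly one arm per variant of `Nat`, and the recursive call sits in the arm whose variant is itself recursive.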

Computing the \(n\)th Fibonacci number is the other introductory problem to recursion. The Fibonacci sequence starts with \(0, 1\), then continues: \(1, 2, 3, 5, 8, 13,\) …

We could theoretically write each Fibonacci number down:

\[ f_0 = 0 \\ f_1 = 1 \\ f_2 = 1 \\ f_3 = 2 \\ f_4 = 3 \\ f_5 = 5 \\ f_6 = 8 \\ \cdots \]

Of course, practically, it's infeasible to do so: the sequence is infinite. Thankfully, we have an alternative that is cleaner and easier to reason about. We can represent the \(n\)th Fibonacci number mathematically as follows:

\[ f_0 = 0 \\ f_1 = 1 \\ f_n = f_{n - 1} + f_{n - 2} \]

The last statement, \(f_n = f_{n - 1} + f_{n - 2}\), is called a recurrence: it defines our current element in the sequence \(f_n\) in terms of the ones before it.

If we want to represent this in code, Haskell has an elegant representation.

f 0 = 0
f 1 = 1
f n = f(n - 1) + f(n - 2)

Here, we create a function that takes in an index \(i\) and produces the \(i\)th Fibonacci number. Can you determine the structure of a Fibonacci number? It follows exactly the Haskell definition above.
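For readers following along with the Rust examples, the same recurrence can be sketched with a match. The name `fib` and the u64 return type are choices made for this illustration.

```rust
/// Compute the `n`th Fibonacci number by following the recurrence directly.
fn fib(n: u32) -> u64 {
    match n {
        // f_0 = 0
        0 => 0,
        // f_1 = 1
        1 => 1,
        // f_n = f_{n - 1} + f_{n - 2}
        _ => fib(n - 1) + fib(n - 2),
    }
}
```

Each arm corresponds to one line of the mathematical definition: two base cases and one recurrence.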

Generative Recursion

Above, we talked about how recursive functions naturally followed recursive data types, and how to recur over their structure. This is structural recursion. In this brief section, I'll elaborate more on generative recursion, where we break the problem down without needing to follow its structure.

Quicksort is a classic example of generative recursion: at each step, we divide the input into two smaller lists. Once we have the solutions to those two smaller problems, we recombine them to form the correct solution.

Here is quicksort in OCaml:

let rec quicksort list =
    match list with
    | [] -> []
    | _::[] -> list
    | first::rest ->
            let (less_than, greater_eq_than) = List.partition (fun x -> x < first) rest in
            quicksort less_than @ [first] @ quicksort greater_eq_than ;;

Above, when the list has zero or one elements, we just return the list. If it has two or more, we take the first element and separate it from the remaining ones. Already, rest is smaller than list, which is our guarantee that the recursive function will end at some point.

This first element will act as our pivot; anything less than it will be in one list (less_than), and anything greater than or equal to it will be in another list (greater_eq_than). This is what List.partition does: it takes in a predicate that checks if an element is less than the pivot. The elements that satisfy it go into less_than, and the ones that don't go into greater_eq_than.

That splits the problem into two smaller problems (ideally, though not always, of about half the size each). We sort each list, then append them to one another with the pivot in the middle. We then have our sorted list. This style of generative recursion is commonly known as divide and conquer.
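The same steps can be sketched in Rust. This is a rough translation of the OCaml version for illustration, restricted to `Vec<i32>` for simplicity; it is not an in-place quicksort.

```rust
/// Sort a list of integers with quicksort, using the first element as the pivot.
fn quicksort(list: Vec<i32>) -> Vec<i32> {
    // Zero or one elements: the list is already sorted.
    if list.len() <= 1 {
        return list;
    }
    let mut rest = list.into_iter();
    // Safe to unwrap: we just checked that there are at least two elements.
    let first = rest.next().unwrap();
    // Split the remaining elements around the pivot.
    let (less_than, greater_eq_than): (Vec<i32>, Vec<i32>) = rest.partition(|&x| x < first);
    // Sort each part, then put the pivot back in the middle.
    let mut sorted = quicksort(less_than);
    sorted.push(first);
    sorted.extend(quicksort(greater_eq_than));
    sorted
}
```

Notice that the recursive calls are made on `less_than` and `greater_eq_than`, which are lists we generated ourselves and not pieces of the input's own structure. That is what makes this generative and not structural.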

Induction and Recursive Structures

Interestingly, mathematical induction and recursion are intricately intertwined. When applying recursion to the natural numbers, we separated the base case \(0\) from the rest of the natural numbers, and handled each separately. Similarly, induction also operates over recursive structures; with induction, we take a recursive structure and prove a statement is true for all variants of that structure. To do so, we split those possibilities into distinct cases, and prove the statement true for the cases separately.

For instance, we could prove that each positive integer is the sum of distinct Fibonacci numbers.

Statement: each positive integer is the sum of distinct Fibonacci numbers.

Basis step: for \(n = 1\), \(F_1 = 1\), so the statement is true.

Inductive step:

Assume that, for some \(k \in \mathbb{N}\) with \(k \geq 1\), every positive integer \(n \leq k\) is the sum of distinct Fibonacci numbers.

Consider \(k + 1\). If \(k + 1\) is a Fibonacci number, then we are done.

Else, let \(F_j\) be the largest Fibonacci number less than \(k + 1\), so that \(F_j < k + 1 < F_{j+1}\). Then \(k + 1\) is the sum of \(F_j\) and a positive integer \(m\).

\[ k + 1 = F_j + m \]

Then, since \(F_j \geq 1\), \(m < k + 1\). We can then derive that \(m \leq k\). From our induction assumption, we then know that \(m\) is the sum of distinct Fibonacci numbers.

We now have to show that \(m < F_j\), which we can prove by contradiction.

If \(m \geq F_j\), then \(k + 1 = F_j + m \geq F_j + F_j = 2 F_j \geq F_{j+1}\), because \(F_{j+1} = F_j + F_{j-1} \leq 2 F_j\). So \(k + 1 \geq F_{j+1}\), but we chose \(F_j\) so that \(k + 1 < F_{j+1}\). This is a contradiction.

Since \(m < F_j\), the sum of distinct Fibonacci numbers for \(m\) does not include \(F_j\). So, \(k + 1\) must be the sum of distinct Fibonacci numbers: the distinct Fibonacci numbers of \(m\) plus \(F_j\). QED.

As shown above, we break the problem down into two cases: when \(n = 1\), and when it isn't. When \(n \neq 1\), we break the problem down further, relying on the result for smaller numbers just as a recursive call would.
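The proof also reads like a recursive program: peel off the largest Fibonacci number that fits, then recur on the remainder. Here is a sketch of that idea, assuming inputs small enough that the additions do not overflow a u64. The function names `largest_fib_at_most` and `fib_sum` are made up for this illustration.

```rust
/// Return the largest Fibonacci number that is less than or equal to `n`.
/// Assumes `n >= 1`.
fn largest_fib_at_most(n: u64) -> u64 {
    let (mut prev, mut curr) = (1u64, 1u64);
    // Walk the sequence 1, 1, 2, 3, 5, ... until the next number would exceed `n`.
    while prev + curr <= n {
        let next = prev + curr;
        prev = curr;
        curr = next;
    }
    curr
}

/// Write `n` as a sum of distinct Fibonacci numbers, largest first.
fn fib_sum(n: u64) -> Vec<u64> {
    match n {
        // Nothing left to decompose.
        0 => Vec::new(),
        _ => {
            // Mirror the proof: take F_j, then recur on the remainder m = n - F_j.
            let f_j = largest_fib_at_most(n);
            let mut parts = vec![f_j];
            parts.extend(fib_sum(n - f_j));
            parts
        }
    }
}
```

For instance, `fib_sum(100)` yields `[89, 8, 3]`. The argument that \(m < F_j\) is what guarantees the recursive call never picks the same Fibonacci number twice.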

Recursion's Drawbacks in Popular Languages

Most programming languages use a call stack, which stores the context of each function call that is currently running. In many languages, each recursive call pushes a new frame onto the call stack, and each return pops one off. The call stack has a ceiling: it cannot exceed a certain size. Many functional languages implement tail-call optimization (TCO), wherein a recursive call in tail position (the last thing the function does) replaces the current frame instead of pushing a new one. This is more efficient space-wise; it's harder to hit the ceiling.

Unfortunately, numerous imperative programming languages do not implement or guarantee TCO, making recursion infeasible for arbitrarily large recursive data. Rust and Python are guilty of this: Rust does not guarantee that tail calls are optimized, and Python's creator Guido van Rossum reasoned that context from stack traces would be lost.

As such, while recursion provides the opportunity for elegant solutions, in certain languages it becomes impractical. We have to be wary of this when working with recursion in those languages, and use it with recursive data that won't exceed the call stack limit.
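One common workaround in those languages is to rewrite the recursion as a loop. As a sketch, here is the "line" problem from earlier counted iteratively; the name `people_in_front_of_iter` is made up for this example.

```rust
/// A person in a line: either at the front, or after somebody else.
enum PersonInLine {
    Front,
    After(Box<PersonInLine>),
}

/// Count the people in front of `person` with a loop instead of recursion.
fn people_in_front_of_iter(person: &PersonInLine) -> u32 {
    let mut count = 0;
    let mut current = person;
    // Keep stepping forward until we reach the front of the line.
    while let PersonInLine::After(person_in_front) = current {
        count += 1;
        // Step through the Box to borrow the next person in line.
        current = &**person_in_front;
    }
    count
}
```

The loop uses a constant amount of stack space no matter how long the line is, at the cost of no longer mirroring the data definition quite as directly.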

There's also the observation that computing Fibonacci numbers is inefficient, having an upper-bound time complexity of \(O(2^n)\) when implemented with naive recursion. In Python, however, there's a neat decorator called @cache which optimizes this. If the function is pure, that is, it will always provide the same output for a given input, then using @cache will store the results, returning them if the same exact arguments are passed again.

from functools import cache

@cache
def fib(n):
    # Here is the handy observation that f 0 = 0 and f 1 = 1. From this we can
    # simplify our checks
    if n <= 1:
        return n
    else:
        return fib(n - 1) + fib(n - 2)

While this doesn't quite resolve the max call stack issue (calling fib(1024) will raise a RecursionError, but fib(1023) will not), it does solve the optimization one. In functional languages, it's usually best practice to write pure functions. As in the Python example, purity is what makes this kind of caching safe, so it is an avenue of optimization there too.
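Rust's standard library has no direct equivalent of @cache, but the same idea can be sketched by passing a cache along explicitly. The function name `fib_memo` and the use of a HashMap are assumptions made for this illustration.

```rust
use std::collections::HashMap;

/// Compute the `n`th Fibonacci number, caching results in `memo`.
fn fib_memo(n: u64, memo: &mut HashMap<u64, u64>) -> u64 {
    // f_0 = 0 and f_1 = 1, so the base cases simply return n.
    if n <= 1 {
        return n;
    }
    // If we have already computed this value, reuse it.
    if let Some(&cached) = memo.get(&n) {
        return cached;
    }
    let value = fib_memo(n - 1, memo) + fib_memo(n - 2, memo);
    // Remember the result so later calls with the same `n` return immediately.
    memo.insert(n, value);
    value
}
```

Calling `fib_memo(50, &mut HashMap::new())` computes each Fibonacci number up to the 50th only once, which is the same saving @cache gives the Python version.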

Recursion is Beautiful

To me, recursion enables some of the most elegant solutions. The OCaml quicksort implementation is one of them. Recursion is the idea that what you break down and pass on will return the correct value, and from that correct value you can formulate a correct solution. Induction is the idea that the assumptions of our smaller parts, when used, can prove the larger whole to be true. Recursion and induction fundamentally operate on recursive structures; there's a reliance on creating smaller and smaller problems to solve, dividing and conquering. Recursion is one of the few steps to appreciating functional programming, and the beautiful programs it can create.
