Recursion as Structural Decomposition
In most introductory computer science courses, recursion is taught as its own contained module. It's almost treated as an afterthought. We're given recursion as a tool, but not shown when to use it. Then, we forget about it; we're unlikely to encounter it in real-world problems, right?
I've come to learn, rather, that recursion is an incredibly useful, elegant, and perhaps niche tool that enables clearer programs. Following the functional philosophy, it can break a problem down into several scenarios, then solve each of them. And, it can be more legible and understandable to introductory developers.
A Line of People
Suppose you're in a line for food, or a concert, or something. You're just in a line of people. If you're very early, you'd be at the front of the line. However, if you're in the unlucky majority, you're after someone, a person. In a simplified sense, this is true for every single person in the line. They're either at the front, or they're after somebody. Let's represent this in code:
/// This represents a possible person in a line. They are either at the front of /// the line, or after somebody else in the line. enum PersonInLine { Front, After(Box<PersonInLine>) }
Let's say the line is absurdly long and we're impatient. We want to know how many people are in front of us so we can grumble and complain. So, what do we do? We ask the person in front of us how many people are in front of them, and in turn, they'll ask how many people are in front of them, etc. until we get to the person in front.
Now, the person in front is in front; there's nobody else before them. So, they know the answer is zero. They relay that back to the person behind them. The person behind them (hopefully) knows that there's one more than what the person in front said, so they relay back "one". The next person in line relays back one more than that, and so on, until it gets back to you. Then, you know how many people are in front of you.
With that explanation, we (very quickly, skipped a few) went through every single person. Let's simplify this. If you're at the front of the line, the number of people in front of you is zero. If you're not, it's one more than whatever the person in front of you said.
What we just did is structural decomposition. A PersonInLine
, by our definition, is either in Front
, or After
someone else. So we consider each case and determine how many people in front there are for that specific case. Let's write a function to solve this.
/// Count the number of people in front of the given person. fn people_in_front_of(person: PersonInLine) -> u32 { match person { // There is no person in front of the person at the front Front => 0, // The person before our current person gives us a number, but we // have to also count the person before, so we add 1 After(person_in_front) => people_in_front_of(*person_in_front) + 1 // ^^^^^^^^^^^^^^^^ // It's a pointer, so we dereference it } }
Recursion felt natural for this problem; it was simple to break down our possible cases and solve it for each case. Our PersonInLine
enum was recursive data, and we showed that recursion naturally follows recursive data.
That's why, when taught recursion, we're told what a base case is: it's a terminating case for our recursive function. In the above example, that was Front
. When creating a recursive function, we have to break down the data to consider all possible cases and handle them accordingly.
The enum
above is actually weird example of Lisp lists (or functional language lists). If we replace its code definition, we actually wrote a function to get the length of a list:
enum List { // PersonInLine Empty, // Front Cons(Box<List>) // After(Box<PersonInLine>) }
Family Trees
A family tree is also an intuitive way of demonstrating recursion. From a traditional, overly-simplified perspective, each person in the family tree has two parents. Those two parents we may know, or their names may be lost to time. If we do know that parent, then that parent has parents. And if we know a parent's parent, then the parent's parent also has parents. Intuitively, we begin to see how we can continue traversing up the family tree. This is an example of recursively-specified data. We could represent that definition in Rust as follows:
/// A Person has a name, a mother who is a parent and a father who is a parent. struct Person { name: String, mother: Parent, father: Parent } /// The Parent may be lost to time (Unknown) or remembered in the family tree /// (Known). enum Parent { Unknown, // We Box<Person> so that Rust's compiler knows the size of Parent. This is // a small quirk of Rust when dealing with interwined data: when types // reference one another. Known(Box<Person>) }
A simple problem, given this definition of a family tree, would be to count the number of known members from a given person. Let's break this down. A Person
has a name
, mother
, and father
. When we want to count the number of known members in a family tree, is the name
particularly relevant? No, so we discard that. The mother
and father
surely matter, so we'll keep that.
Both the mother
and father
are parents, which are either Unknown
or Known
. If they're Unknown
, then what do we do? In this scenario, I don't believe they're part of the family tree; they're not a person. So, we'll count them as zero. If they are Known
, then we have a person (the parent). And with that known parent, a person, we need to count their members of the family tree. But wait, we've done something similar to this before! In the previous paragraph, we broke down a Person
and then broke down a Parent
. We've finished decomposing a Parent
, so all that's left to do for Person
is add all the components together: the number of members in the family tree of the mother and then the father, and finally the Person
itself (1).
Here's that idea in code:
/// Count the number of known people in a person's family tree. fn num_known_people(person: Person) -> u32 { // The number of known people of the parents, plus the person themselves 1 + num_known_in_parent(person.mother) + num_known_in_parent(person.father) } /// Count the number of known people in a parent's family tree. fn num_known_in_parent(parent: Parent) -> u32 { match parent { // The parent is unknown, so we don't count them. Unknown => 0, // The parent is known and is a person. We've already written a function // to count the number of known people from a function, so let's use // that. Known(person) => num_known_people(*person), } }
If you recall our data definitions from above, Person
referred to Parent
, and vice versa. Person
and Parent
are mutually-recursive with one another. But that's not all: the functions we wrote are also mutually-recursive with one another. Similarly to how in the "line" problem the function followed the definition, num_known_people
follows the definition of Person
, and num_known_in_parent
follows the definition of Parent
. To reiterate, recursion naturally follows recursive data.
Factorial and Fibonacci
When taught recursion, students typically learn how to compute the factorial of a number and the fibonacci sequence. It's not intuitive how numbers are recursive until we realize that natural numbers can be defined recursively.
Usually it's debated whether natural numbers start with zero or one. In this scenario, we'll say they start with zero. This how the proof assistant programming languages Lean and Coq define natural numbers.
// A Nat (Natural Number) is a u32 that is one of // - 0 // - Nat + 1 // ^^^^^^^^^ // Recursive data
The natural number is represented by a u32
and is one of two cases: zero, or a natural that succeeds another natural. We can write a function to compute the factorial of a natural number as follows:
/// Calculate the factorial of a given number fn fact(n: u32) -> u32 { match n { // The base case, 0! = 1 0 => 1, // "_" catches all remaining cases _ => n * fact(n - 1) } }
There are two possible values that a natural number can have: 0 or a successor of a natural. We decompose the structure we're given into zero and successor. We know that \(0! = 1\), so the base case is trivial.
The successor case, though, let's examine more in detail. Notice how in our defined structure of a natural number, Nat
can be Nat + 1
. It calls on itself. Similarly, in our fact
function's second branch calls fact
again, recurring on itself. Here's a deeper annotation of the fact
function.
// A Nat (Natural Number) is a u32 that is one of // - 0 // - Nat + 1 /// Calculate the factorial of a given number fn fact(n: u32) -> u32 { match n { // The zero branch does not recur on itself 0 => 1, // The second branch recurs on itself _ => n * fact(n - 1) // ^^^^^ // Here we subtract by one to retrieve the result of the ancestor of our // current Nat. } }
Computing the \(n\)th Fibonacci number is the other introductory problem to recursion. The Fibonacci sequence starts with \(0, 1\), then continues: \(1, 2, 3, 5, 8, 13,\) …
We could theoretically write each fibonacci number down:
\[ f_0 = 0 \\\\ f_1 = 1 \\\\ f_2 = 1 \\\\ f_3 = 2 \\\\ f_4 = 3 \\\\ f_5 = 5 \\\\ f_6 = 8 \\\\ \\cdots \]
Of course, practically, its infeasible to do so: the sequence is infinite. Thankfully, we have an alternative that is cleaner and easier to reason about. We can represent the \(n\)th Fibonacci number mathematically as follows:
\[ f_0 = 0 \\\\ f_1 = 1 \\\\ f_n = f_{n - 1} + f_{n - 2} \]
The last statement, \(f_n = f_{n - 1} + f_{n - 2}\), is called a recurrence: it defines our current element in the sequence \(f_n\) in terms of the ones before it.
If we want to represent this in code, Haskell has an elegant representation.
f 0 = 0 f 1 = 1 f n = f(n - 1) + f(n - 2)
Here, we create a function that takes in the \(i\)th fibonacci number and produces its value. Can you determine the structure of a fibonacci number? It follows exactly the Haskell definition above.
Generative Recursion
Above, we talked about how recursive functions naturally followed recursive data types, and how to recur over their structure. This is structural recursion. In this brief section, I'll elaborate more on generative recursion, where we break the problem down without needing to follow its structure.
Quicksort is a classic example of generative recursion; divide the input at each step into two. Once we have the solutions of those two smaller problems, we recombine them to form the correct solution.
Here is quicksort in OCaml:
let rec quicksort list = match list with | [] -> [] | _::[] -> list | first::rest -> let (less_than, greater_eq_than) = List.partition (fun x -> x < first) rest in quicksort less_than @ [first] @ quicksort greater_eq_than ;;
Above, when the element is one or zero elements, we just return the list. If it is two or more, we take the first element and separate it from the remaining ones. Already rest
is smaller, our guarantee that our recursive function will end at some point.
This first
element will act as our pivot; anything less than it will be in one list (less_than
), and anything greater than or equal to than it will be in another list (greater_eq_than
). This is what List.partition
does; it takes in a predicate that checks if elements are less than the pivot, and the ones that satisfy it go into less_than
, and the ones that don't go into greater_eq_than
.
That effectively splits the problem in half. We sort each list, then append them to one another, with the pivot first
in the middle. We then have our sorted list. Generative recursion is otherwise known as divide and conquer.
Induction and Recursive Structures
Interestingly, mathematical induction and recursion are intricately intertwined. When applying recursion to the natural numbers, we separated the base case \(0\) from the rest of the natural numbers, and handled each seperately. Similarly, induction also operates over recursive structures; with induction, we take a recursive structure and prove a statement is true for all variants of that structure. To do so, we split those possibilities into distinct cases, and prove the statement true for the cases seperately.
For instance, we could prove that each positive integer is the sum of distinct Fibonacci numbers.
Statement: each positive integer is the sum of distinct Fibonacci numbers.
Basis step: for \(n = 1\), \(F_1 = 1\), so the statement is true.
Inductive step:
For \(k \\in \\mathbb{N} \\geq 1\), all positive integers \(n \\leq k\) are the sum of distinct Fibonacci numbers.
Consider \(k + 1\). If \(k + 1\) is a Fibonacci number, then we are done.
Else, \(k + 1\) is the sum of a Fibonacci number \(F_j\) and positive integer \(m\), where \(F_j < k + 1\).
\[ k + 1 = F_j + m \]
Then, since \(F_j >= 1\), \(m < k + 1\). We can then derive that \(m <= k\). From our induction assumption, we then know that \(m\) is made up of distinct Fibonacci numbers.
We now have to show that \(m < F_j\), which we can prove by contradiction.
If \(m >= F_j\), and \(k + 1 = F_j + m\), then \(F_j + m \\geq F_j + F_j = 2 F_j \\geq F_{j+1}\). So, \(k + 1 > F_{j+1} > F_j\), but we assumed \(F_j\) to be the smallest Fibonacci number less than \(m + 1\). This is a contradiction.
Since \(m \\lt F_j\), then \(m\) does not include \(F_j\) in its sum of Fibonacci numbers. So, \(m + 1\) must be the sum of distinct Fibonacci numbers: the distinct Fibonacci numbers of \(m\) plus \(F_j\). QED.
As shown from above, we break the problem down into two cases: when \(n = 1\), and when it isn't. When \(n \\neq 1\), we break the problem down further with recursion.
Recursion's Drawbacks in Popular Languages
Programming languages have a call stack, that stores the context and information of a program. In many languages, each recursive call adds a new layer to the call stack (with functions returning popping off a value from the call stack). These call stacks have a ceiling, preventing itself from exceeding a certain number of layers. Many functional languages implement tail-call optimization (TCO), wherein additional recursive calls pop the current frame and push the new one to the call stack. This is more efficient space-wise; it's harder to hit the ceiling.
Unfortunately, numerous imperative programming languages do not implement TCO, making recursion infeasible for arbitrarily large recursive data. Rust and Python are languages guilty of this: Python's creator Guido van Rossum reasoned that context from stack traces would be lost.
As such, while recursion provides the opportunity for elegant solutions, in certain languages it becomes impractical. We have to be wary of this when working with recursion in those languages, and use it with recursive data that won't exceed the call stack limit.
There's also the observation that computing Fibonacci numbers is inefficient, having an upper-bound time complexity of \(O(2^n)\) when implemented recursively. In Python, however, there's a neat decorator called @cache
which optimizes this. If the function is pure, that is, will always provide the same output for a given input, then using @cache
will store the results, returning them if the same exact arguments are passed again.
from functools import cache @cache def fib(n): # Here is the handy observation that f 0 = 0 and f 1 = 1. From this we can # simplify our checks if n <= 1: return n else: return fib(n - 1) + fib(n - 2)
While this doesn't quite resolve the max call stack issue (calling fib(1024)
will raise a RecursionError
, but fib(1023)
will not), it does solve the optimization one. In functional languages, it's usually best practice to write pure functions. Similar to Python, this is an avenue of optimization.
Recursion is Beautiful
To me, recursion enables some of the most elegant solutions. The OCaml quicksort implementation is one of them. Recursion is the idea that what you break down and pass on will return the correct value, and from that correct value you can formulate a correct solution. Induction is the idea that the assumptions of our smaller parts, when used, can prove the larger whole to be true. Recursion and induction fundamentally operate on recursive structures; there's a reliance on creating smaller and smaller problems to solve, dividing and conquering. Recursion is one of the few steps to appreciating functional programming, and the beautiful programs it can create.
More
- Learn Rust With Entirely Too Many Linked Lists
- How to Design Programs - More on Generative Recursion
- How Not to Teach Recursion