Wednesday, August 14, 2013

Fibonacci Sequence and Linear Recurrences: Part II

This post is mostly concerned with solving recurrences through programming. More specifically, given a homogeneous linear recurrence:

$f(n)=\sum\limits_{i=1}^{d} a_i f(n - i)$

we want to compute the value of $f(n)$ for any given integer $n$.

Fibonacci Sequence

We will use the Fibonacci sequence as a starting point: $F(n) = F(n - 1) + F(n - 2)$, with $F(1) = 1$, $F(0) = 0$. Last time we discovered a closed form solution for $F(n)$:

$F(n) = \frac{1}{\sqrt{5}} (\frac{1 + \sqrt{5}}{2})^n - \frac{1}{\sqrt{5}} (\frac{1 - \sqrt{5}}{2})^n$

This would allow us to compute $F(n)$ in $O(1)$ time, but suffers from precision errors when $n$ gets high. We want a solution that does not need to deal with irrational numbers. The first approach is to write the recurrence explicitly:

FIBONACCI(n):
  if n = 0
      return 0
  if n = 1
      return 1
  return FIBONACCI(n - 1) + FIBONACCI(n - 2)

This works, but has the disadvantage of being very slow. If we let $T(n)$ denote the number of steps it takes to compute FIBONACCI(n), then $T(n)$ satisfies the recurrence:

$T(n) = T(n - 1) + T(n - 2)$

which, as we saw last time, has exponential growth. If we look at the recursion tree generated by the FIBONACCI function we see that we do a lot of unnecessary computation:

Recursion Tree

From the above, we see that we are computing some quantities several times. For instance, $F(n-2)$ will be called twice in the computation of $F(n)$. To avoid unnecessary computation, we introduce two variables $a$, and $b$, that will maintain intermediate results:

FIBONACCI(n, a, b):
  if n = 0:
      return b
  if n = 1:
      return a
  return FIBONACCI(n - 1, a + b, a)

We just need to figure out what the initial values of $a$ and $b$ should be. But since $F(0) = 0$, we have that $b = 0$, and since $F(1) = 1$, we require $a = 1$.

As it turns out, although there is little difference between the two code blocks, the performance difference is huge. If we let $T(n)$ be the number of operations required to compute FIBONACCI(n), we have $T(n) = T(n - 1) + 1$, the $1$ coming from the evaluation of $a+b$ in the recursive call. A solution to that recurrence relation is $T(n) = n$.

Although linear is oftentimes the best we can hope for, in this case we can do better. For the extra performance boost, we have to notice the following identity:

Theorem 1. Define the $2x2$ matrix $C$ as:

$C = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}$

Then the following identity holds:

$$ \begin{equation}\label{id} C \begin{bmatrix} F(n-1) \\ F(n) \end{bmatrix} = \begin{bmatrix} F(n) \\ F(n + 1) \end{bmatrix} \end{equation}$$

Moreover, a natural extension of this fact is:

$$ \begin{equation}\label{pow} C^n \begin{bmatrix} F(0) \\ F(1) \end{bmatrix} = \begin{bmatrix} F(n) \\ F(n + 1) \end{bmatrix} \end{equation}$$

Proof. The first half is immediate if we multiply the two. For the second half, we will proceed by induction. Denote the statement by $P(n)$. Then:

$P(0): C^0 \begin{bmatrix} F(0) \\ F(1) \end{bmatrix} = I_{2} \begin{bmatrix} F(0) \\ F(1) \end{bmatrix} = \begin{bmatrix} F(0) \\ F(1) \end{bmatrix}$

Suppose $P(n - 1)$ holds. We write $P(n)$ as:

$$ \begin{eqnarray*} P(n): C^n \begin{bmatrix} F(0) \\ F(1) \end{bmatrix} &=& C \left( C^{n-1} \begin{bmatrix} F(0) \\ F(1) \end{bmatrix} \right)\\ &=& C \begin{bmatrix} F(n - 1) \\ F(n) \end{bmatrix} = \begin{bmatrix} F(n) \\ F(n + 1) \end{bmatrix} \end{eqnarray*}$$

The second equality follows from the inductive hypothesis, while the third follows from \eqref{id}.

Corollary.

$$ \begin{equation} C^n = \begin{bmatrix} F(n-1) & F(n) \\ F(n) & F(n + 1) \end{bmatrix} \end{equation}$$

The good thing about powers is that you can compute them in logarithmic time by exploiting the following identities:

$$ \begin{eqnarray*} x^n &=& (x^{\frac{n}{2}})^2, &\forall n, \text{ n even}\\ x^n &=& x (x^\left\lfloor \frac{n}{2} \right\rfloor)^2, &\forall n, \text{ n odd} \end{eqnarray*}$$

Here is some pseudocode. In the following, $X$ can be any type that supports multiplication (including an $nxn$ matrix), and $I$ is the identity element with respect to multiplication.

FAST-POWER(X, n)
  if n = 0
      return I

  Z <- FAST-POWER(X, n/2)
  if n mod 2 = 0
      return Z * Z
  else
      return X * Z * Z

If we let $T(n)$ denote the number of operations, we have $T(n) = T(n/2) + O(1)$, if the multiplication can be performed in constant time. This recurrence has a solution $T(n) = O(log(n))$. I will not prove this now, but intuitively, the problem size gets halved at each step.

With this, we can now proceed to our final algorithm. This computes the $n$-th Fibonacci number in $O(log(n))$ time by exploiting the properties above. $C$ is the matrix from Theorem 1.

FIBONACCI(n)
  Z <- FAST-POWER(C, n)
  return Z[1, 2]

Multiplying two $nxn$ takes $O(n^3)$ time, but since $C$ is a $2x2$ matrix, this is $O(1)$, so all we care about is the time complexity of FAST-POWER, which is $O(log(n))$.


This concludes part II of this series. In part III, we will generalize these results to other linear recurrences.

Friday, August 9, 2013

Fibonacci Sequence and Linear Recurrences: Part I

In this series of short blog posts, I will try to go into detail regarding the Fibonacci sequence in particular, and linear recurrences in general. This first part discusses the Fibonacci sequence, ending with a closed form solution. Part two will go into detail regarding implementation in code, and part three will generalize the first two parts for general linear recurrences.


You probably all know the Fibonacci sequence, but just in case you've forgotten, let me refresh your memories. The classic example is about rabbit populations, so we will use that as a starting point.

Let's say we release two rabbits (a male and female) into the wild. Rabbits are able to mate at the age of 1 month, so that at the end of 2 months the female can produce another pair of rabbits. Rabbits will never die, and mating always produces another pair (a male and a female).

How many rabbit pairs will there be at the end of one year?

Working it out on a piece of scratch paper, we get something like this:

  • After one month, the rabbits mate, but there is still only one pair.
  • After two months, the female produces another pair, so there are now 2 rabbit pairs in the field.
  • After three months, the original female produces another pair, leading to 3 rabbit pairs.
  • At the end of four months, the original female produces yet another pair, and the female born two months ago also produces a pair, making 5 pairs total.

It seems that at the end of each month the number of rabbit pairs is the sum of the number of rabbit pairs in the previous two months. This makes sense, as rabbits do not die, so the population of the previous month is still alive, and all females that are at least two months old give birth to a pair of rabbits.

We can write this formally as:

$F(n) = F(n-1) + F(n-2), \forall n \geq 2$

with $F(1) = 1$, $F(0) = 0$.

The first 10 terms of this sequence are: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34.

Definition 1. A linear homogeneous recurrence relation is an equation of the form:

$f(n) = \sum\limits_{i=1}^{d} a_{i} f(n-i)$

where all the $a_{i}$ are constants.

In the Fibonacci example, $a_{1} = 1, a_{2} = 1$. To solve a linear recurrence relation, one thing we could try to do is guess the solution. The Fibonacci example looks as if it was increasing exponentially, so we can try the guess $F(n) = cx^n$ for some constant $c$.

Replacing in the recurrence relation, we get $cx^n = cx^{n-1} + cx^{n-2}$. Dividing by $cx^{n-2}$ throughout, we are left with the quadratic $x^2 - x - 1 = 0$, which has solutions $\frac{1 \pm \sqrt{5}}{2}$.

So we have two solutions to the recurrence relation:

$F(n) = c(\frac{1 + \sqrt{5}}{2})^n$     $F(n) = c(\frac{1 - \sqrt{5}}{2})^n$

We will digress a bit before arriving at a closed form solution.

Theorem 1. If $f(n)$ and $g(n)$ are two solutions to a recurrence relation, then any linear combination of the two is also a recurrence relation.

Proof. We know that $f(n)$ and $g(n)$ both satisfy the recurrence relation:

$f(n) = \sum\limits_{i=1}^{d} a_{i} f(n-i)$     $g(n) = \sum\limits_{i=1}^{d} a_{i} g(n-i)$

A linear combination of the two yields:

$cf(n)+dg(n) = \sum\limits_{i=1}^{d} a_{i}(c f(n-i) + d g(n-i))$

Using this theorem we can write $F(n) = c_1 (\frac{1 + \sqrt{5}}{2})^n + c_2 (\frac{1 - \sqrt{5}}{2})^n$. Substituting $n=0$ gives us:

$F(0) = c_1 (\frac{1 + \sqrt{5}}{2})^0 + c_2 (\frac{1 + \sqrt{5}}{2})^0 = c_1 + c_2 = 0$

The second boundary condition ($n=1$) yields:

$F(1) = c_1 (\frac{1 + \sqrt{5}}{2}) + c_2(\frac{1 - \sqrt{5}}{2}) = 1$

Solving the system of linear equations gives us $c_1 = \frac{1}{\sqrt{5}}$, $c_2 = -\frac{1}{\sqrt{5}}$. We thus have a closed form solution for the Fibonacci sequence:

$F(n) = \frac{1}{\sqrt{5}} (\frac{1 + \sqrt{5}}{2})^n - \frac{1}{\sqrt{5}} (\frac{1 - \sqrt{5}}{2})^n$


Part two will be about solving recurrences in code.