Written by Luka Kerr on August 1, 2020
Introduction
An algorithm is a collection of precisely defined steps that are executable using certain specified mechanical methods.
A sequential deterministic algorithm is an algorithm given as a sequence of steps in which only one step can be executed at a time, and where each step produces the same result whenever it is executed on the same input.
Examples Of Algorithms
Two Thieves
Two thieves have robbed a warehouse and have to split a pile of items without price tags on them. Design an algorithm for doing this in a way that ensures that each thief believes that he has got at least one half of the loot.
One of the two thieves splits the pile in two parts, so that he believes that both parts are of equal value. The other thief then chooses the part that he believes is no worse than the other.
Three Thieves
Three thieves have robbed a warehouse and have to split a pile of items without price tags on them. How do they do this in a way that ensures that each thief believes that he has got at least one third of the loot?
T1 makes a pile P1 which he believes is 1/3 of the whole loot;
T1 proceeds to ask T2 if T2 agrees that P1 <= 1/3;
If T2 says YES, then T1 asks T3 if T3 agrees that P1 <= 1/3;
If T3 says YES, then T1 takes P1;
T2 and T3 split the rest as in Problem 1.
Else if T3 says NO, then T3 takes P1;
T1 and T2 split the rest as in Problem 1.
Else if T2 says NO, then T2 reduces the size of P1 to P2 < P1 such that T2 thinks P2 = 1/3;
T2 then proceeds to ask T3 if he agrees that P2 <= 1/3;
If T3 says YES then T2 takes P2;
T1 and T3 split the rest as in Problem 1.
Else if T3 says NO then T3 takes P2;
T1 and T2 split the rest as in Problem 1.
Role Of Proofs
We need to give a mathematical proof that an algorithm we have designed terminates and returns a solution to the problem at hand when this is not obvious by inspecting the algorithm using common sense.
Divide & Conquer
Counterfeit Coin Puzzle
We are given 27 coins of the same denomination; we know that one of them is counterfeit and that it is lighter than the others. Find the counterfeit coin by weighing coins on a pan balance only three times.
Split the coins into three groups of nine. Place two groups on the balance. There are three outcomes:
- The balance tilts down on the left, so the counterfeit (lighter) coin is in the right group
- The balance tilts down on the right, so the counterfeit coin is in the left group
- The balance is even, so the counterfeit coin is in the remaining, unweighed group
Repeat the same process on the chosen group (split the 9 coins into three groups of 3). Repeat the same process on the 3 coins. This methodology is known as divide and conquer.
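A minimal Python sketch of this divide and conquer procedure (the function name is mine, and it assumes the number of coins is a power of 3, as in the puzzle; the pan balance is simulated by comparing sums of weights):

```python
def find_counterfeit(coins):
    """Return the index of the single lighter coin.
    Assumes len(coins) is a power of 3, as in the puzzle."""
    candidates = list(range(len(coins)))
    while len(candidates) > 1:
        third = len(candidates) // 3
        left, right, rest = candidates[:third], candidates[third:2 * third], candidates[2 * third:]
        left_w = sum(coins[i] for i in left)
        right_w = sum(coins[i] for i in right)
        if left_w > right_w:        # tilted down on the left: fake is on the right pan
            candidates = right
        elif right_w > left_w:      # tilted down on the right: fake is on the left pan
            candidates = left
        else:                       # balanced: fake is in the group left aside
            candidates = rest
    return candidates[0]
```

Each loop iteration corresponds to one weighing, so 27 coins need exactly three.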
Counting Inversions
An inversion is a pair $(i, j)$ such that $i < j$ but $a[i] > a[j]$.
To count the number of inversions in an array $A$, we need to tweak the $\text{Merge-Sort}$ algorithm, by extending it to recursively both sort $A$ and determine the number of inversions in $A$.
- Split $A$ into two (approximately) equal parts $A_{top} = A[1 \dots \lfloor \frac{n}{2} \rfloor]$ and $A_{bottom} = [\lfloor \frac{n}{2} \rfloor + 1 \dots n]$.
- Recursively sort arrays $A_{top}$ and $A_{bottom}$ and obtain the number of inversions $I(A_{top})$ and $I(A_{bottom})$ respectively
- Merge $A_{top}$ and $A_{bottom}$ while counting the number of inversions across the two sub-arrays
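A minimal Python sketch of this extension of Merge-Sort (the function name is mine); it returns both the sorted array and the inversion count in $\mathcal{O}(n \log n)$ time:

```python
def sort_and_count(a):
    """Return (sorted copy of a, number of inversions in a)."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, inv_left = sort_and_count(a[:mid])
    right, inv_right = sort_and_count(a[mid:])
    merged, inv, i, j = [], inv_left + inv_right, 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            # right[j] is smaller than every remaining element of left:
            # each of those len(left) - i pairs is an inversion across the two halves
            merged.append(right[j]); j += 1
            inv += len(left) - i
    merged += left[i:] + right[j:]
    return merged, inv
```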
Adding Two Numbers
- Adding 3 bits (carry and two numbers) can be done in constant time
- Addition runs in linear time ($\mathcal{O}(n)$ many steps)
- Cannot be made faster since the algorithm needs to read every bit of input
Multiplying Two Numbers
- Assume that two single digits (bits) can be multiplied in constant time
- Algorithm runs in $\mathcal{O}(n^2)$ time
Divide & Conquer Approach
Take two input numbers $A$ and $B$, and split them into two halves $A = A_1 \cdot 2^{\frac{n}{2}} + A_0$ and $B = B_1 \cdot 2^{\frac{n}{2}} + B_0$, where $A_1$ and $B_1$ consist of the more significant bits and $A_0$ and $B_0$ of the less significant bits.
$AB$ can now be calculated as $AB = A_1 B_1 2^n + (A_1 B_0 + B_1 A_0) 2^{\frac{n}{2}} + A_0 B_0$.
The product $AB$ can be calculated recursively as follows:
function Mult(A, B) {
if |A| == |B| == 1 then return AB
else
A_1 = MoreSignificantPart(A);
A_0 = LessSignificantPart(A);
B_1 = MoreSignificantPart(B);
B_0 = LessSignificantPart(B);
X = Mult(A_0, B_0);
Y = Mult(A_0, B_1);
Z = Mult(A_1, B_0);
W = Mult(A_1, B_1);
return W * 2^n + (Y + Z) * 2^(n/2) + X;
end if
}
Each multiplication of two $n$ digit numbers is replaced by four multiplications of $\frac{n}{2}$ digit numbers: $A_1 B_1$, $A_1 B_0$, $B_1 A_0$ and $A_0 B_0$, plus we have a linear overhead to shift and add: $T(n) = 4T (\frac{n}{2}) + cn$.
Claim: if $T(n)$ satisfies $T(n) = 4T (\frac{n}{2}) + cn$ then $T(n) = n^2 (c + 1) - cn$.
Proof: by ‘fast’ induction. We assume it is true for $\frac{n}{2}$: $T(\frac{n}{2}) = (\frac{n}{2})^2 (c + 1) - c \frac{n}{2}$ and prove it is also true for $n$:
\[\begin{array}{lcl} T(n) & = & 4T (\frac{n}{2}) + cn \\ & = & 4 ((\frac{n}{2})^2 (c + 1) - \frac{n}{2}c) + cn \\ & = & n^2 (c + 1) - 2cn + cn \\ & = & n^2(c + 1) - cn \\ & \equiv & \mathcal{O}(n^2) \end{array}\]So, we gained nothing with the divide and conquer method.
The Karatsuba Trick
Again, we split $A$ and $B$ into two halves $A = A_1 \cdot 2^{\frac{n}{2}} + A_0$ and $B = B_1 \cdot 2^{\frac{n}{2}} + B_0$.
We then calculate $AB$ as $AB = A_1 B_1 2^n + ((A_1 + A_0)(B_1 + B_0) - A_1 B_1 - A_0 B_0) 2^{\frac{n}{2}} + A_0 B_0$, saving us one multiplication at each recursion round.
The algorithm is as follows:
function Mult(A, B) {
if |A| == |B| == 1 then return AB
else
A_1 = MoreSignificantPart(A);
A_0 = LessSignificantPart(A);
B_1 = MoreSignificantPart(B);
B_0 = LessSignificantPart(B);
U = A_0 + A_1;
V = B_0 + B_1;
X = Mult(A_0, B_0);
W = Mult(A_1, B_1);
Y = Mult(U, V);
return W * 2^n + (Y - X - W) * 2^(n/2) + X;
end if
}
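A runnable Python sketch of the same trick for non-negative integers, splitting on bit positions rather than abstract "parts" (the cut-off of 16 for the base case is an arbitrary choice of mine):

```python
def karatsuba(a, b):
    """Multiply non-negative integers a and b using three recursive multiplications."""
    if a < 16 or b < 16:                       # base case: numbers are small enough
        return a * b
    half = max(a.bit_length(), b.bit_length()) // 2
    a1, a0 = a >> half, a & ((1 << half) - 1)  # A = A1 * 2^half + A0
    b1, b0 = b >> half, b & ((1 << half) - 1)  # B = B1 * 2^half + B0
    x = karatsuba(a0, b0)                      # A0 * B0
    w = karatsuba(a1, b1)                      # A1 * B1
    y = karatsuba(a0 + a1, b0 + b1)            # (A0 + A1) * (B0 + B1)
    return (w << (2 * half)) + ((y - x - w) << half) + x
```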
The runtime $T(n)$ of this algorithm satisfies $T(n) = 3T (\frac{n}{2}) + cn$.
- Replacing $n$ with $\frac{n}{2}$ we get $T(\frac{n}{2}) = 3T (\frac{n}{2^2}) + c \frac{n}{2}$
- Replacing $n$ with $\frac{n}{2^2}$ we get $T(\frac{n}{2^2}) = 3T (\frac{n}{2^3}) + c \frac{n}{2^2}$
So
\[\begin{array}{lcl} T(n) & = & 3T (\frac{n}{2}) + cn \\ & = & 3 [3T (\frac{n}{2^2}) + c \frac{n}{2}] + cn \\ & = & 3^2T (\frac{n}{2^2}) + cn [1 + \frac{3}{2}] \\ & \vdots & \\ & = & 3^{\lfloor \log_2 n \rfloor} T (\frac{n}{2^{\lfloor \log_2 n \rfloor}}) + cn [1 + \frac{3}{2} + \cdots + (\frac{3}{2})^{\lfloor \log_2 n \rfloor - 1}] \\ & \simeq & 3^{\log_2 n} T(1) + cn \left[\frac{(3/2)^{\log_2 n} - 1}{3/2 - 1}\right] \\ & = & 3^{\log_2 n} T(1) + 2cn [(\tfrac{3}{2})^{\log_2 n} - 1] \end{array}\]Then using the fact that $a^{\log_b n} = n^{\log_b a}$ we have
\[\begin{array}{lcl} T(n) & = & n^{\log_2 3} T(1) + 2cn [n^{\log_2 (3/2)} - 1] \\ & = & n^{\log_2 3} T(1) + 2cn [n^{\log_2 3 - 1} - 1] \\ & = & n^{\log_2 3} T(1) + 2c \cdot n^{\log_2 3} - 2cn \\ & = & \mathcal{O}(n^{\log_2 3}) \\ & = & \mathcal{O}(n^{1.58\dots}) < \mathcal{O}(n^2) \end{array}\]
Recurrences
Asymptotic Notation
‘Big Oh’ notation $f(n) = \mathcal{O}(g(n))$ is an abbreviation for: “there exist positive constants $c$ and $n_0$ such that $0 \le f(n) \le c g(n)$ for all $n \ge n_0$”.
- In this case we say that $g(n)$ is an asymptotic upper bound for $f(n)$
- $f(n) = \mathcal{O}(g(n))$ means that $f(n)$ does not grow substantially faster than $g(n)$ because a multiple of $g(n)$ eventually dominates $f(n)$
‘Omega’ notation $f(n) = \Omega(g(n))$ is an abbreviation for: “there exist positive constants $c$ and $n_0$ such that $0 \le c g(n) \le f(n)$ for all $n \ge n_0$”.
- In this case we say that $g(n)$ is an asymptotic lower bound for $f(n)$
- $f(n) = \Omega(g(n))$ essentially says that $f(n)$ grows at least as fast as $g(n)$, because $f(n)$ eventually dominates a multiple of $g(n)$
- Since $c g(n) \le f(n)$ if and only if $g(n) \le \frac{1}{c} f(n)$, we have $f(n) = \Omega(g(n))$ if and only if $g(n) = \mathcal{O}(f(n))$
‘Theta’ notation $f(n) = \Theta(g(n))$ holds if and only if $f(n) = \mathcal{O}(g(n))$ and $f(n) = \Omega(g(n))$; thus, $f(n)$ and $g(n)$ have the same asymptotic growth rate.
Recurrences
Recurrences are important to us because they arise in estimations of time complexity of divide-and-conquer algorithms.
Let $a \ge 1$ be an integer and $b > 1$ a real number
Assume that a divide-and-conquer algorithm:
- reduces a problem of size $n$ to $a$ many problems of smaller size $n / b$
- the overhead cost of splitting up/combining the solutions for size $n / b$ into a solution for size $n$ is $f(n)$
then the time complexity of such an algorithm satisfies $T(n) = aT (\frac{n}{b}) + f(n)$.
Some recurrences can be solved explicitly, but this tends to be tricky. Fortunately, to estimate the efficiency of an algorithm we do not need the exact solution of a recurrence. Rather, we only need to find:
- the growth rate of the solution i.e. its asymptotic behaviour
- the (approximate) sizes of the constants involved.
This is what the Master Theorem provides (when it is applicable).
Master Theorem
Let
- $a \ge 1$ be an integer and $b > 1$ a real
- $f(n) > 0$ be a non-decreasing function
- $T(n)$ be the solution of the recurrence $T(n) = aT (\frac{n}{b}) + f(n)$
Then:
- If $f(n) = \mathcal{O}(n^{\log_b a - \epsilon})$ for some $\epsilon > 0$ then $T(n) = \Theta(n^{\log_b a})$
- If $f(n) = \Theta(n^{\log_b a})$ then $T(n) = \Theta(n^{\log_b a} \log_2 n)$
- If $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some $\epsilon > 0$ and for some $c < 1$ and some $n_0$, $a f(\frac{n}{b}) \le c f(n)$ holds for all $n > n_0$, then $T(n) = \Theta(f(n))$
If none of these conditions hold, the Master Theorem is not applicable.
Examples
Let $T(n) = 4T(\frac{n}{2}) + n$, then $n^{\log_b a} = n^{\log_2 4} = n^2$, thus $f(n) = n = \mathcal{O}(n^{2 - \epsilon})$ for any $\epsilon < 1$. The condition of case (1) is satisfied and so $T(n) = \Theta(n^2)$.
Let $T(n) = 2T(\frac{n}{2}) + cn$, then $n^{\log_b a} = n^{\log_2 2} = n^1 = n$, thus $f(n) = cn = \Theta(n) = \Theta(n^{\log_2 2})$. The condition of case (2) is satisfied and so $T(n) = \Theta(n^{\log_2 2} \log n) = \Theta(n \log n)$.
Let $T(n) = 3T(\frac{n}{4}) + n$, then $n^{\log_b a} = n^{\log_4 3} < n^{0.8}$, thus $f(n) = n = \Omega(n^{0.8 + \epsilon})$ for any $\epsilon < 0.2$. Also, $af(\frac{n}{b}) = 3f(\frac{n}{4}) = \frac{3}{4}n \le cn = cf(n)$ for $c = 0.8 < 1$. The condition of case (3) is satisfied and so $T(n) = \Theta(f(n)) = \Theta(n)$.
Fast Large Integer Multiplication
Generalizing Karatsuba’s Algorithm
Slice the input numbers $A$, $B$ into $n + 1$ many slices. For simplicity, let $A$, $B$ have $(n + 1)k$ bits ($k$ can be arbitrarily large but $n$ is fixed).
Slice $A$, $B$ into $n + 1$ pieces each:
\[\begin{array}{lcl} A & = & A_n 2^{kn} + A_{n - 1} 2^{k(n - 1)} + \dots + A_0 \\ B & = & B_n 2^{kn} + B_{n - 1} 2^{k(n - 1)} + \dots + B_0 \\ \end{array}\]Form the naturally corresponding polynomials:
\[\begin{array}{lcl} P_A(x) & = & A_n x^n + A_{n - 1} x^{n - 1} + \dots + A_0 \\ P_B(x) & = & B_n x^n + B_{n - 1} x^{n - 1} + \dots + B_0 \\ \end{array}\]We have $A = P_A (2^k)$, $B = P_B (2^k)$ and $AB = P_A(2^k) P_B(2^k) = (P_A(x) \cdot P_B(x)) |_{x = 2^k}$.
We adopt the following strategy:
- First figure out how to multiply polynomials fast to obtain $P_C(x) = P_A(x) \cdot P_B(x)$
- Then evaluate $P_C(2^k)$
We need to find the coefficients $C_j = \sum_{i + k = j} A_i B_k$ without performing the $(n + 1)^2$ many multiplications necessary to get all products of the form $A_i B_k$.
Linear Convolution Of Sequences
If you have two sequences $\vec{A} = (A_0, A_1, \dots, A_{p - 1}, A_p)$ and $\vec{B} = (B_0, B_1, \dots, B_{m - 1}, B_m)$, and if you form the two corresponding polynomials
\[\begin{array}{lcl} P_A(x) & = & A_p x^p + A_{p - 1} x^{p - 1} + \dots + A_1 x + A_0 \\ P_B(x) & = & B_m x^m + B_{m - 1} x^{m - 1} + \dots + B_1 x + B_0 \\ \end{array}\]and if you multiply these two polynomials to obtain their product
\[P_A(x) \cdot P_B(x) = \sum_{j = 0}^{m + p} \left( \sum_{i + k = j} A_i B_k \right) x^j = \sum_{j = 0}^{p + m} C_j x^j\]then the sequence $\vec{C} = (C_0, C_1, \dots, C_{p + m})$ of the coefficients of the product polynomial, with these coefficients given by
\[C_j = \sum_{i + k = j} A_i B_k, \quad \text{for} \quad 0 \le j \le p + m\]is called the linear convolution of sequences $\vec{A}$ and $\vec{B}$ and is denoted by $\vec{C} = \vec{A} \ast \vec{B}$.
Vandermonde Matrix
Every polynomial $P_A(x)$ of degree $p$ is uniquely determined by its values at any $p + 1$ distinct input values $x_0, x_1, \dots, x_p$:
\[P_A(x) \leftrightarrow \{ (x_0, P_A(x_0)), (x_1, P_A(x_1)), \dots, (x_p, P_A(x_p)) \}\]For $P_A(x) = A_px^p + A_{p - 1}x^{p - 1} + \dots + A_0$, these values can be obtained via a matrix multiplication:
\[\begin{pmatrix} 1 & x_0 & x_0^2 & \dots & x_0^p \\ 1 & x_1 & x_1^2 & \dots & x_1^p \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & x_p & x_p^2 & \dots & x_p^p \end{pmatrix} \begin{pmatrix} A_0 \\ A_1 \\ \vdots \\ A_p \end{pmatrix} = \begin{pmatrix} P_A(x_0) \\ P_A(x_1) \\ \vdots \\ P_A(x_p) \end{pmatrix} \qquad (1)\]It can be shown that if $x_i$ are all distinct then this matrix is invertible. Such a matrix is called the Vandermonde Matrix.
If all $x_i$ are distinct, given any values $P_A(x_0), P_A(x_1), \dots, P_A(x_p)$ the coefficients $A_0, A_1, \dots, A_p$ of the polynomial $P_A(x)$ are uniquely determined:
\[\begin{pmatrix} A_0 \\ A_1 \\ \vdots \\ A_p \end{pmatrix} = \begin{pmatrix} 1 & x_0 & x_0^2 & \dots & x_0^p \\ 1 & x_1 & x_1^2 & \dots & x_1^p \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & x_p & x_p^2 & \dots & x_p^p \end{pmatrix}^{-1} \begin{pmatrix} P_A(x_0) \\ P_A(x_1) \\ \vdots \\ P_A(x_p) \end{pmatrix} \qquad (2)\]Equations (1) and (2) show how we can commute between:
- A representation of a polynomial $P_A(x)$ via its coefficients $A_p, A_{p - 1}, \dots, A_0$, i.e. $P_A(x) = A_p x^p + \dots + A_1 x + A_0$
- A representation of a polynomial $P_A(x)$ via its values \(P_A(x) \leftrightarrow \{ (x_0, P_A(x_0)), (x_1, P_A(x_1)), \dots, (x_p, P_A(x_p)) \}\)
If we fix the inputs $x_0, x_1, \dots, x_p$ then commuting between a representation of a polynomial $P_A(x)$ via its coefficients and a representation via its values at these points is done via the following two matrix multiplications, with matrices made up from constants:
\[\begin{pmatrix} P_A(x_0) \\ P_A(x_1) \\ \vdots \\ P_A(x_p) \end{pmatrix} = \begin{pmatrix} 1 & x_0 & x_0^2 & \dots & x_0^p \\ 1 & x_1 & x_1^2 & \dots & x_1^p \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & x_p & x_p^2 & \dots & x_p^p \end{pmatrix} \begin{pmatrix} A_0 \\ A_1 \\ \vdots \\ A_p \end{pmatrix}\] \[\begin{pmatrix} A_0 \\ A_1 \\ \vdots \\ A_p \end{pmatrix} = \begin{pmatrix} 1 & x_0 & x_0^2 & \dots & x_0^p \\ 1 & x_1 & x_1^2 & \dots & x_1^p \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & x_p & x_p^2 & \dots & x_p^p \end{pmatrix}^{-1} \begin{pmatrix} P_A(x_0) \\ P_A(x_1) \\ \vdots \\ P_A(x_p) \end{pmatrix}\]Thus, for fixed input values $x_0, \dots, x_p$, this switch between the two kinds of representations is done in linear time.
Strategy
- Given two polynomials of degree at most $p$, convert them into value representation at $2p + 1$ distinct points $x_0, x_1, \dots, x_{2p}$:
- Multiply these two polynomials point-wise, using $2p + 1$ multiplications only:
- Convert such value representation of $P_C(x) = P_A(x) P_B(x)$ back to coefficient form
- What values should we choose for $x_0, x_1, \dots, x_{2p}$?
- Use $2p + 1$ smallest possible integer values ${ -p, -(p - 1), \dots, -1, 0, \dots, p - 1, p }$
- We find the values $P_A(m)$ and $P_B(m)$ for all $m$ such that $-p \le m \le p$
- Multiplication of a large number with $k$ bits by a constant integer $d$ can be done in time linear in $k$ because it is reducible to $d - 1$ additions: $d \cdot A = \underbrace{A + A + \dots + A}_{d}$
- Thus, all the values \(\begin{array}{lclcl} P_A(m) & = & A_p m^p + A_{p - 1} m^{p - 1} + \dots + A_0 & : & -p \le m \le p \\ P_B(m) & = & B_p m^p + B_{p - 1} m^{p - 1} + \dots + B_0 & : & -p \le m \le p \end{array}\) can be found in time linear in the number of bits of the input numbers
- We now perform $2p + 1$ multiplications of large numbers to obtain \(P_A(-p) P_B(-p), \dots, P_A(-1) P_B(-1), P_A(0) P_B(0), P_A(1) P_B(1), \dots, P_A(p) P_B(p)\)
- For $P_C(x) = P_A(x) P_B(x)$ these products are $2p + 1$ many values of $P_C(x)$: \(P_C(-p) = P_A(-p) P_B(-p), \dots, P_C(0) = P_A(0) P_B(0), \dots, P_C(p) = P_A(p) P_B(p)\)
- Let $C_0, C_1, \dots, C_{2p}$ be the coefficients of the product polynomial $C(x)$, i.e. let \(P_C(x) = C_{2p} x^{2p} + C_{2p - 1} x^{2p - 1} + \dots + C_0\)
- We now have \(\begin{array}{l} C_{2p}(-p)^{2p} + C_{2p - 1}(-p)^{2p - 1} + \dots + C_0 = P_C(-p) \\ C_{2p}(-(p - 1))^{2p} + C_{2p - 1}(-(p - 1))^{2p - 1} + \dots + C_0 = P_C(-(p - 1)) \\ \qquad \vdots \\ C_{2p}(p - 1)^{2p} + C_{2p - 1}(p - 1)^{2p - 1} + \dots + C_0 = P_C(p - 1) \\ C_{2p}(p)^{2p} + C_{2p - 1}(p)^{2p - 1} + \dots + C_0 = P_C(p) \end{array}\)
This is just a system of linear equations that can be solved for $C_0, C_1, \dots, C_{2p}$. We can apply the inverse Vandermonde matrix as described earlier, thus the coefficients $C_i$ can be obtained in linear time.
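A small Python/NumPy sketch of this strategy (evaluation at the integer points $-p, \dots, p$, point-wise multiplication, then interpolation through the inverse Vandermonde matrix); the function name and the use of floating point are my own choices, not part of the notes:

```python
import numpy as np

def poly_multiply(A, B):
    """Multiply polynomials given as coefficient lists (constant term first)."""
    p = max(len(A), len(B)) - 1                     # treat both as degree-p polynomials
    xs = np.arange(-p, p + 1)                       # 2p + 1 distinct evaluation points
    PA = np.array([sum(c * x**j for j, c in enumerate(A)) for x in xs])
    PB = np.array([sum(c * x**j for j, c in enumerate(B)) for x in xs])
    PC = PA * PB                                    # 2p + 1 point-wise multiplications
    V = np.vander(xs, 2 * p + 1, increasing=True)   # Vandermonde matrix of the points
    C = np.linalg.solve(V, PC.astype(float))        # recover the coefficients C_0..C_2p
    return np.rint(C).astype(int)

# (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
# poly_multiply([1, 2], [3, 4])  ->  array([ 3, 10,  8])
```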
The Fast Fourier Transform
Complex Numbers
Complex numbers $z = a + ib$ can be represented using their modulus $|z| = \sqrt{a^2 + b^2}$ and their argument, $\arg(z)$, which is an angle taking values in $(-\pi, \pi]$ and satisfying
\[z = |z| e^{i \arg(z)} = |z| (\cos \arg(z) + i \sin \arg(z))\]Recall that
\[z^n = \left(|z| e^{i \arg(z)}\right)^n = |z|^n e^{i n \arg(z)} = |z|^n(\cos(n\arg(z)) + i \sin(n \arg(z)))\]
Complex Roots Of Unity
Roots of unity of order $n$ are complex numbers which satisfy $z^n = 1$.
If $z^n = |z|^n (\cos(n \arg(z)) + i \sin(n \arg(z))) = 1$ then $|z| = 1$ and $n \arg(z)$ is a multiple of $2 \pi$.
Thus, $n \arg(z) = 2 \pi k$, i.e. $\arg(z) = \dfrac{2\pi k}{n}$.
We denote $\omega_n = e^{i 2\pi / n}$. Such an $\omega_n$ is called a primitive root of unity of order $n$. A root of unity $\omega$ of order $n$ is primitive if all other roots of unity of the same order can be obtained as its powers $\omega^k$.
We have $\omega_n = e^{i 2\pi/n} \equiv \cos \dfrac{2\pi}{n} + i \sin \dfrac{2\pi}{n}$.
For $\omega_n = e^{i 2\pi / n}$ and for all $k$ such that $0 \le k \le n - 1$,
\[((\omega_n)^k)^n = (\omega_n)^{nk} = ((\omega_n)^n)^k = 1^k = 1\]Thus, $\omega_n^k = (\omega_n)^k$ is also a root of unity.
Since $\omega_n^k$ are roots of unity for $k = 0, 1, \dots, n - 1$ and there are at most $n$ distinct roots of unity of order $n$, we conclude that every root of unity of order $n$ must be of the form $\omega_n^k$.
For the product of any two roots of unity $\omega_n^k$ and $\omega_n^m$ of the same order, we have $\omega_n^k \omega_n^m = \omega_n^{k + m}$.
If $k + m \ge n$ then $k + m = n + l$ for $l = (k + m) \mod n$, and we have $\omega_n^k \omega_n^m = \omega_n^{k + m} = \omega_n^{n + l} = \omega_n^n \omega_n^l = 1 \cdot \omega_n^l = \omega_n^l$ where $0 \le l < n$.
Thus, the product of any two roots of unity of the same order is just another root of unity of the same order.
So in the set of all roots of unity of order $n$, i.e. ${ 1, \omega_n, \omega_n^2, \dots, \omega_n^{n - 1} }$ we can multiply any two elements or raise an element to any power without going out of this set.
Cancellation Lemma
The cancellation lemma states that $\omega_{kn}^{km} = \omega_n^m$ for all integers $k, m, n$.
The Discrete Fourier Transform
Let $A = \langle A_0, A_1, \dots, A_{n - 1} \rangle$ be a sequence of $n$ real or complex numbers.
We can form the corresponding polynomial $P_A(x) = \sum_{j = 0}^{n - 1} A_j x^j$ and evaluate it at all complex roots of unity of order $n$, i.e. we compute $P_A(\omega_n^k)$ for all $0 \le k \le n - 1$.
The sequence of values $\langle P_A(1), P_A(\omega_n), P_A(\omega_n^2), \dots, P_A(\omega_n^{n - 1}) \rangle$ is called the Discrete Fourier Transform (DFT) of the sequence $A$.
The value $P_A(\omega_n^k)$ is usually denoted by $\hat{A}_k$ and the sequence of values $\langle P_A(1), P_A(\omega_n), P_A(\omega_n^2), \dots, P_A(\omega_n^{n - 1}) \rangle$ is usually denoted by $\hat{A} = \langle \hat{A}_0, \hat{A}_1, \dots, \hat{A}_{n - 1} \rangle$.
The DFT $\hat{A}$ of a sequence $A$ can be computed very fast using a divide-and-conquer algorithm called the Fast Fourier Transform.
To multiply polynomials using the Discrete Fourier Transform we:
- Define two polynomials \(\begin{array}{lcl} P_A(x) & = & A_0 + \dots + A_{n - 1}x^{n - 1} \\ P_B(x) & = & B_0 + \dots + B_{m - 1}x^{m - 1} \\ \end{array}\)
- Pad $A$ with $m - 1$ zeros at the end, $(A_0, A_1, \dots, A_{n - 1}, \underbrace{0, \dots, 0}_{m - 1})$ to make it of length $m + n - 1$, and similarly pad $B$ with $n - 1$ zeros at the end, $(B_0, B_1, \dots, B_{m - 1}, \underbrace{0, \dots, 0}_{n - 1})$ to also obtain a sequence of length $n + m - 1$
- Compute the DFT of both $P_A$ and $P_B$ \(\begin{array}{lcl} DFT(\langle A_0, A_1, \dots, A_{n - 1}, \underbrace{0, \dots, 0}_{m - 1} \rangle) & = & \langle \hat{A}_0, \hat{A}_1, \dots, \hat{A}_{n + m - 2} \rangle \\ DFT(\langle B_0, B_1, \dots, B_{m - 1}, \underbrace{0, \dots, 0}_{n - 1} \rangle) & = & \langle \hat{B}_0, \hat{B}_1, \dots, \hat{B}_{n + m - 2} \rangle \\ \end{array}\)
- For each $k$, multiply the corresponding values $\hat{A}_k = P_A(\omega_{n + m - 1}^k)$ and $\hat{B}_k = P_B(\omega_{n + m - 1}^k)$ thus obtaining \(\hat{C}_k = \hat{A}_k \hat{B}_k = P_A(\omega_{n + m - 1}^k) P_B(\omega_{n + m - 1}^k) = P_C(\omega_{n + m - 1}^k)\)
- Use the inverse transformation for the DFT, called the IDFT, to recover the coefficients $\langle C_0, C_1, \dots, C_{n + m - 2} \rangle$ of the product polynomial $P_C(x)$ from the sequence $\langle \hat{C}_0, \hat{C}_1, \dots, \hat{C}_{n + m - 2} \rangle$ of its values $\hat{C}_k = P_C(\omega_{n + m - 1}^k)$ at the roots of unity of order $n + m - 1$
The Fast Fourier Transform
Given a sequence $A = \langle A_0, A_1, \dots, A_{n - 1} \rangle$ we want to compute its DFT. This amounts to finding the values of $P_A(x)$ for all $x = \omega_n^k$, $0 \le k \le n - 1$.
The main idea of the Fast Fourier Transform (FFT) algorithm is to divide-and-conquer by splitting the polynomial $P_A(x)$ into the even powers and the odd powers:
\[\begin{array}{lcl} P_A(x) & = & (A_0 + A_2 x^2 + A_4 x^4 + \dots + A_{n - 2} x^{n - 2}) + (A_1 x + A_3 x^3 + \dots + A_{n - 1} x^{n - 1}) \\ & = & A_0 + A_2 x^2 + A_4 (x^2)^2 + \dots + A_{n - 2} (x^2)^{n/2 - 1} + x(A_1 + A_3 x^2 + A_5 (x^2)^2 + \dots + A_{n - 1} (x^2)^{n/2 - 1}) \end{array}\]We then define
\[A^{[0]} = \underbrace{\langle A_0, A_2, A_4, \dots, A_{n - 2} \rangle}_{\text{even coefficients of $P_A(x)$}} \qquad A^{[1]} = \underbrace{\langle A_1, A_3, A_5, \dots, A_{n - 1} \rangle}_{\text{odd coefficients of $P_A(x)$}}\]and construct the polynomials
\[\begin{array}{lcl} P_{A^{[0]}}(y) & = & A_0 + A_2 y + A_4 y^2 + \dots + A_{n - 2} y^{n / 2 - 1} \\ P_{A^{[1]}}(y) & = & A_1 + A_3 y + A_5 y^2 + \dots + A_{n - 1} y^{n / 2 - 1} \\ \end{array}\]so $P_A(x) = P_{A^{[0]}}(x^2) + x P_{A^{[1]}}(x^2)$.
Note that the number of coefficients of the polynomials $P_{A^{[0]}}(y)$ and $P_{A^{[1]}}(y)$ is $n/2$ each, while the number of coefficients of the polynomial $P_A(x)$ is $n$.
As such, we have reduced evaluation of our polynomial $P_A(x)$ with $n$ coefficients at inputs $x = \omega_n^0, x = \omega_n^1, x = \omega_n^2, \dots, x = \omega_n^{n - 1}$ to evaluation of two polynomials $P_{A^{[0]}}(y)$ and $P_{A^{[1]}}(y)$ each with $n/2$ coefficients, at points $y = x^2$ for the same values of inputs $x$.
However, as $x$ ranges through the values ${ \omega_n^0, \omega_n^1, \dots, \omega_n^{n - 1} }$, the value of $y = x^2$ ranges through ${ \omega_{n/2}^0, \omega_{n/2}^1, \omega_{n/2}^2, \dots, \omega_{n/2}^{n - 1} }$, and there are only $n/2$ distinct such values.
Once we get these $n/2$ values of $P_{A^{[0]}}(x^2)$ and $P_{A^{[1]}}(x^2)$ we need $n$ additional multiplications with the numbers $\omega_n^k$ to obtain the values of
\[\begin{array}{lcl} P_A(\omega_n^k) & = & P_{A^{[0]}}((\omega_n^k)^2) + \omega_n^k \cdot P_{A^{[1]}}((\omega_n^k)^2) \\ & = & P_{A^{[0]}}(\omega_{n/2}^k) + \omega_n^k \cdot P_{A^{[1]}}(\omega_{n/2}^k) \\ \end{array}\]Thus, we have reduced a problem of size $n$ to two such problems of size $n/2$, plus a linear overhead.
A Simplification
By the Cancellation Lemma, $\omega_n^{n/2} = \omega_{(n/2) \cdot 2}^{(n/2) \cdot 1} = \omega_2 = -1$.
Thus, $\omega_n^{k + n/2} = \omega_n^{n/2} \omega_n^k = \omega_2 \omega_n^k = -\omega_n^k$.
We can now simplify evaluation of $P_A(\omega_n^k) = P_{A^{[0]}}(\omega_{n/2}^k) + \omega_n^k P_{A^{[1]}}(\omega_{n/2}^k)$ for $n/2 \le k < n$ as follows:
- Let $k = n/2 + m$ where $0 \le m < n/2$
- Then: \(\begin{array}{lcl} P_A(\omega_n^{n/2 + m}) & = & P_{A^{[0]}}(\omega_{n/2}^{n/2+m}) + \omega_n^{n/2 + m} P_{A^{[1]}}(\omega_{n/2}^{n/2 + m}) \\ & = & P_{A^{[0]}}(\omega_{n/2}^{n/2} \omega_{n/2}^m) + \omega_n^{n/2} \omega_n^m P_{A^{[1]}}(\omega_{n/2}^{n/2} \omega_{n/2}^m) \\ & = & P_{A^{[0]}}(\omega_{n/2}^m) - \omega_n^m P_{A^{[1]}}(\omega_{n/2}^m) \\ \end{array}\)
Compare this with $P_A(\omega_n^m) = P_{A^{[0]}}(\omega_{n/2}^m) + \omega_n^m P_{A^{[1]}}(\omega_{n/2}^m)$ for $0 \le m < n/2$.
So we can replace evaluations of $P_A(\omega_n^k) = P_{A^{[0]}}(\omega_{n/2}^k) + \omega_n^k P_{A^{[1]}}(\omega_{n/2}^k)$ for $k = 0$ to $k = n - 1$, with such evaluations only for $k = 0$ to $k = n/2 - 1$, and just let for $k = 0$ to $k = n/2 - 1$
\[\begin{array}{lcl} P_A(\omega_n^k) & = & P_{A^{[0]}}(\omega_{n/2}^k) + \omega_n^k P_{A^{[1]}}(\omega_{n/2}^k) \\ P_A(\omega_n^{n/2 + k}) & = & P_{A^{[0]}}(\omega_{n/2}^k) - \omega_n^k P_{A^{[1]}}(\omega_{n/2}^k) \\ \end{array}\]The FFT runs in $\Theta(n \log n)$ time.
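The recursion above translates directly into the following Python sketch (it follows the document's convention $\omega_n = e^{i 2\pi/n}$, assumes the length of the coefficient list is a power of 2, and leaves any zero-padding to the caller; `cmath` supplies the complex exponentials):

```python
import cmath

def fft(A):
    """Return the DFT <P_A(w_n^0), ..., P_A(w_n^{n-1})> of the coefficient list A.
    Assumes len(A) is a power of 2."""
    n = len(A)
    if n == 1:
        return list(A)
    even = fft(A[0::2])                       # values of P_{A^[0]} at the roots of order n/2
    odd = fft(A[1::2])                        # values of P_{A^[1]} at the roots of order n/2
    result = [0] * n
    for k in range(n // 2):
        w = cmath.exp(2j * cmath.pi * k / n)  # w_n^k
        result[k] = even[k] + w * odd[k]
        result[k + n // 2] = even[k] - w * odd[k]
    return result
```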
Computing Convolutions
To compute the convolution $C = A \ast B$, proceed in exactly the same manner as for polynomial multiplication above, this time starting with the sequences $A = \langle A_0, \dots, A_{n - 1} \rangle$ and $B = \langle B_0, \dots, B_{n - 1} \rangle$ (each padded with zeros to length $2n - 1$).
That is:
1. Convert the sequences to polynomial form $P_A(x)$ and $P_B(x)$ (in $\mathcal{O}(n)$ time)
2. Compute the DFT of $P_A(x)$ and $P_B(x)$ (in $\mathcal{O}(n \log n)$ time)
3. Multiply the two DFT sequences together point-wise (in $\mathcal{O}(n)$ time)
4. Perform the IDFT on the result from step (3) (in $\mathcal{O}(n \log n)$ time)
Overall, the convolution $C = A \ast B$ can be computed in $\mathcal{O}(n \log n)$ time.
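In practice the whole pipeline fits in a few lines using a library FFT; here is a sketch with NumPy (padding both sequences to the full output length and rounding at the end, on the assumption that the inputs are integer sequences):

```python
import numpy as np

def convolve(A, B):
    """Linear convolution C = A * B computed via DFT, point-wise product and IDFT."""
    size = len(A) + len(B) - 1
    fa = np.fft.fft(A, size)            # DFT of A padded with zeros to the full length
    fb = np.fft.fft(B, size)            # DFT of B padded with zeros to the full length
    C = np.fft.ifft(fa * fb)            # IDFT of the point-wise product
    return np.rint(C.real).astype(int)  # round away floating point noise

# convolve([1, 2], [3, 4])  ->  array([ 3, 10,  8])
```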
Greedy Algorithms
A greedy algorithm is any algorithm that follows the problem-solving heuristic of making the locally optimal choice at each stage.
Activity Selection
Problem: Given a list of activities $a_i$ ($1 \le i \le n$) with starting times $s_i$ and finishing times $f_i$, where no two activities can take place simultaneously, find a maximum size subset of compatible activities.
Solution: Among the activities which do not conflict with the previously chosen activities, always choose the one with the earliest end time.
To prove optimality of this greedy solution we show that any optimal solution can be transformed into the greedy solution with equal number of activities:
- Find the first place where the chosen activity violates the greedy choice
- Show that replacing that activity with the greedy choice produces a non-conflicting selection with the same number of activities
- Continue in this manner till you “morph” your optimal solution into the greedy solution, thus proving the greedy solution is also optimal
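A short Python sketch of the greedy rule above (the tuple format and the treatment of an activity starting exactly when the previous one finishes as compatible are my own choices): sort by finishing time and keep every activity that does not clash with the last one chosen.

```python
def select_activities(activities):
    """activities: list of (start, finish) pairs; returns a maximum-size compatible subset."""
    chosen, last_finish = [], float("-inf")
    for start, finish in sorted(activities, key=lambda a: a[1]):  # earliest finishing first
        if start >= last_finish:          # does not conflict with previously chosen activities
            chosen.append((start, finish))
            last_finish = finish
    return chosen
```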
Minimising Job Lateness
Problem: We are given a start time $T_0$ and a list of jobs $a_i$ ($1 \le i \le n$), with duration times $t_i$ and deadlines $d_i$. Only one job can be performed at any time; all jobs have to be completed. If a job $a_i$ is completed at a finishing time $f_i > d_i$, then we say that it has incurred lateness $l_i = f_i - d_i$. Schedule all the jobs so that the lateness of the job with the largest lateness is minimised.
Solution: Ignore job durations and schedule jobs in the increasing order of deadlines.
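A minimal sketch of this earliest-deadline-first rule, assuming jobs are given as (duration, deadline) pairs (the function name is mine):

```python
def schedule_jobs(T0, jobs):
    """jobs: list of (duration, deadline); returns (schedule, maximum lateness)."""
    order = sorted(jobs, key=lambda j: j[1])          # increasing order of deadlines
    time, max_lateness = T0, 0
    for duration, deadline in order:
        time += duration                              # the job finishes at this time
        max_lateness = max(max_lateness, time - deadline)
    return order, max_lateness
```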
Tape Storage
Problem: We are given a list of $n$ files $f_i$ of lengths $l_i$ which have to be stored on a tape. Each file is equally likely to be needed. To retrieve a file, one must start from the beginning of the tape and scan it until the file is found and read. Order the files on the tape so that the average (expected) retrieval time is minimised.
Solution: If the files are stored in order $l_1, l_2, \dots, l_n$ then the expected time is proportional to
\[l_1 + (l_1 + l_2) + (l_1 + l_2 + l_3) + \dots + (l_1 + l_2 + l_3 + \dots + l_n) = n l_1 + (n - 1) l_2 + (n - 2) l_3 + \dots + 2 l_{n - 1} + l_n\]This is minimised if $l_1 \le l_2 \le l_3 \le \dots \le l_n$.
Kruskal’s Algorithm
Kruskal’s algorithm is a minimum-spanning-tree algorithm which finds an edge of the least possible weight that connects any two trees in the forest.
- We order the edges $E$ in a non-decreasing order with respect to their weights
- We build a tree by adding edges, one at each step of construction
- An edge $e_i$ is added at a round $i$ of construction whenever its addition does not introduce a cycle in the graph constructed thus far
- If adding an edge does introduce a cycle, that edge is discarded
- The process terminates when the list of all edges has been exhausted
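A compact Python sketch of Kruskal's algorithm, using a union–find structure to detect whether adding an edge would create a cycle (the edge format and helper names are mine):

```python
def kruskal(n, edges):
    """n: number of vertices (0..n-1); edges: list of (weight, u, v).
    Returns the list of edges in a minimum spanning tree (or forest)."""
    parent = list(range(n))

    def find(x):                         # representative of x's component
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):        # non-decreasing order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                     # adding (u, v) does not introduce a cycle
            parent[ru] = rv
            tree.append((w, u, v))
    return tree
```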
$k$-clustering Of Maximum Spacing
Problem: Given a complete graph $G$ with weighted edges representing distances between the two vertices, partition the vertices of $G$ into $k$ disjoint subsets so that the minimal distance between two points belonging to different sets of the partition is as large as possible. Thus, we want a partition into $k$ disjoint sets which are as far apart as possible.
Solution: Sort the edges in an increasing order and start performing the usual Kruskal’s algorithm for building a minimal spanning tree, but stop when you obtain $k$ connected components, rather than a single spanning tree.
Dynamic Programming
The main idea of Dynamic Programming is to build an optimal solution to a problem from optimal solutions for (carefully chosen) smaller size subproblems.
Subproblems are chosen in a way which allows recursive construction of optimal solutions to such subproblems from optimal solutions to smaller size subproblems.
Efficiency of DP comes from the fact that the sets of subproblems needed to solve larger problems heavily overlap; each subproblem is solved only once and its solution is stored in a table for multiple uses in solving larger problems.
Activity Selection
Problem: Given a list of activities $a_i, 1 \le i \le n$ with starting times $s_i$ and finishing times $f_i$ where no two activities can take place simultaneously, find a subset of compatible activities of maximal total duration.
Solution: We start by sorting these activities by their finishing time into a non-decreasing sequence, so we will assume that $f_1 \le f_2 \le \dots \le f_n$.
For every $i \le n$ we solve the following subproblem $P(i)$:
- Find a subsequence $\sigma_i$ of the sequence of activities $S_i = \langle a_1, a_2, \dots, a_i \rangle$ such that:
  1. $\sigma_i$ consists of non-overlapping activities
  2. $\sigma_i$ ends with activity $a_i$
  3. $\sigma_i$ is of maximal total duration among all subsequences of $S_i$ which satisfy (1) and (2)
Let $T(i)$ be the total duration of the optimal solution $S(i)$ of the subproblem $P(i)$. For $S(1)$ we choose $a_1$, thus $T(1) = f_1 - s_1$.
Recursion: assuming that we have solved subproblems for all $j < i$ and stored them in a table, we let
\[T(i) = \max \{ T(j) + f_i - s_i : j < i \land f_j < s_i \}\]taking $T(i) = f_i - s_i$ if no such $j$ exists. In the table, besides $T(i)$, we also store the $j$ for which the above max is achieved.
We now let $T_{max} = \max { T(i) : i \le n }$.
We can now reconstruct the optimal sequence which solves our problem from the table of partial solutions, because in the $i$th slot of the table, besides $T(i)$ we also store $j$ such that the optimal solution of $P(i)$ extends the optimal solution of subproblem $P(j)$.
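A Python sketch of this recursion, including the reconstruction from the table (the function name is mine; it assumes at least one activity, and here an activity is allowed to start exactly when the previous one finishes):

```python
def max_duration_selection(activities):
    """activities: list of (start, finish); returns (best total duration, chosen activities)."""
    acts = sorted(activities, key=lambda a: a[1])          # sort by finishing time
    if not acts:
        return 0, []
    n = len(acts)
    T = [f - s for s, f in acts]                           # P(i) solved with a_i alone
    pred = [None] * n                                      # j achieving the max, for reconstruction
    for i in range(n):
        for j in range(i):
            if acts[j][1] <= acts[i][0] and T[j] + acts[i][1] - acts[i][0] > T[i]:
                T[i] = T[j] + acts[i][1] - acts[i][0]
                pred[i] = j
    best = max(range(n), key=T.__getitem__)                # index achieving T_max
    chosen, i = [], best
    while i is not None:
        chosen.append(acts[i]); i = pred[i]
    return T[best], chosen[::-1]
```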
Longest Increasing Subsequence
Problem: Given a sequence of $n$ real numbers $A[1..n]$, determine a subsequence (not necessarily contiguous) of maximum length in which the values in the subsequence are strictly increasing.
Solution: For each $i \le n$ we solve the following subproblem $P(i)$:
- Find a subsequence of the sequence $A[1..i]$ of maximum length in which the values are strictly increasing and which ends with $A[i]$.
Recursion: Assume we have solved all the subproblems for $j < i$. We now look for all $A[m]$ such that $m < i$ and such that $A[m] < A[i]$. Among those we pick $m$ which produced the longest increasing subsequence ending with $A[m]$ and extend it with $A[i]$ to obtain the longest increasing subsequence which ends with $A[i]$.
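A direct Python sketch of this quadratic recursion (an $\mathcal{O}(n \log n)$ variant exists, but this follows the subproblem structure above; the function name is mine):

```python
def longest_increasing_subsequence(A):
    """Return one longest strictly increasing subsequence of A."""
    n = len(A)
    if n == 0:
        return []
    length = [1] * n                 # length[i]: longest increasing subsequence ending at A[i]
    pred = [None] * n                # predecessor index, for reconstruction
    for i in range(n):
        for m in range(i):
            if A[m] < A[i] and length[m] + 1 > length[i]:
                length[i], pred[i] = length[m] + 1, m
    i = max(range(n), key=length.__getitem__)
    result = []
    while i is not None:
        result.append(A[i]); i = pred[i]
    return result[::-1]
```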
Integer Knapsack Problem (Duplicate Items Allowed)
Problem: You have $n$ types of items, all items of kind $i$ are identical and of weight $w_i$ and value $v_i$. You also have a knapsack of capacity $C$. Choose a combination of available items which all fit in the knapsack and whose value is as large as possible. You can take any number of items of each kind.
Solution: DP recursion on the capacity $C$ of the knapsack. We build a table of optimal solutions for all knapsacks of capacities $i \le C$.
Assume we have solved the problem for all knapsacks of capacities $j < i$. We now look at the optimal solutions $opt(i - w_m)$ for all knapsacks of capacities $i - w_m$, for all $1 \le m \le n$ such that $w_m \le i$. Choose the $m$ for which $opt(i - w_m) + v_m$ is the largest.
Add to such optimal solution for the knapsack of size $i - w_m$ item $m$ to obtain a packing of a knapsack of size $i$ of the highest possible value.
Thus, $opt(i) = \max { opt(i - w_m) + v_m : 1 \le m \le n, \ w_m \le i }$. After $C$ many steps we obtain $opt(C)$ which is what we need.
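A sketch of this recursion in Python (items given as (weight, value) pairs; opt is stored as a list indexed by capacity):

```python
def unbounded_knapsack(C, items):
    """items: list of (weight, value); any number of copies of each item may be taken.
    Returns opt(C), the largest achievable value for capacity C."""
    opt = [0] * (C + 1)
    for i in range(1, C + 1):                 # build solutions for capacities 1..C
        for w, v in items:
            if w <= i:                        # the item fits into a knapsack of capacity i
                opt[i] = max(opt[i], opt[i - w] + v)
    return opt[C]
```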
Integer Knapsack Problem (Duplicate Items Not Allowed)
Problem: You have $n$ items (some of which can be identical); item $I_i$ is of weight $w_i$ and value $v_i$. You also have a knapsack of capacity $C$. Choose a combination of available items which all fit in the knapsack and whose value is as large as possible.
Solution: This is an example of a “2D” recursion. We will be filling a table of size $n \times C$, row by row. Subproblems $P(i, c)$ for all $i \le n$ and $c \le C$ will be of the form: choose from items $I_1, I_2, \dots, I_i$ a subset which fits in a knapsack of capacity $c$ and is of the largest possible total value.
Fix now $i \le n$ and $c \le C$ and assume we have solved the subproblems for:
- All $j < i$ and all knapsacks of capacities from 1 to $C$
- For $i$ we have solved the problems for all capacities $d < c$
We now have two options: either we take item $I_i$ or we do not. So we look at the optimal solutions $opt(i - 1, c - w_i)$ and $opt(i - 1, c)$.
If $w_i \le c$ and $opt(i - 1, c - w_i) + v_i > opt(i - 1, c)$ then we set $opt(i, c) = opt(i - 1, c - w_i) + v_i$, else we set $opt(i, c) = opt(i - 1, c)$.
The final solution will be given by $opt(n, C)$.
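The corresponding 2D table in Python (rows indexed by items, columns by capacities, exactly as in the recursion above; the function name is mine):

```python
def knapsack(C, items):
    """items: list of (weight, value), each usable at most once.
    Returns opt(n, C), the largest total value that fits in capacity C."""
    n = len(items)
    opt = [[0] * (C + 1) for _ in range(n + 1)]     # opt[i][c]: best value using items 1..i
    for i in range(1, n + 1):
        w, v = items[i - 1]
        for c in range(C + 1):
            opt[i][c] = opt[i - 1][c]               # option 1: do not take item i
            if w <= c and opt[i - 1][c - w] + v > opt[i][c]:
                opt[i][c] = opt[i - 1][c - w] + v   # option 2: take item i
    return opt[n][C]
```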
Bellman Ford Algorithm
Problem: Given a directed weighted graph $G = (V, E)$ with weights which can be negative, but without cycles of negative total weight and a vertex $s \in V$, find the shortest path from vertex $s$ to every other vertex $t$.
Solution: Since there are no negative weight cycles, a shortest path need not contain a cycle, because excising a cycle produces a path that is no longer. Thus, there is always a shortest path with at most $|V| - 1$ edges.
Subproblem: For every $v \in V$ and every $i$ ($1 \le i \le n - 1$, where $n = |V|$), let $opt(i, v)$ be the length of a shortest path from $s$ to $v$ which contains at most $i$ edges.
Our goal is to find for every vertex $t \in G$ the value of $opt(n - 1, t)$ and the path which achieves such a length.
Let us denote the length of the shortest path from $s$ to $v$ among all paths which contain at most $i$ edges by $opt(i, v)$, and let $pred(i, v)$ be the immediate predecessor of vertex $v$ on such shortest path.
Recursion:
\[opt(i, v) = \min(opt(i - 1, v), \min_{p \in V} \{ opt(i - 1, p) + w(e(p, v)) \})\] \[pred(i, v) = \begin{cases} pred(i - 1, v) & \text{if} \ \min_{p \in V} \{ opt(i - 1, p) + w(e(p, v)) \} \ge opt(i - 1, v) \\ \arg \min_{p \in V} \{ opt(i - 1, p) + w(e(p, v)) \} & \text{otherwise} \end{cases}\]Here $w(e(p, v))$ is the weight of the edge $e(p, v)$ from vertex $p$ to vertex $v$.
This algorithm produces shortest paths from $s$ to every other vertex in the graph.
The method employed is sometimes called “relaxation”, because we progressively relax the additional constraint on how many edges the shortest paths can contain.
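A minimal Python sketch of this relaxation scheme (the graph is given as a list of weighted edges, and the distances for round $i$ are computed from those of round $i - 1$):

```python
def bellman_ford(n, edges, s):
    """n vertices 0..n-1, edges: list of (u, v, w), source s.
    Returns the shortest path lengths from s (assumes no negative weight cycles)."""
    INF = float("inf")
    opt = [INF] * n
    opt[s] = 0
    for _ in range(n - 1):                  # at most |V| - 1 edges on any shortest path
        new = opt[:]                        # opt(i, .) computed from opt(i - 1, .)
        for u, v, w in edges:
            if opt[u] + w < new[v]:
                new[v] = opt[u] + w
        opt = new
    return opt
```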
Floyd Warshall Algorithm
Problem: Let $G = (V, E)$ be a directed weighted graph where $V = { v_1, v_2, \dots, v_n }$ and where the weights $w(e(v_p, v_q))$ of edges $e(v_p, v_q)$ can be negative, but there are no negative weight cycles. Find the shortest paths between all pairs of vertices.
Solution: we can use a somewhat similar idea to obtain the shortest paths from every vertex $v_p$ to every vertex $v_q$ (including back to $v_p$).
Let $opt(k, v_p, v_q)$ be the length of the shortest path from a vertex $v_p$ to a vertex $v_q$ such that all intermediate vertices are among vertices ${ v_1, v_2, \dots, v_k }, (1 \le k \le n)$.
Then $opt(k, v_p, v_q) = \min { opt(k - 1, v_p, v_q), opt(k - 1, v_p, v_k) + opt(k - 1, v_k, v_q) }$.
Thus, we gradually relax the constraint that the intermediary vertices have to belong to ${ v_1, v_2, \dots, v_k }$.
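In Python, with the distance matrix updated in place (the usual implementation keeps only one copy of opt, since row and column $k$ do not change during round $k$):

```python
def floyd_warshall(dist):
    """dist: n x n matrix with dist[p][q] = weight of edge (p, q), 0 on the diagonal
    and float('inf') where there is no edge. Returns all-pairs shortest path lengths."""
    n = len(dist)
    d = [row[:] for row in dist]
    for k in range(n):                      # allow v_k as an intermediate vertex
        for p in range(n):
            for q in range(n):
                if d[p][k] + d[k][q] < d[p][q]:
                    d[p][q] = d[p][k] + d[k][q]
    return d
```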
Maximum Flow
Flow Networks
A flow network $G = (V, E)$ is a directed graph in which each edge $e = (u, v) \in E$ has a positive integer capacity $c(u, v) > 0$. There are two distinguished vertices: a source $s$ and a sink $t$; no edge leaves the sink and no edge enters the source.
A flow in $G$ is a function $f : E \to \mathbb{R}^+$, $f(u, v) \ge 0$, which satisfies
- Capacity constraint: for all edges $e(u, v) \in E$ we require $f(u, v) \le c(u, v)$
- Flow conservation: for all $v \in V - { s, t }$ we require $\sum_{(u, v) \in E} f(u, v) = \sum_{(v, w) \in E} f(v, w)$
The value of the flow is defined as $|f| = \sum_{v: (s, v) \in E} f(s, v)$.
Residual Flow Networks
The residual flow network for a flow network with some flow in it is the network with the leftover capacities. Each edge of the original network has a leftover capacity for more flow equal to the capacity of the edge minus the flow through the edge.
If the flow through an edge is equal to the capacity of the edge, this edge disappears in the residual network.
New “virtual” edges appear in the opposite direction of an original edge with some flow in it (unless there was already an edge in the opposite direction). They represent the possibility of reducing the flow through the original edge; thus their capacity is equal to the flow through the original edge (or, if there was already an edge in the opposite direction, the capacity of such an edge is increased by the amount of that flow).
Residual flow networks can be used to increase the total flow through the network by adding an augmenting path. The capacity of an augmenting path is the capacity of its “bottleneck” edge, i.e. the capacity of the smallest capacity edge on that path.
We can now recalculate the flow through all edges along the augmenting path by adding the additional flow through the path if the flow through the augmenting path is in the same direction as the original flow, and subtracting if in opposite direction.
Ford Fulkerson Algorithm
The Ford Fulkerson algorithm is used to find maximal flow in a flow network. We keep adding flow through new augmenting paths for as long as it is possible. When there are no more augmenting paths, we have achieved the largest possible flow in the network.
Cuts
A cut in a flow network is any partition of the vertices of the underlying graph into two subsets $S$ and $T$ such that:
- $S \cup T = V$
- $S \cap T = \emptyset$
- $s \in S \land t \in T$
The capacity $c(S, T)$ of a cut $(S, T)$ is the sum of capacities of all edges leaving $S$ and entering $T$, i.e.
\[c(S, T) = \sum_{(u, v) \in E} \{ c(u, v) : u \in S \land v \in T \}\]Note that the capacities of edges going in the opposite direction, i.e. from $T$ to $S$ do not count.
The flow through a cut $f(S, T)$ is the total flow through edges from $S$ to $T$ minus the total flow through edges from $T$ to $S$:
\[f(S, T) = \sum_{(u, v) \in E} \{ f(u, v) : u \in S \land v \in T \} - \sum_{(u, v) \in E} \{ f(u, v) : u \in T \land v \in S \}\]Clearly $f(S, T) \le c(S, T)$ because for every edge $(u, v) \in E$ we assumed $f(u, v) \le c(u, v)$, and $f(u, v) \ge 0$.
The maximal amount of flow in a flow network is equal to the capacity of the cut of minimal capacity.
Edmonds-Karp Max Flow Algorithm
The Edmonds-Karp algorithm improves the Ford Fulkerson algorithm in a simple way: always choose the shortest path from the source $s$ to the sink $t$, where the “shortest path” means the fewest number of edges, regardless of their capacities (i.e. each edge has the same unit weight).
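A compact Python sketch of this idea (capacities in a dictionary-of-dictionaries; the BFS always finds a fewest-edges augmenting path, and residual capacities are updated in place; the function and variable names are mine):

```python
from collections import deque

def edmonds_karp(capacity, s, t):
    """capacity: dict u -> dict v -> c(u, v). Returns the value of a maximum s-t flow."""
    # residual network: forward capacities plus (initially zero) reverse edges
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u in capacity:
        for v in capacity[u]:
            residual.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:          # BFS for a shortest augmenting path
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:                       # no augmenting path left: flow is maximum
            return flow
        path, v = [], t                           # recover the path and its bottleneck capacity
        while parent[v] is not None:
            path.append((parent[v], v)); v = parent[v]
        aug = min(residual[u][v] for u, v in path)
        for u, v in path:                         # update residual capacities along the path
            residual[u][v] -= aug
            residual[v][u] += aug
        flow += aug
```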
Bipartite Graphs
A bipartite graph is a graph whose vertices can be split into two subsets, $L$ and $R$ such that every edge $e \in E$ has one end in the set $L$ and the other in the set $R$.
A matching in a graph $G$ is a subset $M$ of all edges $E$ such that each vertex of the graph belongs to at most one of the edges in the matching $M$.
A maximum matching in a bipartite graph $G$ is a matching containing the largest possible number of edges.
Max Flow With Vertex Capacities
Sometimes not only the edges but also the vertices $v_i$ of the flow graph might have capacities $C(v_i)$, which limit the total throughput of the flow coming to the vertex (and, consequently, also leaving the vertex):
\[\sum_{e(u, v) \in E} f(u, v) = \sum_{e(v, w) \in E} f(v, w) \le C(v)\]Such a case is reduced to the case where only edges have capacities by splitting each vertex $v$ with limited capacity $C(v)$ into two vertices $v_{in}$ and $v_{out}$, so that all edges coming into $v$ now enter $v_{in}$, all edges leaving $v$ now leave $v_{out}$, and $v_{in}$ is joined to $v_{out}$ by a new edge of capacity $C(v)$.
String Matching Algorithms
Assume that you want to find out if a string $B = b_0 b_1 \dots b_{m - 1}$ appears as a (contiguous) substring of a much longer string $A = a_0 a_1 \dots a_{n - 1}$. The “naive” string matching algorithm does not work well if $B$ is much longer than what can fit in a single register; we need something cleverer.
Rabin-Karp Algorithm
This algorithm involves computing a hash value for the string $B$.
We assume that the strings $A$ and $B$ are in an alphabet $\mathcal{A}$ with $d$ many symbols in total. Thus, we can identify each string with a sequence of integers by mapping each symbol $s_i$ into a corresponding integer $i$:
\[\mathcal{A} = \{ s_0, s_1, s_2, \dots, s_{d - 1} \} \rightarrow \{ 0, 1, 2, \dots, d - 1 \}\]For any string $B$ we can now associate an integer whose digits in base $d$ are integers corresponding to each symbol in $B$:
\[h(B) = h(b_0 b_1 b_2 \dots b_{m - 1}) = d^{m - 1} b_0 + d^{m - 2} b_1 + \dots + d \cdot b_{m - 2} + b_{m - 1}\]This can be done efficiently using Horner’s rule:
\[h(B) = b_{m - 1} + d(b_{m - 2} + d(b_{m - 3} + d(b_{m - 4} + \dots + d(b_1 + d \cdot b_0))) \dots )\]Next, we choose a large prime number $p$ such that $(d + 1)p$ still fits into a single register, and define the hash value of $B$ as $H(B) = h(B) \mod p$.
Recall that $A = a_0 a_1 a_2 a_3 \dots a_s a_{s + 1} \dots a_{s + m - 1} \dots a_{n - 1}$ where $n \gg m$. We want to find efficiently all $s$ such that the string of length $m$ of the form $a_s a_{s + 1} \dots a_{s + m - 1}$ and the string $b_0 b_1 \dots b_{m - 1}$ are equal.
For each contiguous substring $A_s = a_s a_{s + 1} \dots a_{s + m - 1}$ of string $A$ we also compute its hash value as:
\[h(A_s) = (d^{m - 1} a_s + d^{m - 2} a_{s + 1} + \dots + d^1 a_{s + m - 2} + a_{s + m - 1}) \mod p\]We can now compare the hash values $h(B)$ and $h(A_s)$ and do a symbol by symbol matching only if $h(B) = h(A_s)$.
Clearly, such an algorithm would be faster than the naive symbol-by-symbol comparison only if we can compute the hash values of substrings $A_s$ faster than what it takes to compare strings $B$ and $A_s$ character by character.
This is where recursion comes into play. We do not have to compute the hash value $h(A_{s + 1})$ of $A_{s + 1} = a_{s + 1} a_{s + 2} \dots a_{s + m}$ from scratch; instead, we can compute it efficiently from the hash value $h(A_s)$ of $A_s$ as follows.
By multiplying both sides of $h(A_s)$ by $d$ we obtain:
\[\begin{array}{lcl} (d \cdot h(A_s)) \mod p & = & (d^m a_s + d^{m - 1} a_{s + 1} + \dots + d \cdot a_{s + m - 1}) \mod p \\ & = & (d^m a_s + (d^{m - 1} a_{s + 1} + \dots + d^2 a_{s + m - 2} + d \cdot a_{s + m - 1} + a_{s + m}) \mod p - a_{s + m}) \mod p \\ & = & (d^m a_s + h(A_{s + 1}) - a_{s + m}) \mod p \end{array}\]Consequently, $h(A_{s + 1}) = (d \cdot h(A_s) - d^m a_s + a_{s + m}) \mod p$.
Note that $(d^m a_s) \mod p = ((d^m \mod p) a_s) \mod p$, and that the value of $d^m \mod p$ can be precomputed and stored. Also, $(-d^m a_s + a_{s + m}) \mod p < p$.
Thus, since $h(A_s) < p$ we obtain $d \cdot h(A_s) + (-d^m a_s + a_{s + m}) \mod p < (d + 1)p$.
Thus, since we chose $p$ such that $(d + 1)p$ fits in a register, all the values and the intermediate results for the above expression also fit in a single register. So for every $s$ except $s = 0$, the value of $h(A_s)$ can be computed in constant time independent of the length of the strings $A$ and $B$.
So, we first compute $h(B)$ and $h(A_0)$ using Horner’s rule. Subsequent values of $h(A_s)$ for $s > 0$ are computed in constant time using the above recursion. $h(A_s)$ is compared with $h(B)$ and if they are equal then the strings $A_s$ and $B$ are compared by brute force character by character to see if they are also equal.
Since $p$ was chosen large, false positives, where $h(A_s) = h(B)$ but $A_s \ne B$, are very unlikely, which makes the algorithm run fast in practice.
def rabin_karp(s, pattern, d=256, p=2_147_483_647):
    n, m = len(s), len(pattern)
    if m > n:
        return -1                                  # not found
    dm = pow(d, m - 1, p)                          # d^(m-1) mod p, precomputed
    hp = ht = 0
    for i in range(m):                             # initial hashes via Horner's rule
        hp = (d * hp + ord(pattern[i])) % p
        ht = (d * ht + ord(s[i])) % p
    for i in range(n - m + 1):
        if hp == ht and s[i:i + m] == pattern:     # compare characters only on a hash match
            return i
        if i < n - m:                              # rolling hash: h(A_{s+1}) from h(A_s)
            ht = (d * (ht - ord(s[i]) * dm) + ord(s[i + m])) % p
    return -1                                      # not found
String Matching Finite Automata
A string matching finite automaton for a pattern $B$ with $k$ symbols has $k + 1$ many states $0, 1, \dots, k$, which correspond to the number of characters matched so far, and a transition function $\delta(s, c)$, where $s$ is a state and $c$ is the character read.
We first look at the case when such $\delta(s, c)$ is given by a pre-constructed table. For example, for the pattern $B = ababaca$, the table defining $\delta(s, c)$ would then be:
| state | a | b | c | next character of $B$ |
|---|---|---|---|---|
| 0 | 1 | 0 | 0 | a |
| 1 | 1 | 2 | 0 | b |
| 2 | 3 | 0 | 0 | a |
| 3 | 1 | 4 | 0 | b |
| 4 | 5 | 0 | 0 | a |
| 5 | 1 | 4 | 6 | c |
| 6 | 7 | 0 | 0 | a |
| 7 | 1 | 2 | 0 | |
To compute $\delta$, or fill in the table we let $B_k$ denote the prefix of the string $B$ consisting of the first $k$ characters of $B$.
If we are at a state $k$ this means that so far we have matched the prefix $B_k$; if we now see an input character $a$, then $\delta(k, a)$ is the largest $m$ such that the prefix $B_m$ of string $B$ is a suffix of the string $B_k a$
Thus, if $a$ happens to be $B[k + 1]$, then $m = k + 1$ and so $\delta(k, a) = k + 1$ and $B_k a = B_{k + 1}$.
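A Python sketch that builds $\delta$ directly from this definition (the largest $m$ such that $B_m$ is a suffix of $B_k a$) and then runs the automaton over a text; the function names are mine, the brute-force table construction is not the most efficient possible, but the matching pass itself is linear and assumes the text only uses characters from the given alphabet:

```python
def build_delta(pattern, alphabet):
    """delta[k][c] = largest m such that pattern[:m] is a suffix of pattern[:k] + c."""
    k_max = len(pattern)
    delta = [dict() for _ in range(k_max + 1)]
    for k in range(k_max + 1):
        for c in alphabet:
            word = pattern[:k] + c
            m = min(k + 1, k_max)
            while m > 0 and not word.endswith(pattern[:m]):
                m -= 1
            delta[k][c] = m
    return delta

def automaton_match(text, pattern, alphabet):
    """Return the starting positions of all occurrences of pattern in text."""
    delta, state, hits = build_delta(pattern, alphabet), 0, []
    for i, c in enumerate(text):
        state = delta[state][c]
        if state == len(pattern):
            hits.append(i - len(pattern) + 1)   # start index of a match
    return hits
```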