Written by Luka Kerr on June 9, 2018

### Reversing List

1. Swap the head and tail pointers, and set current to the new head
2. Untill current isn’t NULL, swap next and previous values, and reset current to the new current->next (current->prev)
/*
L: NULL<-1-><-2-><-3-><-4-><-5->NULL
*/

// If list is empty
return;
}

L->tail = tmp;

while(current != NULL) {
tmp = current->next;
current->next = current->prev;
current->prev = tmp;
current = current->next;
}


### Removing Value From List

1. First check if list is null. If so, return
2. Next iterate over the list and check whether the current value is equal to the value to remove
3. If it is, we have 3 conditions:
1. The current value is the head node
• If there is a next node, set its previous pointer to the current’s previous node
• If there is not a next node, set the head and tail both to NULL (emptying the list, since there is only one node to be removed)
2. The current value is the tail node
• If there is a previous node, set its next pointer to the currents next node
• If there is not a previous node, set the head and tail both to NULL (emptying the list, since there is only one node to be removed)
3. The current value is somewhere in the middle
• Set the previous node’s next pointer to the current’s next node
• Set the next node’s previous pointer to the current’s previous node
4. Update the current node
/*
to_remove: 2
L: NULL<-1-><-2-><-3-><-2-><-5->NULL
*/

// If list is empty
return;
}

// Set current to the first node

// Iterate over the list
while(current != NULL) {
// Value is the same as the value to be removed
if (current->value == to_remove) {
if (current->next != NULL) {
current->next->prev = current->prev;
} else {
L->tail = NULL;
}
} else if (current == L->tail) {
if (current->prev != NULL) {
current->prev->next = current->next;
} else {
L->tail = NULL;
}
} else {
// If this condition is reached, there are at least a head and tail
// so the current node must be in the middle somewhere
current->prev->next = current->next;
current->next->prev = current->prev;
}
}
current = current->next;
}


## Sorting

Properties

• Stable sorting algorithms
• If elements being sorted have the same value, then the order of those elements in the unsorted list, is the same in the sorted list
• Elements with the same value aren’t necessarily kept in the same order after being sorted
• Two main classes of sorting algorithms
• $O(n^2)$
• $O(n \ log(n))$

Selection Sort

• Steps:
• Finds the smallest element and swaps it with the leftmost unsorted element
• Then moves the min index one to the right
• Time complexity:
• Best and worst: $O(n^2)$
• Is unstable
// a[] is the array to sort, n is the number of items in a[]
void selectionSort(int a[], int n) {
for (int i = 0; i < n-1; i++) {
int min = i;
for (int j = i+1; j < n; j++) {
if (a[j] < a[min]) {
min = j;
}
}
swap(a[i], a[min]);
}
}


Bubble Sort

• Time complexity:
• Best and worst: $O(n^2)$
• Is stable
// a[] is the array to sort, n is the number of items in a[]
void bubbleSort(int a[], int n) {
for (int i = 0; i < n; i++) {
for (int j = 0; j < n - 1; j++) {
if (a[j] > a[j+1]) {
swap(a[j], a[j+1]);
}
}
}
}


Insertion Sort

• Steps:
• Take first element and treat as sorted array of length 1
• Take next element and insert into sorted part of array so the order is preserved
• Repeat until array is sorted
• Time complexity:
• Best: $O(n)$
• Worst: $O(n^2)$
• Is stable
// a[] is the array to sort, n is the number of items in a[]
void insertionSort(int a[], int n) {
for (int i = 1; i < n; i++) {
int backup = a[i];
int j;

for (j = i; j > 0; j--) {
if (backup > a[j-1]) break;
a[j] = a[j-1];
}
a[j] = backup;
}
}


Shell Sort

• Time complexity:
• Best: depends on the sequence of hvals, has been found to be $O(n^{3/2})$ or $O(n^{4/3})$
• Worst: $O(n^2)$
• Is unstable
void shellSort(int a[], int lo, int hi) {
int hvals[8] = {701, 301, 132, 57, 23, 10, 4, 1};

for (int g = 0; g < 8; g++) {
int h = hvals[g];
int start = lo + h;
for (int i = start; i < hi; i++) {
int val = a[i];
for (int j = i; j >= start && less(val,a[j-h]); j -= h) {
move(a, j, j-h);
}
a[j] = val;
}
}
}


Quicksort

• Steps:
• Choose an item to be a “pivot”
• Partition the array so that
• All elements to the left of the pivot are smaller than the pivot
• All elements to the right of the pivot are greater than the pivot
• Recursively sort each of the partitions
• Time complexity:
• Best: $O(n \ log(n))$
• Worst: $O(n^2)$
• Is unstable
void quicksort(Item a[], int lo, int hi) {
int i; // index of pivot
if (hi <= lo) return;
i = partition(a, lo, hi);
quicksort(a, lo, i-1);
quicksort(a, i+1, hi);
}

int partition(Item a[], int lo, int hi) {
Item v = a[lo];  // pivot
int  i = lo+1, j = hi;
for (;;) {
while (less(a[i],v) && i < j) i++;
while (less(v,a[j]) && j > i) j--;
if (i == j) break;
swap(a,i,j);
}
j = less(a[i],v) ? i : i-1;
swap(a,lo,j);
return j;
}


## Generics

Function pointers are used to pass a function with parameters into another function

// takes in a pointer to a function that takes one parameter: a void *
void func(void *n, void (*fp) (void *)) {
// use the function, passing in "n" - a void *
fp(n)
}


An more concrete example:

// generic min function
void *min(void *el, void *el2, int (*compare) (void *, void *)) {
if (compare(el, el2)) {
return el;
}
return el2;
}

// compares two elements, takes in two void *
// converts them to integers and compares them
int cmp(void *item, void *item2) {
return (*(int*)item < *(int*)item2);
}

int main(void) {
int el = 5;
int el2 = 20;
int minimum = *(int*)min(&el, &el2, cmp);
printf("min = %d\n", minimum); // prints 5;
}


## Trees

### Tree Data Structures

#### Binary Search Tree

Properties

• Each node has at most 2 children
• All values in any left subtree are $<$ root, all values in any right subtree are $>$ root
• Time complexity: $O(log(n))$
typedef struct Node *BST;
typedef struct Node {
int data;
BST left, right;
} Node;


#### Splay Trees

Properties

• A binary tree that is self balancing by rotation-in-search, makes search more expensive
• Insertion occurs at root
• Time complexity: $O(n)$
• There are 4 cases for rotating:

#### AVL Trees

Properties

• A binary tree that is balanced by rotating on insertion
• Each node stores its height in the tree, used for maintaining balance
• Time complexity: $O(log(n))$

#### 2-3-4 Trees

Properties

• Three kinds of nodes
• 2-nodes, with two children
• 3-nodes, two items with three children
• 4-nodes, three items and four children
• Each child node is smaller than the one its right, similar to binary trees
• Time complexity is dependent on nodes
• 2-nodes (worst case): $O(log_2(n))$
• 3-nodes: $O(log_3(n))$
• 4-nodes (best case): $O(log_4(n))$
• Insertion
• If node not full (order < 4) insert
• If node is full
• Split into 2-nodes as leaves
• Make middle item parent of leaves
• Insert item into appropriate leaf
• If parent is a 4-node, continue this process
typedef struct Node {
int order; // 2, 3 or 4
int data[3]; // items in node
struct Node *child[4]; // links to children
} Node;


#### Red-Black Trees

Properties

• Structured like a binary seach tree, but each node has a color (red or black)
• No two red nodes appear consecutively on any path
• A red node is part of the same 2-3-4 node as its parent (sibling)
• A black node is a child of a 2-3-4 node containing the parent
• All paths from the root to a leaf have the same number of black nodes
• Time complexity: $O(log_2(n))$
• Equivalent trees (top 2-3-4, bottom red-black)

### Tree Algorithms

#### Counting Nodes

To count nodes, we can use recursion:

int count(BSTree t) {
if (t == NULL) {
return 0
} else {
return 1 + count(t->left) + count(t->right);
}
}


This algorithm can be modified to count different characteristics of the tree, such as number of odd nodes:

int countOdd(BSTree t) {
if (t == NULL) {
return 0
} else {
// if odd ret = 1, if even ret = 0
int ret = (t->data % 2);
return ret + countOdd(t->left) + countOdd(t->right);
}
}


#### Counting Leaves

To count the number of leaves, we can use recursion:

int BSTreeNumLeaves(BSTree t) {
// base case
if (t == NULL) {
return 0;
}
// if left and right are NULL, the node is a leaf
if ((t->left == NULL) && (t->right == NULL)) {
return 1;
}
int numleft = BSTreeNumLeaves(t->left);
int numright = BSTreeNumLeaves(t->right);
return (numleft + numright);
}


#### Finding A Node

To find a node, we can use recursion:

int BSTreeFind(BSTree t, int v) {
if (t == NULL) { // base case: value doesn't exist in tree
return 0;
} else if (v < t->value) { // value is less, search on the left hand side
return BSTreeFind(t->left, v);
} else if (v > t->value) { // value is more, search on the right hand side
return BSTreeFind(t->right, v);
} else // found the value
return 1;
}


### Tree Traversal

Infix Order (LNR)

Visit left, root, right

void BSTreeInfix(BSTree t) {
if (t == NULL) {
return;
}
BSTreeInfix(t->left);
showBSTreeNode(t);
BSTreeInfix(t->right);
}


Postfix Order (LRN)

Visit left, right, root

void BSTreePostfix(BSTree t) {
if (t == NULL) {
return;
}
BSTreePostfix(t->left);
BSTreePostfix(t->right);
showBSTreeNode(t);
}


Prefix Order (NLR)

Visit root, left, right

void BSTreePrefix(BSTree t) {
if (t == NULL) {
return;
}
showBSTreeNode(t);
BSTreePrefix(t->left);
BSTreePrefix(t->right);
}


### Tree Rotations

Left Rotation

function rotateLeft(Tree n1):
if n1 is empty or right(n1) is empty:
return n1

Tree n2 = right(n1)
right(n1) = left(n2)
left(n2) = n1
return n2


Right Rotation

function rotateRight(Tree n1):
if n1 is empty or left(n1) is empty:
return n1

Tree n2 = left(n1)
left(n1) = right(n2)
right(n2) = n1
return n2


## Graphs

### Graph Data Structures

Properties of a Graph

• Has $V$ - a set of vertices
• Has $E$ - a set of edges
• Has at most $\dfrac{V(V - 1)}{2}$ edges (excluding loops)
• Is dense if $E$ is large, is sparse if $E$ is small

Terminology

• If an edge $e$ connects $v$ and $w$, $v$ and $w$ are adjacent
• $e$ is the incident on both $v$ and $w$
• The degree of a vertex $v$ is the number of edges incident on $e$
• A path is a sequence of vertices where each vertex has an edge to its predecessor
• A cycle is where the last vertex in a path is the same as the first vertex in that path

• The length of a path/cycle is the number of edges in that path/cycle
• A connected graph is when there is a path from each vertex to every other vertex
• A complete graph is if there is an edge from each vertex to every other vertex (e.g. $K_{5}$)

• A tree is a connected graph/subgraph with no cycles
• A spanning tree is a tree containing all vertices of the graph
• A clique is a complete subgraph

• An undirected graph is when an edge from $v$ to $w$ is the same edge from $w$ to $v$, and there are no self loops
• A directed graph is when an edge from $v$ to $w$ is not the same edge from $w$ to $v$, and there can be self loops

• A weighted graph is when each edge has a weight
• A multi-graph allows multiple edges between two vertices

### Array of Edges

Properties

• Edges are represented as an array of Edge values (struct containing pairs of vertices)
• Space efficient
• Time complexity:
• Initialisation: $O(1)$
• Insert edge: $O(E)$
• Delete edge: $O(E)$
• Find edge: $O(log(E))$ (if edges maintained in order)

typedef struct GraphRep {
Edge *edges; // array of edges
int nV;      // #vertices (numbered 0..nV-1)
int nE;      // #edges
int n;       // size of edge array
} GraphRep;


Properties

• Edges represented by a $V \times V$ matrix
• If sparse, space inefficient
• Time complexity:
• Initialisation: $O(V^2)$
• Insert edge: $O(1)$
• Delete edge: $O(1)$

typedef struct GraphRep {
int nV;      // #vertices
int nE;      // #edges
} GraphRep;


Properties

• Space efficient
• Time complexity:
• Initialisation: $O(V)$
• Insert edge: $O(1)$
• Delete edge: $O(E)$

typedef struct GraphRep {
Node **edges;  // array of lists
int nV;        // #vertices
int nE;        // #edges
} GraphRep;

typedef struct Node {
Vertex v;
struct Node *next;
} Node;


### Graph Algorithms

#### Finding A Path

Steps:

• Examine vertices adjacent to src
• If any of them are dest, then done
• Otherwise try vertices two edges from v
• Repeat looking further and further from v

Can be achieved with a depth first search or breadth first search.

// array of visited
int *visited;

int DFS(Graph g, int v, int dest) {
// set the vertex to visited
visited[v] = 1;

// for each vertex in the graph
for (int w = 0; w < g->nV; w++) {
// if the given vertex (v) is adjacent to the current vertex (w)
// and it hasn't been visited
if (adjacent(g, v, w) && visited[w] == 0) {
// set it to visited
visited[w] = 1;

// if w is the dest, return true
if (w == dest) {
return 1;
} else if (DFS(g, w, dest)) {
return 1;
}
}
}
return 0;
}

int hasPath(Graph g, Vertex src, Vertex dest) {
// first check if src and dest are connected by an edge
return 1;
}

// sets all in visited to 0
visited = calloc(g->nV, sizeof(int));

// perform depth first search
int exists = DFS(g, src, dest);

free(visited);

return exists;
}


#### Find And Dispay Shortest Path

void findPath(Graph g, Vertex src, Vertex dest) {
// array of visited
int *visited = calloc(g->nV, sizeof(int));

// array of path predecessor vertices
Vertex *path = malloc(g->nV * sizeof(Vertex));

Queue q = newQueue();
QueueJoin(q, src);
visited[src] = 1;

int isFound = 0;

while (!emptyQ(q) && !isFound) {
Vertex y, x = QueueLeave(q);
for (y = 0; y < g->nV; y++) {
if (!g->edges[x][y]) {
continue;
}
path[y] = x;
if (y == dest) {
isFound = 1;
break;
}
if (!visited[y]) {
QueueJoin(q, y);
visited[y] = 1;
}
}
}

if (isFound) {
// display path in dest..src order
for (Vertex v = dest; v != src; v = path[v]) {
printf("%d<-", v);
}
sprintf("%d\n", src);
}
}


#### Dijkstra’s Algorithm

• Steps:
• Create a set containing every vertex of the graph
• Set all distances to $\infty$ and all predecessors to -1, set the source distance to 0
• Visit each vertex ($v$) from each other current vertex $u$ and:
• If the distance of $u$ + the distance from $u \to v$ is less than the distance of $v$, then:
• Update the distance of $v$
• Set the predecessor of $v$ to $u$
dist[]  // array of cost of shortest path from source
pred[]  // array of predecessor in shortest path from source

function Dijkstra(Graph, source):

vSet = all vertices of Graph

for each vertex v in Graph:
dist[v] = INFINITY
pred[v] = -1

dist[source] = 0

while vSet is not empty:
u = vertex in vSet with min dist[u]
remove u from vSet

for each neighbor v of u:
alt = dist[u] + length(u, v)
if alt < dist[v]:
dist[v] = alt
pred[v] = u

return dist[], pred[]


#### Hamilton Path

int *visited; // array of nV bools
int hamiltonPath(Graph g, Vertex v, Vertex w, int d) {
if (v == w) return (d == 0) ? 1 : 0;
visited[v] = 1;

for (int t = 0; t < g->nV; t++) {
if (!neighbours(g, v, t)) continue;
if (visited[v] == 1) continue;
if (hamiltonPath(g, t, w, d-1)) return 1;
}

visited[v] = 0;
return 0;
}

int hasHamiltonPath(Graph g, Vertex src, Vertex dest) {
visited = calloc(g->nV, sizeof(int));
int res = hamiltonPath(g, src, dest, g->nV-1);
free(visited);
return res;
}


#### Well Connected

• In the context of this problem, we will assume a vertex is ‘well connected’ if it has an edge to at least 2 other vertices
• We will implement this using an adjacency matrix, but the algorithm is similar to an adjacency list, except rather than using another for loop, just iterate over the list per vertex
/*
*/
int wellConnected(Graph g) {
// create an int[] that holds the count of connections per vertex
int *connections = calloc(g->nV, sizeof(int));

// iterate over each vertex, and if an edge between i and j exists, increment
// the connection count for i
for (int i = 0; i < g->nV; i++) {
for (int j = 0; j < g->nV; j++) {
if (g->edges[i][j]) {
connections[i]++;
}
}
}

// for each vertex, if the connection count is 2 or more, increment the total
// well connected count, and return it
int count = 0;
for (int i = 0; i < g->nV; i++) {
if (connections[i] >= 2) {
count++;
}
}
return count;
}

void depthFirst(Graph g, int src) {
int *visited = calloc(g->nV,sizeof(int));

// create a stack and add src to it
Stack s = newStack();
StackPush(s, src);

// while the stack is not empty, pop the top vertex off and visit it
while (!StackEmpty(s)) {
Vertex y, x = StackPop(s);

if (visited[x]) continue;
visited[x] = 1;

// push all of the neighbors onto the stack if not visited already
for (y = g->nV-1; y >= 0; y--) {
if (!g->edges[x][y]) continue;
if (!visited[y]) StackPush(s,y);
}
}
}

void breadthFirst(Graph g, Vertex src) {
int *visited = calloc(g->nV,sizeof(int));

// create a queue and add the src to it
Queue q = newQueue();
QueueJoin(q, src);

// while the queue is not empty, remove the FIFO element from the queue and visit it
while (!QueueEmpty(q)) {
Vertex y, x = QueueLeave(q);

if (visited[x]) continue;
visited[x] = 1;

// add each of the neighbors to the queue if not already visited
for (y = 0; y < g->nV; y++) {
if (!g->edges[x][y]) continue;
if (!visited[y]) QueueJoin(q,y);
}
}
}


## Heaps

Properties

• Similar to trees but with top-to-bottom ordering
• Heaps are dense trees, hence $depth = floor(log_2(n)) + 1$, where $n$ is the number of elements in the heap
• Time complexity:
• Insertion: $O(log(n))$
• Deletion: $O(log(n))$
• Often implemented using arrays where index 0 is empty, and elements begin at index 1
• Left child of item at index $i$ is located at $2i$
• Right child of item at index $i$ is located at $2i + 1$
• Parent of item at index $i$ is located at $i/2$
• Either max-heap
• Each node has to be greater than both of its children
• Or min-heap
• Each node has to be smaller than both of its children

Insertion

• Add new element at next available position
• Reorganise elements along path to root by value to restore heap order

Deletion

• Replace root element by bottom right element
• Remove bottom right element
• Reorganise elements along path from root to restore heap order

### Algorithms

Check if a binary tree is a heap

• A heap must be a complete tree (all levels except last should be full), and every nodes value should be greater than its children nodes (max-heap).
int isHeap(BTree t) {
if (t == null) {
return 1;
}

int leftHeap = isHeap(t->left);
int rightHeap = isHeap(t->right);

int isHeap = 1;

// check for each node that its value is greater than its children nodes
// for a min-heap, check for less than
if (t->left != NULL) {
if (t->value < t->left->value) {
isHeap = 0;
}
}
if (t->right != NULL) {
if (t->value < t->right->value) {
isHeap = 0;
}
}

// returns 1 if heap, 0 otherwise
return (isHeap && leftHeap && rightHeap);
}


## Hash Tables

Properties

• Lookup/search for a value given a key is often $O(1)$
• Lookup/search for a key in the hash table is $O(n)$

Seperate Chaining

• To prevent collisions, there are multiple items per array element (linked list)
• Time complexity:
• Best: $O(1)$
• Worst: $O(n)$

Linear Probing

• Prevents collisions by finding a new location
• If slot is in use
• Try next slot, then next … until a free slot is found
• Insert the item into that slot
• Time complexity:
• Dependent on how full the hash table is
• Average for successful search: $0.5 \times \bigg(1 + \dfrac{1}{1 - \alpha}\bigg)$ where $\alpha = \dfrac{full_{slots}}{all_{slots}}$
• Average for unsuccessful search: $0.5 \times \bigg(1 + \dfrac{1}{(1 - \alpha)^2}\bigg)$
• Deletion is complex

Double Hashing

• Improved on linear probing by using an increment which
• Is based on a secondary hash of the key
• Ensures that all elements are visited
• Tends to eliminate clusters, thus creating shorted probe paths
• Time complexity can be significantly better than linear probing even when the hash table is almost full
• Steps:
• Generate initial hash $h = hash()$
• Generate second hash $inc = hash2()$
• Iterate over hash table for (j = 0; j < n; j += inc)
• Generate index to insert into ix = (h + j) % n
• If can insert then insert, otherwise continue loop
function insert(HashTable ht, Item it):
n = size(ht)
key = key(it)

// generate initial hash and second hash
h = hash(key, n)
inc = hash2(key, hash2(ht))

// loop over hash table, incrementing by inc after each iteration
for j = 0; j < n; j += inc:
ix = (h + j) % n
if ht[ix] == Empty:
ht[ix] = it
return


## Text Processing Algorithms

### Pattern Matching Algorithms

Given two strings $T$ (text) and $P$ (pattern), the pattern matching problem consists of finding a substring in $T$ equal to $P$.

Brute Force

• Checks for each possible shift of $P$ relative to $T$ until a match is found, or all placements of the pattern have been tried
• Time complexity: $O(nm)$, where $n$ is the length of the text, $m$ is the length of the pattern

Boyer-Moore Algorithm

• Time complexity: $O(nm + s)$, where $s$ is the size of the alphabet
• Based off two heuristics:
• Looking glass heuristic: compare $P$ with subsequence of $T$ moving backwards
• Character jump heuristic: when a mistmatch occurs at $T[i] = c$
• If $P$ contains $c$, then shift $P$ to align the last occurrence of $c$ in $P$ with $T[i]$
• Otherwise, shift $P$ to align $P[0]$ with $T[i + 1]$ (a.k.a ‘big jump’)

• The algorithm preprocesses the pattern $P$ and alphabet $\Sigma$ to build a last occurence function
• This function maps the alphabet to array indexes, where the array index is the last occurance of the letter in $P$, or $-1$ if no index exists
• For example, $\Sigma = {a, b, c, d}$, $P =$ acab
c a b c d
L(c) 2 3 1 -1

Knuth-Morris-Pratt Algorithm

• Compares the pattern to the text left to right
• Time complexity: $O(m + n)$
• When a mismatch occurs at index $j$, if there is a prefix in $P[0..j]$ that is also a suffix of $P[1..j]$, then we can shift the patten by the largest prefix length

• The algorithm preprocesses the pattern to find matches of its prefixes with itself
• Failure function $F(j)$ is defined as the size of the largest prefix of $P[0..j]$ that is also a suffix of $P[1..j]$
• For example, $P =$abaaba
$j$ 0 1 2 3 4 5
$P_j$ a b a a b a
$F(j)$ 0 0 1 1 2 3

### Tries

• Used for storing a compact data structure to represent a set of strings
• Every node in a trie
• Contains one part of a key (usually a character)
• May have up to 26 children
• May be flagged as a ‘finishing’ node
• Time complexity of searching: $O(d)$, where $d$ is the depth
#define ALPHABET_SIZE 26
typedef char *Key;
typedef struct Node *Trie;
typedef struct Node {
bool finish;
char data;
Trie child[ALPHABET_SIZE];
} Node;


Compressed Tries

• Obtained from standard tries by compressing ‘redundant’ chains of nodes

### Text Compression

Huffman Coding

• Computes the frequency of each character
• Combines pairs of lowest frequency characters to build encoding tree ‘bottom up’
• Time complexity: $O(n + d \ log(d))$, where $n$ is the length of the input text and $d$ is the number of distinct characters in the input text