Written by Luka Kerr on November 11, 2018

## Intro

Programming languages are formal languages which allows us to:

• Design languages better and avoid undefined behaviour
• Make languages easier to process and parse
• Give mathematical meaning to programs allowing for formal verification
• Develop algorithms to find bugs automatically
• Rigorously analyse optimisations and other program transformations

### Stages of a Compiler

Lexer

• Given a string of source code, produces a stream of tokens or lexems removing irrelevant information (comments, whitespace)

Parser

• Converts the stream of tokens from the lexer into a parse tree or abstract syntax tree

Semantic Analysis

• Checks variable scoping
• Type checking
• Adds extra information to the tree

Optimiser

• Loop unrolling, inlining

Code Generation

• Selects appropriate machine instructions

### Backus-Naur Form

• Allows for specification of grammar (the structure of lexems expected to produce certain parse trees)

• Uses terminal and non-terminal symbols. Terminal symbols are the elementary symbols of the language, whereas non-terminal symbols are symbols that can be replaced

• Example

funDef ::= Ident1 Ident2 ( args ) stmt
stmt   ::= expr ;
| if ( expr ) stmt else stmt
| return expr ;
| while ( expr ) stmt
stmts  ::= epsilon
| stmt stmts
expr   ::= Number | Ident | expr1 + expr2
| Ident == expr
| Ident ( expr )
args   ::= epsilon | ...


This translates to English as:

• A function definition consists of an indentifier, another identifier, a left paren, arguments, a right paren and a statement
• A statement consists of an expression followed by a semicolon OR an if keyword, a left paren, an expression, a right paren, a statement, an else, a statement OR a return, an expression, a semicolon OR a while, left paren, expression, right paren and a statement
• Multiple statements consist of an empty string OR a statement followed by multiple statements
• An expression consists of a number OR an identifier OR an expression, a plus, another expression OR an identifier, two equals, an expression OR an identifier, a left paren, an expression and a right paren
• Args consists of an empty string OR something else

## Natural Deduction

### Formalisation

• The process of giving a language a formal mathematical description
• For programming languages, the language is defined using another language called the meta language. For formalisations, the formalisations are described using a minimal logic called the meta logic
• A common meta logic to use is based on natural deduction and inductive inference rules

### Judgements

• A statement asserting a certain property for an object

• To prove a judgement we can use inference rules, written as $$\dfrac{J_1 \quad J_2 \quad \dots{} \quad J_n}{J}$$

• The above inference rule states that in order to prove judgement $J$ (the conclusion), it suffices to prove all judgements $J_1$ through to $J_n$ (the premises).

• Rules with no premises are called axioms, and their conclusions always hold (are true).

• For example, the natural numbers $\mathbb{N}$ $$\dfrac{}{0 \ Nat} N_1 \quad \dfrac{n \ Nat}{(S \ n) \ Nat} N_2$$ The above inference rules state that $0$ is a natural number (axiom), and that if $n$ is a natural number, then $S \ n$ (the successor of $n$) is a natural number.

### Derivability

• If we can add an inference rule that are compositions of existing rules, then the new rule is called derivable
• If a rule can be added to a language and the language definition doesn’t change, then the rule is admissible to the language

### Hypothetical Derivations

• By writing rules in horizontal format ($\frac{A}{B} \equiv A \vdash B$), we can make rules premises of other rules
• For example $\frac{A \ \vdash\ B}{C}$ is read as: if assuming $A$ we can derive $B$, then we can derive $C$

## Rule Induction

The process of using existing inference rules to prove a property in the language. The assumptions are called inductive hypotheses.

In general, given a set of rules $R$, we may prove a property $P$ inductively for all judgements that can be inferred with $R$ by showing, for each rule of the form $\frac{J_1 \quad J_1 \quad \dots{} \quad J_n}{J}$ that if $P$ holds for each of $J_1 . . J_n$, then $P$ holds for $J$

The base cases of induction are axioms, all other rules are inductive cases and the premises of each rule give rise to inductive hypotheses

### Mathematical Induction Steps (revision)

• When doing induction there are a few steps to remember in the process
• State the proposition $P(n)$
• Show that the base case $P(\text{base case})$ is true
• State the inductive hypothesis that $P(n)$ is true
• Prove that $P(n) \implies P(n + 1)$, this is the inductive step
• Invoke induction by writing Q.E.D, or draw a $\Box$

### Ambiguity

• A language is ambiguous when there are multiple ways to derive the same judgement
• To remove ambiguity, we can specify associativity
• Left-Associativity $A \circ B \circ C = (A \circ B) \circ C$
• Right-Associativity $A \circ B \circ C = A \circ (B \circ C)$

## Syntax

Concrete syntax is described by judgements on strings, which describe the actual text input by the programmer. It is ambiguous and not suitable for compiler implementation and proofs

Abstract syntax is a representation of a program where extraneous information and ambiguity is removed. It is usually represented as a tree, known as an abstract syntax tree

## Parsing

• The process of converting concrete to abstract syntax is called parsing
• The parsing relation $\longleftrightarrow$ is unambiguous
• The process of parsing, or determining the abstract syntax corresponding to a particular concrete syntax is as follows
1. Derive the left hand side of the $\longleftrightarrow$ bottom up until reaching axioms
2. Fill in the right hand side of the $\longleftrightarrow$ top down, starting at the axioms
• The inverse of parsing known as pretty-printing is the process of starting with the abstract syntax on the right hand side of $\longleftrightarrow$ and attempting to synthesise a concrete syntax on the left. This is non-deterministic as there are many ways to convert abstract to concrete syntax

## Bindings

A program usually has variables, and where these variables are declared is known as the binding occurance of the variable. The scope of the variable is wherever the variable has a value and is defined. For example:

let x = 5 in     -- binding occurance of x
let y = 2 in   -- scope of x
x + y        -- usage occurance of x
end            -- end scope of x
end


The process of finding the binding occurance of each used variable is called scope resolution.

• Shadowing occurs when a variable is declared inside another variable’s scope, and both have the same name, for example:

let x = 5 in
let x = 2 in   -- x is shadowed here
x + x        -- x + x = 4
end
end


### $\alpha$-equivalence

• Expressions that are semantically identical but differ in bound variable names are $\alpha$-equivalent, for example:

let x = 5 in
let x = 2 in
x + x
end
end

-- the above expression is alpha equivalent to the one below

let a = 5 in
let y = 2 in
y + y
end
end


### Substitution

• A variable $x$ is free in an expression $e$ if $x$ occurs in $e$ but is not bound in $e$
• A substitution, written $e[x := t]$ is the replacement of all free occurances of $x$ in $e$ with the term $t$

### Variable Capture

• Capture can occur for a substitution $e[x := t]$ whenever there is a bound variable in the expression $e$ with the same name as a free variable occuring in $t$
• To avoid capture we can
• $\alpha$-rename the offending bound variable to an unused name, or
• If you have access to the free variable’s definition, rename the free variable, or
• Use De Bruijn indices

## First Order Abstract Syntax

### De Bruijn Indices

• De Bruijn indices allow us to address the issue with substitution capture by

• Removing all identifier from binding expressions (e.g. Let)
• Replace the identifier in a Var with a number indicating how many binders we must skip in order to find the binder for that variable
(Let "a" (Num 5)
(Let "y" (Num 2)
(Plus (Var "a") (Var "y"))))

-- the above expression can be represented using De Bruijn indices as below

(Let (Num 5)
(Let (Num 2)
(Plus (Var 1) (Var 0))))

• To convert a piece of first order abstract syntax to De Bruijn indices, we can keep a stack of variable names, pushing onto the stack at each Let and popping after the variable goes out of scope. When a usage is encountered, replace the variable name with its first position in the stack

## Higher Order Abstract Syntax

Higher order abstract syntax relies on the meta language to facilitate binding of variables. Because of this, many problems such as malformed syntax, capture and $\alpha$-equivalence is solved since it is handled by the meta language

For example, a term language $t$ can be defined as

t ::= Symbol    -- symbols
| x        -- variables
| t1 t2    -- application
| x. t     -- binding or abstraction


## $\lambda$-Calculus

Lambda calculus is a system using mathematical logic for expressing computation based on function abstraction and application using variable binding and substitution

The term language $t$ above can be slightly changed:

t ::= Symbol    -- symbols
| x        -- variables
| t1 t2    -- application
| λx. t    -- lambda abstraction


### $\beta$-reduction

• Beta reduction is used to evaluate terms, for example the function $(\lambda x . t)$ below denotes a function that given an argument for $x$, will return $t$ $$(\lambda x . t) \ u \quad \mapsto_\beta \quad t[x := u]$$

• $\beta$-reduction is a congruence, meaning that we can pick any reducible subexpression (called a redex) and perform a $\beta$-reduction. For example $$\begin{array}{ll} (\lambda x . \lambda y . f (y \ x)) \ 5 \ (\lambda x . x) & \mapsto_\beta \quad (\lambda y . f \ (y \ 5)) \ (\lambda x . x) \\ & \mapsto_\beta \quad f \ ((\lambda x. x) \ 5) \\ & \mapsto_\beta \quad f \ 5 \end{array}$$

### $\eta$-conversion

• If we have a term $f$ and apply it to a variable, and then abstract over that same veriable, we end up with the original term
• For example $\lambda x . (f x) \mapsto_{\beta \eta} f$, or in Haskell, (\x -> abs x) is the same as abs

### Confluence

• Suppose we arrive via one reduction path to an expression that cannot be reduced further (known as normal form), then any other reduction path will result in the same normal form

### Equivalence

• Two terms are $\alpha \beta$-equivalent, written $s \equiv_{\alpha \beta}$ if they $\beta$-reduce to $\alpha$-equivalent normal forms

## Semantics

### Static Semantics

• Aspects of a program that can be determined by a compiler or external tool (linter) without running the program
• Two main aspects include types and scope

### Dynamic Semantics

• Can be specified in many ways:

• Denotational semantics is the compositional construction of a mathematical object for each form of syntax
• Axiomatic semantics is the construction of a proof calculus to allow correctness of a program to be verified
• Operational semantics is used in this course and is the construction of a program-evaluating state machine or transition system
• Two main aspects of dynamic semantics include behaviour and cost

• Operational semantics has two main kinds

• Small Step
• A judgement that specifies transitions between states: $e \mapsto e’$
• We need a set of states $\Sigma$, a set of initial states $I \sube \Sigma$, a set of final states $F \sube \Sigma$ and a relation $\mapsto \sube \Sigma \times \Sigma$ which specifies only “one step” of the execution
• For example, where $\Sigma$ and $I$ are the set of all closed expressions ${ e \ | \ e \ \bold{ok} }$ and $F$ is the set of evaluated expressions ${ (Num \ n) \ | \ n \in \mathbb{Z} }$ $$\dfrac{e_1 \mapsto e'_1}{(Plus \ e_1 \ e_2) \mapsto (Plus \ e'_1 \ e_2)} \\ \dfrac{e_2 \mapsto e'_2}{(Plus \ (Num \ n) \ e_2) \mapsto (Plus \ (Num \ n) \ e'_2)} \\ \dfrac{}{(Plus \ (Num \ n) \ (Num \ m)) \mapsto (Num \ (n + m))}$$
• Big Step
• One big judgement relating expressions to their values: $e \Downarrow v$
• We need a set of evaluable expressions $E$, a set of values $V$ and a relation $\Downarrow \sube E \times V$
• For example, where $E$ is the set of all closed expressions ${e \ | \ e \ \bold{ok} }$, and $V$ is the set of integers $\mathbb{Z}$ $$\begin{array}{cc} \dfrac{}{(Num \ n) \Downarrow n} & \dfrac{e_1 \Downarrow v_1 \quad e_2[x := (Num \ v_1)] \Downarrow v_2}{(Let \ e_1 (x. e_2)) \Downarrow v_2} \\ \\ \dfrac{e_1 \Downarrow v_1 \quad e_2 \Downarrow v_2}{(Plus \ e_1 \ e_2) \Downarrow (v_1 + v_2)} & \dfrac{e_1 \Downarrow v_1 \quad e_2 \Downarrow v_2}{(Times \ e_1 \Downarrow e_2) \Downarrow (v_1 \times v_2)} \end{array}$$

### Evaluation Strategies

• Call-by-value or strict evaluation, shown above, is where an expression is evaluated and bound to a corresponding variable
• Call-by-name is when an expression is replaced when needed in another expression
• Call-by-need or lazy evaluation, is when an expression is computed only if its value is needed

### Equivalence

• Small step semantics are lower level, they clearly specify the order of evaluation. Big step semantics gives us a result without telling us explicitly how it was computed
• Small and big step semantics can be proven to be equivalent, that is if there exists a trace $e \mapsto \dots{} \mapsto (Num \ n)$, then $e \Downarrow n$, and vice versa
• Let $\overset{*}{\mapsto}$ be the reflexive, transitive closure of $\mapsto$, then $$\begin{array}{ll} \dfrac{}{e \overset{*}{\mapsto} e} & \dfrac{e_1 \mapsto e_2 \ \ \ \ \ e_2 \overset{*}{\mapsto} e_n}{e_1 \overset{*}{\mapsto} e_n} \end{array}$$ Now we can formally state the property of equivalence as $e \overset{*}{\mapsto} (Num \ n) \iff e \Downarrow n$

## Functional Programming

A programming language derived from or inspired by the $\lambda$-calculus, or derived from or inspired by another functional programming language

### Purely Functional

• A programming language is purely functional if $\beta$-reduction is a confluence
• In other words, functions have to be mathematical functions, and free of side effects

## Cost Models

### Operational Cost Models

• First, we define a program-evaluating abstract machine. We can determine the time cost by counting the number of steps taken by the abstract machine
• One step in the abstract machine should always only be $\mathcal{O} (1)$

## Control Flow

To define an abstract machine where all the rules are axions, we use a stack.

### Stack

$\begin{array}{l} \dfrac{}{\circ \ \textbf{Stack}} \quad \dfrac{f \ \textbf{Frame} \quad s \ \textbf{Stack}}{f \triangleright s \ \textbf{Stack}} \end{array}$
• Initial states are those that start evaluating an expression from an empty stack, i.e. $\circ \succ e$
• Final states are those that return a value to the empty stask, i.e. $\circ \prec v$
• Stack frames are expressions with holes of values in them, i.e. $\text{Plus} \ v_1 \ \square$

#### Example

$\begin{array}{ll} & \circ \succ (\text{Plus (Plus (Num 2) (Num 3)) (Num 4)}) \\ \mapsto & \text{(Plus} \ \square \ \text{(Num 4))} \triangleright \circ \succ \text{(Plus (Num 2) (Num 3))} \\ \mapsto & \text{(Plus} \ \square \ \text{(Num 3))} \triangleright \text{(Plus} \ \square \ \text{(Num 4))} \triangleright \circ \succ \text{(Num 2)} \\ \mapsto & \text{(Plus} \ \square \ \text{(Num 3))} \triangleright \text{(Plus} \ \square \ \text{(Num 4))} \triangleright \circ \prec 2 \\ \mapsto & \text{(Plus 2} \ \square \ \text{)} \triangleright \text{(Plus} \ \square \ \text{(Num 4))} \triangleright \circ \succ \text{(Num 3)} \\ \mapsto & \text{(Plus 2} \ \square \ \text{)} \triangleright \text{(Plus} \ \square \ \text{(Num 4))} \triangleright \circ \prec 3 \\ \mapsto & \text{(Plus} \ \square \ \text{(Num 4))} \triangleright \circ \prec 5 \\ \mapsto & \text{(Plus 5} \ \square \ \text{)} \triangleright \circ \succ \text{(Num 4)} \\ \mapsto & \text{(Plus 5} \ \square \ \text{)} \triangleright \circ \prec 4 \\ \mapsto & \circ \prec 9 \\ \end{array}$

## Environments

• An environment is a context containing equality assumptions about variables
• It can be viewed like a state that maps variables to their values, except that a variable’s value does not change over time

## Safety and Liveness

### Safety

A safety property states that something bad does not happen. For example:

I will never run out of money

### Liveness

A liveness property states that something good will happen. For example

If I start saving money now, eventually I will be rich

## Type Safety

A property that states that well-typed programs do not go wrong

### Progress and Preservation

To prove that a well-typed program either goes on forever or reaches a final state, we use the following two lemmas

• Progress
• Well-typed states are not stuck states. That is, if an expression $e : \tau$ then either $e$ is a final state or there exists a state $e’$ such that $e \mapsto e’$
• Preservation
• Evaluating one step preserves types. That is, if an expression $e : \tau$ and $e \mapsto e’$, then $e’ : \tau$

For example:

$e_1 : \tau \mapsto \underbrace{e_2: \tau}_{\text{preservation}} \overbrace{\mapsto e_3 }^{\text{progress}} : \tau \mapsto \dots{} \\$

## Composite Data Types

### Unit Types - $()$

A type that has just one value - $\textbf{1}$

### Product Types - $\tau_1 \times \tau_2$

Data that is stored in one value as a pair. “$\tau_1$ and $\tau_2$”

$\dfrac{e : \tau_1 \times \tau_2}{\text{fst} \ e : \tau_1} \quad \dfrac{e : \tau_1 \times \tau_2}{\text{snd} \ e : \tau_2}$

### Sum Types - $\tau_1 + \tau_2$

Data that may be one of two forms. “$\tau_1$ or $\tau_2$”

$\dfrac{e : \tau_1}{\text{InL} \ e : \tau_1 + \tau_2} \quad \dfrac{e : \tau_2}{\text{InR} \ e : \tau_1 + \tau_2} \quad \dfrac{e : \tau_1 + \tau_2 \qquad x : \tau_1, e_1 : \tau \qquad y : \tau_2, e_2 : \tau}{\textbf{case} \text{ e } \textbf{of} \text{ InL x} \to e_1; \text{InR y} \to e_2}$

### Empty Type - $0$

A type that cannot be constructed, and will never be evaluated

### Recursive Types - $\textbf{rec} \ t. \tau$

A type that allows reference to the entire type recursively

$\dfrac{e : \tau [t := \textbf{rec} \ t. \tau]}{\text{roll} \ e : \textbf{rec} \ t. \tau} \quad \dfrac{e : \textbf{rec} \ t. \tau}{\text{unroll} \ e : \tau [t := \textbf{rec} \ t. \tau]}$

## Curry-Howard Correspondence

$\begin{array}{ll} \times & \land \\ \text{+} & \lor \\ \to & \implies \\ 1 & \top \\ 0 & \bot \\ \forall & \forall \end{array}$

## Polymorphism

A way to specify that we don’t care what data type is used (also known as generics)

### Type Abstraction

The ability to define functions regardless of specific types.

swap = type a. type b.
recfun swap :: (a, b) -> (b, a) x = (snd x, fst x)


### Type Application

The ability to instantiate polymorphic functions to specific types.

swap@Int@Bool (3, True)


### Generality

• A type $\tau$ is more general than a type $\rho$, often written $\rho \sqsubseteq \tau$, if type variables in $\tau$ can be instantiated to the given type $\rho$
• For example: $\text{Int} \to \text{Int} \sqsubseteq \forall z. z \to z \sqsubseteq \forall x \ y. x \to y \sqsubseteq \forall a. a$

### Implementation

#### Template Instantiation

Automatically generate a monomorphic copy of each polymorphic function based on the types applied to it.

For example for a type application like swap@Int@Bool

swap = type a. type b.
recfun swap :: (a, b) -> (b, a) x = (snd x, fst x)


would be replaced statically by

swap = recfun swap :: (Int, Bool) -> (Bool, Int) x = (snd x, fst x)

Little to no run-time cost Large binary size if many instantiations are used
Easy to implement Can lead to long compilation times

#### Boxing

All data types are represented as a pointer to a data structure on the heap, where all pointers are exactly 32 or 64 bits large.

## Parametricity

The principle of parametricity states that the result of polymorphic functions cannot depend on values of an abstracted type.

More formally, given a polymorphic function $g$ that takes a type parameter, if the function $f : \tau \to \tau$ was run on some values of type $\tau$, then run on $g\text{@} \tau$ on the result, that will give the same results as running $g\text{@} \tau$ first, then $f$.

### Examples

head :: [a] -> a
-- parametric example

concat :: [[a]] -> [a]
-- parametric example
map f (concat list) = concat (map (map f) list)


## Unification

A substitution $S$ to unification variables is a unifier of two types $\tau$ and $\rho$ if $S_\tau = S_\rho$.

Furthermore, it is the most general unifier of $\tau$ and $\rho$ if there is no other unifier $S’$ where $S_\tau \sqsubseteq S’ \tau$.

We write $\tau \stackrel{U}{\sim} \rho$ if $U$ is the most general unifier of $\tau$ and $\rho$.

### Examples

$\begin{array}{lll} 1) & \alpha \times (\alpha \times \alpha) \sim \beta \times \gamma & \alpha = \beta, \gamma = (\alpha \times \alpha) \\ & \alpha \times (\alpha \times \alpha) \sim \alpha \times (\alpha \times \alpha) \\ \\ 2) & (\alpha \times \alpha) \times \beta \sim \beta \times \gamma & \beta = (\alpha \times \alpha), \gamma = \beta \\ & (\alpha \times \alpha) \times (\alpha \times \alpha) \sim (\alpha \times \alpha) \times (\alpha \times \alpha) \\ \\ 3) & \text{Int} + \alpha \sim \alpha + \text{Bool} \\ & \text{TypeError} \\ \\ 4) & (\alpha \times \alpha) \times \alpha \sim \alpha \times (\alpha \times \alpha) \\ & \text{No Unifier} \\ \end{array}$

## Linear Types

$A \multimap B \quad$ $A$ can be transformed into $B$

$A \otimes B \quad$ You’ve got both $A$ and $B$

$A \oplus B \quad$ You’ve got either $A$ or $B$

$A \text{ \& } B \quad$ You can pick from $A$ or $B$

$!A \qquad$ You’ve got an unlimited amount of $A$

## Abstract Data Types

A type defined not by its internal representation, but by the operations that can be performed on it.

## Existential Types

$\forall a. \tau$ $\exists a . \tau$
When producing a value, $a$ is an arbitrary, unknown type. When consuming a value, $a$ may be instantiated to any desired type. When consuming a value, $a$ is an arbitrary, unknown type. When producing a value, $a$ may be instantiated to any desired type.

• Specified by three operations
• emptyBag - $\mathcal{B}$
• addToBag - $(\mathcal{B} \to \text{Int} \to \mathcal{B})$
• average - $(\mathcal{B} \times \text{Int})$
• The type for this ADT is $\exists \mathcal{B} . \mathcal{B} \times (\mathcal{B} \to \text{Int} \to \mathcal{B}) \times (\mathcal{B} \to \text{Int})$

### Type Classes

• A set of types for which implementations have been provided for various functions, called methods
• For example, in Haskell the types Int, Float etc are all instances of the type class Num which has methods such as +, negate etc
• We write $f :: \forall a . P \Rightarrow \tau$ to indicate that $f$ has the type $\tau$ where $a$ can be instantiated to any type under the condition that the constraint $P$ is satisfied. Typically, $P$ is a list of instance constrains, such as Num a or Eq b
• For example, $(+) :: \forall a . (\text{Num} \ a) \Rightarrow a \to a \to a$

### Dictionaries and Resolution

• The type checker will convert ad-hoc polymorphism (type classes) to parametric polymorphism

• Type classes are converted to types

class Eq a where
(==) : a -> a -> Bool
(/=) : a -> a -> Bool

-- becomes

type EqDict a = (a -> a -> Bool, a -> a -> Bool)

• Instances become values of the dictionary type

instance Eq Bool where
True  == True  = True
False == False = False
_     == _     = False
a     /= b     = not (a == b)

-- becomes

True ==(Bool) True  = True
True ==(Bool) False = True
_    ==(Bool) _     = False
a    /=(Bool) b     = not (a ==(Bool) b)
eqBoolDict = ((==(Bool)), (/=(Bool)))


## Subtyping

To add subtyping to a language, we define a partial order on types $\tau \le \rho$ and a rule of subsumption

$\dfrac{e : \tau \qquad \tau \le \rho}{e : \rho}$

What this partial order means is up to the language. There are two main approaches

• Most common: where upcasts do not have dynamic behaviour
• Uncommon: where upcasts cause a coercion to occur, actually converting the value from $\tau$ to $\rho$ at runtime

### Product Types

$\dfrac{\tau_1 \le \rho_1 \qquad \tau_2 \le \rho_2}{(\tau_1 \times \tau_2) \le (\rho_1 \times \rho_2)}$

Example: $(\text{Int} \times \text{Int}) \le (\text{Float} \times \text{Int})$

### Sum Types

$\dfrac{\tau_1 \le \rho_1 \qquad \tau_2 \le \rho_2}{(\tau_1 + \tau_2) \le (\rho_1 + \rho_2)}$

Example: $(\text{Int} + \text{Float}) \le (\text{Float} + \text{Float})$

### Functions

$\dfrac{\rho_1 \le \tau_1 \qquad \tau_2 \le \rho_2}{(\tau_1 \to \tau_2) \le (\rho_1 \to \rho_2)}$

Example: $(\text{Float} \to \text{Int}) \le (\text{Int} \to \text{Float})$

### Variance

• The way a type constructor ($+, \times, \to$) interacts with subtyping is called its variance
• For a type constructor $C$ and $\tau \le \rho$
• If $C \tau \le C \rho$ then $C$ is covariant
• If $C \rho \le C \tau$ then $C$ is contravariant
• If it is neither covariant not contravariant, then it is called invariant

## Parallelism

Parallelism is the simultaneous execution of code on multiple processors or cores for the purposes of improved performance. E.g. simulations, neural networks.

Concurrency is an abstraction for the programmer, allowing programs to be structured as multiple threads of control, called processes. E.g. servers, GUI applications, kernels.

### Types

• Data Parallelism
• Also known as single instruction, multiple data (SIMD)
• The same code is executed on multiple processing units, working on different parts of a large data set
• Examples include GPUs
• Control Parallelism
• Also known as task parallelism or multiple instruction, multiple data (MIMD)
• Different programs are executed on different processing units
• Examples include CPUs

### Cost Models

• Work cost is the overall number of steps required to execute the program. It corresponds to time cost on a processor with just one core

• Depth cost is the minimal number of parallel steps required to execute the program. It corresponds to the time cost on a processor with unlimited cores

• $\llbracket e \rrbracket_\mathcal{W}$ denotes work cost and $\llbracket e \rrbracket_\mathcal{D}$ denotes depth cost for an expression $e$

• For sequential operations the two measures coincide. For example:
• $\llbracket \text{len} \ e \rrbracket_\mathcal{W} = \llbracket e \rrbracket_\mathcal{W} + 1$
• $\llbracket \text{len} \ e \rrbracket_\mathcal{D} = \llbracket e \rrbracket_\mathcal{D} + 1$
• For parallel operations, we get different results for each measure. For example:
• $\llbracket e_r | x_1 \leftarrow e_1 | \dots{} | x_n \leftarrow e_n \rrbracket_\mathcal{W} = \llbracket e_r \rrbracket_\mathcal{W} + \sum_i \llbracket e_i \rrbracket_\mathcal{W}$
• $\llbracket e_r | x_1 \leftarrow e_1 | \dots{} | x_n \leftarrow e_n \rrbracket_\mathcal{D} = \llbracket e_r \rrbracket_\mathcal{D} + \text{max}_i \llbracket e_i \rrbracket_\mathcal{D}$

## Critical Sections

A critical section is a group of multiple steps, grouped into one single atomic step.

We want to ensure two main properties:

• Mutual exclusion: no two processes are in their critical section at the same time
• Eventual entry: once it enters its pre-protocol, a process will eventually be able to execute its critical section

### Locks

A lock is a common mean of concurrency control. It is typically abstracted into two operations

• Taking the lock
• Releasing the lock

## Software Transactional Memory

Each transaction is executed optimistically, rather than pessimistically (i.e. locks).

Each transaction has an associated log, which contains

• The values written to any TVars with writeTVar
• The values read from any TVars with readTVar, consulting earlier log entries first

First the log is validated, and, if validation succeeds, changes are comitted. Validation and commit are one atomic step. If validation fails, the transaction is re-run.

### Example

withdraw :: Account -> Int -> STM ()
withdraw acc amt = do
if amt > 0 && amt > balance then
retry
else
writeTVar acc (balance - amt)


### Progress

• Traditional deadlock scenarios are impossible, as is cyclic restarting where two transactions constantly cancel each other out
• Starvation is possible, however uncommon in practice

### Database Guarantees

• Atomicity: each transaction should be ‘all or nothing’
• Consistency: each transaction in the future sees the effects of transactions in the past
• Isolation: the transaction’s effect on the state cannot be affected by other transactions
• Durability: the transaction’s effect on the state survives power outages and crashes