COMP3161 Notes

Written by Luka Kerr on November 11, 2018

Intro

Programming languages are formal languages which allows us to:

Design languages better and avoid undefined behaviour
Make languages easier to process and parse
Give mathematical meaning to programs allowing for formal verification
Develop algorithms to find bugs automatically
Rigorously analyse optimisations and other program transformations

Stages of a Compiler

Lexer

Given a string of source code, produces a stream of tokens or lexems removing irrelevant information (comments, whitespace)

Parser

Converts the stream of tokens from the lexer into a parse tree or abstract syntax tree

Semantic Analysis

Checks variable scoping
Type checking
Adds extra information to the tree

Optimiser

Loop unrolling, inlining

Code Generation

Links runtime system
Selects appropriate machine instructions

Backus-Naur Form

Allows for specification of grammar (the structure of lexems expected to produce certain parse trees)
Uses terminal and non-terminal symbols. Terminal symbols are the elementary symbols of the language, whereas non-terminal symbols are symbols that can be replaced
Example
```
funDef ::= Ident1 Ident2 ( args ) stmt
stmt   ::= expr ;
           | if ( expr ) stmt else stmt
           | return expr ;
           | while ( expr ) stmt
stmts  ::= epsilon
           | stmt stmts
expr   ::= Number | Ident | expr1 + expr2
           | Ident == expr
           | Ident ( expr )
args   ::= epsilon | ...
```
This translates to English as:
- A function definition consists of an indentifier, another identifier, a left paren, arguments, a right paren and a statement
- A statement consists of an expression followed by a semicolon OR an if keyword, a left paren, an expression, a right paren, a statement, an else, a statement OR a return, an expression, a semicolon OR a while, left paren, expression, right paren and a statement
- Multiple statements consist of an empty string OR a statement followed by multiple statements
- An expression consists of a number OR an identifier OR an expression, a plus, another expression OR an identifier, two equals, an expression OR an identifier, a left paren, an expression and a right paren
- Args consists of an empty string OR something else

Natural Deduction

Formalisation

The process of giving a language a formal mathematical description
For programming languages, the language is defined using another language called the meta language. For formalisations, the formalisations are described using a minimal logic called the meta logic
A common meta logic to use is based on natural deduction and inductive inference rules

Judgements

A statement asserting a certain property for an object
To prove a judgement we can use inference rules, written as $\dfrac{J_1 \quad J_2 \quad \dots{} \quad J_n}{J}$
The above inference rule states that in order to prove judgement $J$ (the conclusion), it suffices to prove all judgements $J_1$ through to $J_n$ (the premises).
Rules with no premises are called axioms, and their conclusions always hold (are true).
For example, the natural numbers $\mathbb{N}$ $\dfrac{}{0 \ Nat} N_1 \quad \dfrac{n \ Nat}{(S \ n) \ Nat} N_2$ The above inference rules state that $0$ is a natural number (axiom), and that if $n$ is a natural number, then $S \ n$ (the successor of $n$) is a natural number.

Derivability

If we can add an inference rule that are compositions of existing rules, then the new rule is called derivable
If a rule can be added to a language and the language definition doesn’t change, then the rule is admissible to the language

Hypothetical Derivations

By writing rules in horizontal format ($\frac{A}{B} \equiv A \vdash B $), we can make rules premises of other rules
For example $\frac{A \ \vdash\ B}{C}$ is read as: if assuming $A$ we can derive $B$, then we can derive $C$

Rule Induction

The process of using existing inference rules to prove a property in the language. The assumptions are called inductive hypotheses.

In general, given a set of rules $R$, we may prove a property $P$ inductively for all judgements that can be inferred with $R$ by showing, for each rule of the form $\frac{J_1 \quad J_1 \quad \dots{} \quad J_n}{J}$ that if $P$ holds for each of $J_1 . . J_n$, then $P$ holds for $J$

The base cases of induction are axioms, all other rules are inductive cases and the premises of each rule give rise to inductive hypotheses

Mathematical Induction Steps (revision)

When doing induction there are a few steps to remember in the process
- State the proposition $P(n)$
- Show that the base case $P(\text{base case})$ is true
- State the inductive hypothesis that $P(n)$ is true
- Prove that $P(n) \implies P(n + 1)$, this is the inductive step
- Invoke induction by writing Q.E.D, or draw a $\Box$

Ambiguity

A language is ambiguous when there are multiple ways to derive the same judgement
To remove ambiguity, we can specify associativity
- Left-Associativity $A \circ B \circ C = (A \circ B) \circ C$
- Right-Associativity $A \circ B \circ C = A \circ (B \circ C)$

Syntax

Concrete syntax is described by judgements on strings, which describe the actual text input by the programmer. It is ambiguous and not suitable for compiler implementation and proofs

Abstract syntax is a representation of a program where extraneous information and ambiguity is removed. It is usually represented as a tree, known as an abstract syntax tree

Parsing

The process of converting concrete to abstract syntax is called parsing
The parsing relation $\longleftrightarrow$ is unambiguous
The process of parsing, or determining the abstract syntax corresponding to a particular concrete syntax is as follows
1. Derive the left hand side of the $\longleftrightarrow$ bottom up until reaching axioms
2. Fill in the right hand side of the $\longleftrightarrow$ top down, starting at the axioms
The inverse of parsing known as pretty-printing is the process of starting with the abstract syntax on the right hand side of $\longleftrightarrow$ and attempting to synthesise a concrete syntax on the left. This is non-deterministic as there are many ways to convert abstract to concrete syntax

Bindings

A program usually has variables, and where these variables are declared is known as the binding occurance of the variable. The scope of the variable is wherever the variable has a value and is defined. For example:

let x = 5 in     -- binding occurance of x
  let y = 2 in   -- scope of x
    x + y        -- usage occurance of x
  end            -- end scope of x
end

The process of finding the binding occurance of each used variable is called scope resolution.

Shadowing

Shadowing occurs when a variable is declared inside another variable’s scope, and both have the same name, for example:

let x = 5 in
  let x = 2 in   -- x is shadowed here
    x + x        -- x + x = 4
  end
end

$\alpha$-equivalence

Expressions that are semantically identical but differ in bound variable names are $\alpha$-equivalent, for example:

let x = 5 in
  let x = 2 in
    x + x
  end
end

-- the above expression is alpha equivalent to the one below

let a = 5 in
  let y = 2 in
    y + y
  end
end

Substitution

A variable $x$ is free in an expression $e$ if $x$ occurs in $e$ but is not bound in $e$
A substitution, written $e[x := t]$ is the replacement of all free occurances of $x$ in $e$ with the term $t$

Variable Capture

Capture can occur for a substitution $e[x := t]$ whenever there is a bound variable in the expression $e$ with the same name as a free variable occuring in $t$
To avoid capture we can
- $\alpha$-rename the offending bound variable to an unused name, or
- If you have access to the free variable’s definition, rename the free variable, or
- Use De Bruijn indices

First Order Abstract Syntax

De Bruijn Indices

De Bruijn indices allow us to address the issue with substitution capture by
- Removing all identifier from binding expressions (e.g. Let)
- Replace the identifier in a Var with a number indicating how many binders we must skip in order to find the binder for that variable
```
(Let "a" (Num 5)
  (Let "y" (Num 2)
    (Plus (Var "a") (Var "y"))))

-- the above expression can be represented using De Bruijn indices as below

(Let (Num 5)
  (Let (Num 2)
    (Plus (Var 1) (Var 0))))
```
To convert a piece of first order abstract syntax to De Bruijn indices, we can keep a stack of variable names, pushing onto the stack at each Let and popping after the variable goes out of scope. When a usage is encountered, replace the variable name with its first position in the stack

Higher Order Abstract Syntax

Higher order abstract syntax relies on the meta language to facilitate binding of variables. Because of this, many problems such as malformed syntax, capture and $\alpha$-equivalence is solved since it is handled by the meta language

For example, a term language $t$ can be defined as

t ::= Symbol    -- symbols
     | x        -- variables
     | t1 t2    -- application
     | x. t     -- binding or abstraction

$\lambda$-Calculus

Lambda calculus is a system using mathematical logic for expressing computation based on function abstraction and application using variable binding and substitution

The term language $t$ above can be slightly changed:

t ::= Symbol    -- symbols
     | x        -- variables
     | t1 t2    -- application
     | λx. t    -- lambda abstraction

$\beta$-reduction

Beta reduction is used to evaluate terms, for example the function $(\lambda x . t)$ below denotes a function that given an argument for $x$, will return $t$ $(\lambda x . t) \ u \quad \mapsto_\beta \quad t[x := u]$
$\beta$-reduction is a congruence, meaning that we can pick any reducible subexpression (called a redex) and perform a $\beta$-reduction. For example $\begin{array}{ll} (\lambda x . \lambda y . f (y \ x)) \ 5 \ (\lambda x . x) & \mapsto_\beta \quad (\lambda y . f \ (y \ 5)) \ (\lambda x . x) \\ & \mapsto_\beta \quad f \ ((\lambda x. x) \ 5) \\ & \mapsto_\beta \quad f \ 5 \end{array}$

$\eta$-conversion

If we have a term $f$ and apply it to a variable, and then abstract over that same veriable, we end up with the original term
For example $\lambda x . (f x) \mapsto_{\beta \eta} f$, or in Haskell, (\x -> abs x) is the same as abs

Confluence

Suppose we arrive via one reduction path to an expression that cannot be reduced further (known as normal form), then any other reduction path will result in the same normal form

Equivalence

Two terms are $\alpha \beta$-equivalent, written $s \equiv_{\alpha \beta}$ if they $\beta$-reduce to $\alpha$-equivalent normal forms

Semantics

Static Semantics

Aspects of a program that can be determined by a compiler or external tool (linter) without running the program
Two main aspects include types and scope

Dynamic Semantics

Can be specified in many ways:
- Denotational semantics is the compositional construction of a mathematical object for each form of syntax
- Axiomatic semantics is the construction of a proof calculus to allow correctness of a program to be verified
- Operational semantics is used in this course and is the construction of a program-evaluating state machine or transition system
Two main aspects of dynamic semantics include behaviour and cost
Operational semantics has two main kinds
- Small Step
  - A judgement that specifies transitions between states: $e \mapsto e’$
  - We need a set of states $\Sigma$, a set of initial states $I \sube \Sigma$, a set of final states $F \sube \Sigma$ and a relation $\mapsto \sube \Sigma \times \Sigma$ which specifies only “one step” of the execution
  - For example, where $\Sigma$ and $I$ are the set of all closed expressions ${ e \ | \ e \ \bold{ok} }$ and $F$ is the set of evaluated expressions ${ (Num \ n) \ | \ n \in \mathbb{Z} }$ $\dfrac{e_1 \mapsto e'_1}{(Plus \ e_1 \ e_2) \mapsto (Plus \ e'_1 \ e_2)} \\ \dfrac{e_2 \mapsto e'_2}{(Plus \ (Num \ n) \ e_2) \mapsto (Plus \ (Num \ n) \ e'_2)} \\ \dfrac{}{(Plus \ (Num \ n) \ (Num \ m)) \mapsto (Num \ (n + m))}$
- Big Step
  - One big judgement relating expressions to their values: $e \Downarrow v$
  - We need a set of evaluable expressions $E$, a set of values $V$ and a relation $\Downarrow \sube E \times V$
  - For example, where $E$ is the set of all closed expressions ${e \ | \ e \ \bold{ok} }$, and $V$ is the set of integers $\mathbb{Z}$ $\begin{array}{cc} \dfrac{}{(Num \ n) \Downarrow n} & \dfrac{e_1 \Downarrow v_1 \quad e_2[x := (Num \ v_1)] \Downarrow v_2}{(Let \ e_1 (x. e_2)) \Downarrow v_2} \\ \\ \dfrac{e_1 \Downarrow v_1 \quad e_2 \Downarrow v_2}{(Plus \ e_1 \ e_2) \Downarrow (v_1 + v_2)} & \dfrac{e_1 \Downarrow v_1 \quad e_2 \Downarrow v_2}{(Times \ e_1 \Downarrow e_2) \Downarrow (v_1 \times v_2)} \end{array}$

Evaluation Strategies

Call-by-value or strict evaluation, shown above, is where an expression is evaluated and bound to a corresponding variable
Call-by-name is when an expression is replaced when needed in another expression
Call-by-need or lazy evaluation, is when an expression is computed only if its value is needed

Equivalence

Small step semantics are lower level, they clearly specify the order of evaluation. Big step semantics gives us a result without telling us explicitly how it was computed
Small and big step semantics can be proven to be equivalent, that is if there exists a trace $e \mapsto \dots{} \mapsto (Num \ n)$, then $e \Downarrow n$, and vice versa
Let $\overset{*}{\mapsto}$ be the reflexive, transitive closure of $\mapsto$, then $\begin{array}{ll} \dfrac{}{e \overset{*}{\mapsto} e} & \dfrac{e_1 \mapsto e_2 \ \ \ \ \ e_2 \overset{*}{\mapsto} e_n}{e_1 \overset{*}{\mapsto} e_n} \end{array}$ Now we can formally state the property of equivalence as $e \overset{*}{\mapsto} (Num \ n) \iff e \Downarrow n$

Functional Programming

A programming language derived from or inspired by the $\lambda$-calculus, or derived from or inspired by another functional programming language

Purely Functional

A programming language is purely functional if $\beta$-reduction is a confluence
In other words, functions have to be mathematical functions, and free of side effects

Cost Models

Operational Cost Models

First, we define a program-evaluating abstract machine. We can determine the time cost by counting the number of steps taken by the abstract machine
One step in the abstract machine should always only be $\mathcal{O} (1)$

Control Flow

To define an abstract machine where all the rules are axions, we use a stack.

Stack

\[\begin{array}{l} \dfrac{}{\circ \ \textbf{Stack}} \quad \dfrac{f \ \textbf{Frame} \quad s \ \textbf{Stack}}{f \triangleright s \ \textbf{Stack}} \end{array}\]

Initial states are those that start evaluating an expression from an empty stack, i.e. $\circ \succ e$
Final states are those that return a value to the empty stask, i.e. $\circ \prec v$
Stack frames are expressions with holes of values in them, i.e. $\text{Plus} \ v_1 \ \square$

Example

\[\begin{array}{ll} & \circ \succ (\text{Plus (Plus (Num 2) (Num 3)) (Num 4)}) \\ \mapsto & \text{(Plus} \ \square \ \text{(Num 4))} \triangleright \circ \succ \text{(Plus (Num 2) (Num 3))} \\ \mapsto & \text{(Plus} \ \square \ \text{(Num 3))} \triangleright \text{(Plus} \ \square \ \text{(Num 4))} \triangleright \circ \succ \text{(Num 2)} \\ \mapsto & \text{(Plus} \ \square \ \text{(Num 3))} \triangleright \text{(Plus} \ \square \ \text{(Num 4))} \triangleright \circ \prec 2 \\ \mapsto & \text{(Plus 2} \ \square \ \text{)} \triangleright \text{(Plus} \ \square \ \text{(Num 4))} \triangleright \circ \succ \text{(Num 3)} \\ \mapsto & \text{(Plus 2} \ \square \ \text{)} \triangleright \text{(Plus} \ \square \ \text{(Num 4))} \triangleright \circ \prec 3 \\ \mapsto & \text{(Plus} \ \square \ \text{(Num 4))} \triangleright \circ \prec 5 \\ \mapsto & \text{(Plus 5} \ \square \ \text{)} \triangleright \circ \succ \text{(Num 4)} \\ \mapsto & \text{(Plus 5} \ \square \ \text{)} \triangleright \circ \prec 4 \\ \mapsto & \circ \prec 9 \\ \end{array}\]

Environments

An environment is a context containing equality assumptions about variables
It can be viewed like a state that maps variables to their values, except that a variable’s value does not change over time

Safety and Liveness

Safety

A safety property states that something bad does not happen. For example:

I will never run out of money

Liveness

A liveness property states that something good will happen. For example

If I start saving money now, eventually I will be rich

Type Safety

A property that states that well-typed programs do not go wrong

Progress and Preservation

To prove that a well-typed program either goes on forever or reaches a final state, we use the following two lemmas

Progress
- Well-typed states are not stuck states. That is, if an expression $e : \tau$ then either $e$ is a final state or there exists a state $e’$ such that $e \mapsto e’$
Preservation
- Evaluating one step preserves types. That is, if an expression $e : \tau$ and $e \mapsto e’$, then $e’ : \tau$

For example:

\[e_1 : \tau \mapsto \underbrace{e_2: \tau}_{\text{preservation}} \overbrace{\mapsto e_3 }^{\text{progress}} : \tau \mapsto \dots{} \\\]

Composite Data Types

Unit Types - $()$

A type that has just one value - $\textbf{1}$

Product Types - $\tau_1 \times \tau_2$

Data that is stored in one value as a pair. “$\tau_1$ and $\tau_2$”

\[\dfrac{e : \tau_1 \times \tau_2}{\text{fst} \ e : \tau_1} \quad \dfrac{e : \tau_1 \times \tau_2}{\text{snd} \ e : \tau_2}\]

Sum Types - $\tau_1 + \tau_2$

Data that may be one of two forms. “$\tau_1$ or $\tau_2$”

\[\dfrac{e : \tau_1}{\text{InL} \ e : \tau_1 + \tau_2} \quad \dfrac{e : \tau_2}{\text{InR} \ e : \tau_1 + \tau_2} \quad \dfrac{e : \tau_1 + \tau_2 \qquad x : \tau_1, e_1 : \tau \qquad y : \tau_2, e_2 : \tau}{\textbf{case} \text{ e } \textbf{of} \text{ InL x} \to e_1; \text{InR y} \to e_2}\]

Empty Type - $0$

A type that cannot be constructed, and will never be evaluated

Recursive Types - $\textbf{rec} \ t. \tau$

A type that allows reference to the entire type recursively

\[\dfrac{e : \tau [t := \textbf{rec} \ t. \tau]}{\text{roll} \ e : \textbf{rec} \ t. \tau} \quad \dfrac{e : \textbf{rec} \ t. \tau}{\text{unroll} \ e : \tau [t := \textbf{rec} \ t. \tau]}\]

Curry-Howard Correspondence

\[\begin{array}{ll} \times & \land \\ \text{+} & \lor \\ \to & \implies \\ 1 & \top \\ 0 & \bot \\ \forall & \forall \end{array}\]

Polymorphism

A way to specify that we don’t care what data type is used (also known as generics)

Type Abstraction

The ability to define functions regardless of specific types.

swap = type a. type b.
         recfun swap :: (a, b) -> (b, a) x = (snd x, fst x)

Type Application

The ability to instantiate polymorphic functions to specific types.

swap@Int@Bool (3, True)

Generality

A type $\tau$ is more general than a type $\rho$, often written $\rho \sqsubseteq \tau$, if type variables in $\tau$ can be instantiated to the given type $\rho$
For example: $\text{Int} \to \text{Int} \sqsubseteq \forall z. z \to z \sqsubseteq \forall x \ y. x \to y \sqsubseteq \forall a. a$

Implementation

Template Instantiation

Automatically generate a monomorphic copy of each polymorphic function based on the types applied to it.

For example for a type application like swap@Int@Bool

swap = type a. type b.
         recfun swap :: (a, b) -> (b, a) x = (snd x, fst x)

would be replaced statically by

swap = recfun swap :: (Int, Bool) -> (Bool, Int) x = (snd x, fst x)

Advantages	Disadvantages
Little to no run-time cost	Large binary size if many instantiations are used
Easy to implement	Can lead to long compilation times

Boxing

All data types are represented as a pointer to a data structure on the heap, where all pointers are exactly 32 or 64 bits large.

Parametricity

The principle of parametricity states that the result of polymorphic functions cannot depend on values of an abstracted type.

More formally, given a polymorphic function $g$ that takes a type parameter, if the function $f : \tau \to \tau$ was run on some values of type $\tau$, then run on $g\text{@} \tau$ on the result, that will give the same results as running $g\text{@} \tau$ first, then $f$.

Examples

head :: [a] -> a
-- parametric example
f (head list) = head (map f list)

concat :: [[a]] -> [a]
-- parametric example
map f (concat list) = concat (map (map f) list)

Unification

A substitution $S$ to unification variables is a unifier of two types $\tau$ and $\rho$ if $S_\tau = S_\rho$.

Furthermore, it is the most general unifier of $\tau$ and $\rho$ if there is no other unifier $S’$ where $S_\tau \sqsubseteq S’ \tau$.

We write $\tau \stackrel{U}{\sim} \rho$ if $U$ is the most general unifier of $\tau$ and $\rho$.

Examples

\[\begin{array}{lll} 1) & \alpha \times (\alpha \times \alpha) \sim \beta \times \gamma & \alpha = \beta, \gamma = (\alpha \times \alpha) \\ & \alpha \times (\alpha \times \alpha) \sim \alpha \times (\alpha \times \alpha) \\ \\ 2) & (\alpha \times \alpha) \times \beta \sim \beta \times \gamma & \beta = (\alpha \times \alpha), \gamma = \beta \\ & (\alpha \times \alpha) \times (\alpha \times \alpha) \sim (\alpha \times \alpha) \times (\alpha \times \alpha) \\ \\ 3) & \text{Int} + \alpha \sim \alpha + \text{Bool} \\ & \text{TypeError} \\ \\ 4) & (\alpha \times \alpha) \times \alpha \sim \alpha \times (\alpha \times \alpha) \\ & \text{No Unifier} \\ \end{array}\]

Linear Types

$A \multimap B \quad$ $A$ can be transformed into $B$

$A \otimes B \quad$ You’ve got both $A$ and $B$

$A \oplus B \quad$ You’ve got either $A$ or $B$

$A \text{ \& } B \quad$ You can pick from $A$ or $B$

$!A \qquad$ You’ve got an unlimited amount of $A$

Abstract Data Types

A type defined not by its internal representation, but by the operations that can be performed on it.

Existential Types

$\forall a. \tau$	$\exists a . \tau$
When producing a value, $a$ is an arbitrary, unknown type. When consuming a value, $a$ may be instantiated to any desired type.	When consuming a value, $a$ is an arbitrary, unknown type. When producing a value, $a$ may be instantiated to any desired type.

Bag ADT Example

Specified by three operations
- emptyBag - $\mathcal{B}$
- addToBag - $(\mathcal{B} \to \text{Int} \to \mathcal{B})$
- average - $(\mathcal{B} \times \text{Int})$
The type for this ADT is $\exists \mathcal{B} . \mathcal{B} \times (\mathcal{B} \to \text{Int} \to \mathcal{B}) \times (\mathcal{B} \to \text{Int})$

Overloading

Type Classes

A set of types for which implementations have been provided for various functions, called methods
For example, in Haskell the types Int, Float etc are all instances of the type class Num which has methods such as +, negate etc
We write $f :: \forall a . P \Rightarrow \tau$ to indicate that $f$ has the type $\tau$ where $a$ can be instantiated to any type under the condition that the constraint $P$ is satisfied. Typically, $P$ is a list of instance constrains, such as Num a or Eq b
- For example, $(+) :: \forall a . (\text{Num} \ a) \Rightarrow a \to a \to a$

Dictionaries and Resolution

The type checker will convert ad-hoc polymorphism (type classes) to parametric polymorphism

Type classes are converted to types

class Eq a where
  (==) : a -> a -> Bool
  (/=) : a -> a -> Bool

-- becomes

type EqDict a = (a -> a -> Bool, a -> a -> Bool)

Instances become values of the dictionary type

instance Eq Bool where
  True  == True  = True
  False == False = False
  _     == _     = False
  a     /= b     = not (a == b)

-- becomes

True ==(Bool) True  = True
True ==(Bool) False = True
_    ==(Bool) _     = False
a    /=(Bool) b     = not (a ==(Bool) b)
eqBoolDict = ((==(Bool)), (/=(Bool)))

Programs that rely on overloading now take dictionaries as parameters

Subtyping

To add subtyping to a language, we define a partial order on types $\tau \le \rho$ and a rule of subsumption

\[\dfrac{e : \tau \qquad \tau \le \rho}{e : \rho}\]

What this partial order means is up to the language. There are two main approaches

Most common: where upcasts do not have dynamic behaviour
Uncommon: where upcasts cause a coercion to occur, actually converting the value from $\tau$ to $\rho$ at runtime

Product Types

$\dfrac{\tau_1 \le \rho_1 \qquad \tau_2 \le \rho_2}{(\tau_1 \times \tau_2) \le (\rho_1 \times \rho_2)}$

Example: $(\text{Int} \times \text{Int}) \le (\text{Float} \times \text{Int})$

Sum Types

$\dfrac{\tau_1 \le \rho_1 \qquad \tau_2 \le \rho_2}{(\tau_1 + \tau_2) \le (\rho_1 + \rho_2)}$

Example: $(\text{Int} + \text{Float}) \le (\text{Float} + \text{Float})$

Functions

$\dfrac{\rho_1 \le \tau_1 \qquad \tau_2 \le \rho_2}{(\tau_1 \to \tau_2) \le (\rho_1 \to \rho_2)}$

Example: $(\text{Float} \to \text{Int}) \le (\text{Int} \to \text{Float})$

Variance

The way a type constructor ($+, \times, \to$) interacts with subtyping is called its variance
For a type constructor $C$ and $\tau \le \rho$
- If $C \tau \le C \rho$ then $C$ is covariant
- If $C \rho \le C \tau$ then $C$ is contravariant
- If it is neither covariant not contravariant, then it is called invariant

Parallelism

Parallelism is the simultaneous execution of code on multiple processors or cores for the purposes of improved performance. E.g. simulations, neural networks.

Concurrency is an abstraction for the programmer, allowing programs to be structured as multiple threads of control, called processes. E.g. servers, GUI applications, kernels.

Types

Data Parallelism
- Also known as single instruction, multiple data (SIMD)
- The same code is executed on multiple processing units, working on different parts of a large data set
- Examples include GPUs
Control Parallelism
- Also known as task parallelism or multiple instruction, multiple data (MIMD)
- Different programs are executed on different processing units
- Examples include CPUs

Cost Models

Work cost is the overall number of steps required to execute the program. It corresponds to time cost on a processor with just one core
Depth cost is the minimal number of parallel steps required to execute the program. It corresponds to the time cost on a processor with unlimited cores
$\llbracket e \rrbracket_\mathcal{W}$ denotes work cost and $\llbracket e \rrbracket_\mathcal{D}$ denotes depth cost for an expression $e$
- For sequential operations the two measures coincide. For example:
  - $\llbracket \text{len} \ e \rrbracket_\mathcal{W} = \llbracket e \rrbracket_\mathcal{W} + 1$
  - $\llbracket \text{len} \ e \rrbracket_\mathcal{D} = \llbracket e \rrbracket_\mathcal{D} + 1$
- For parallel operations, we get different results for each measure. For example:
  - $\llbracket e_r | x_1 \leftarrow e_1 | \dots{} | x_n \leftarrow e_n \rrbracket_\mathcal{W} = \llbracket e_r \rrbracket_\mathcal{W} + \sum_i \llbracket e_i \rrbracket_\mathcal{W}$
  - $\llbracket e_r | x_1 \leftarrow e_1 | \dots{} | x_n \leftarrow e_n \rrbracket_\mathcal{D} = \llbracket e_r \rrbracket_\mathcal{D} + \text{max}_i \llbracket e_i \rrbracket_\mathcal{D}$

Critical Sections

A critical section is a group of multiple steps, grouped into one single atomic step.

We want to ensure two main properties:

Mutual exclusion: no two processes are in their critical section at the same time
Eventual entry: once it enters its pre-protocol, a process will eventually be able to execute its critical section

Locks

A lock is a common mean of concurrency control. It is typically abstracted into two operations

Taking the lock
Releasing the lock

Software Transactional Memory

Each transaction is executed optimistically, rather than pessimistically (i.e. locks).

Each transaction has an associated log, which contains

The values written to any TVars with writeTVar
The values read from any TVars with readTVar, consulting earlier log entries first

First the log is validated, and, if validation succeeds, changes are comitted. Validation and commit are one atomic step. If validation fails, the transaction is re-run.

Example

withdraw :: Account -> Int -> STM ()
withdraw acc amt = do
  balance <- readTVar amt
  if amt > 0 && amt > balance then
    retry
  else
    writeTVar acc (balance - amt)

Progress

Traditional deadlock scenarios are impossible, as is cyclic restarting where two transactions constantly cancel each other out
Starvation is possible, however uncommon in practice

Database Guarantees

Atomicity: each transaction should be ‘all or nothing’
Consistency: each transaction in the future sees the effects of transactions in the past
Isolation: the transaction’s effect on the state cannot be affected by other transactions
Durability: the transaction’s effect on the state survives power outages and crashes