Written by Luka Kerr on November 11, 2018
Intro
Programming languages are formal languages, which allows us to:
- Design languages better and avoid undefined behaviour
- Make languages easier to process and parse
- Give mathematical meaning to programs allowing for formal verification
- Develop algorithms to find bugs automatically
- Rigorously analyse optimisations and other program transformations
Stages of a Compiler
Lexer
- Given a string of source code, produces a stream of tokens or lexemes, removing irrelevant information (comments, whitespace)
Parser
- Converts the stream of tokens from the lexer into a parse tree or abstract syntax tree
Semantic Analysis
- Checks variable scoping
- Type checking
- Adds extra information to the tree
Optimiser
- Loop unrolling, inlining
Code Generation
- Links runtime system
- Selects appropriate machine instructions
Backus-Naur Form
- Allows for specification of grammar (the structure of lexemes expected to produce certain parse trees)
- Uses terminal and non-terminal symbols. Terminal symbols are the elementary symbols of the language, whereas non-terminal symbols are symbols that can be replaced
- Example

funDef ::= Ident1 Ident2 ( args ) stmt
stmt   ::= expr ; | if ( expr ) stmt else stmt | return expr ; | while ( expr ) stmt
stmts  ::= epsilon | stmt stmts
expr   ::= Number | Ident | expr1 + expr2 | Ident == expr | Ident ( expr )
args   ::= epsilon | ...
This translates to English as:
- A function definition consists of an identifier, another identifier, a left paren, arguments, a right paren and a statement
- A statement consists of an expression followed by a semicolon OR an if keyword, a left paren, an expression, a right paren, a statement, an else keyword and a statement OR a return keyword, an expression and a semicolon OR a while keyword, a left paren, an expression, a right paren and a statement
- Multiple statements consist of an empty string OR a statement followed by multiple statements
- An expression consists of a number OR an identifier OR an expression, a plus, another expression OR an identifier, a double equals sign and an expression OR an identifier, a left paren, an expression and a right paren
- Args consists of an empty string OR something else
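As an illustration not found in the original notes, one possible Haskell abstract syntax corresponding to this grammar is sketched below; the type and constructor names are assumptions, and since the grammar elides the form of `args`, its representation here is also an assumption.

```haskell
-- Hypothetical Haskell AST mirroring the BNF grammar above.
type Ident = String

-- The grammar leaves args elided ("..."), so a list of identifiers is assumed
type Args = [Ident]

data FunDef = FunDef Ident Ident Args Stmt   -- Ident1, Ident2, ( args ), stmt

data Stmt
  = ExprStmt Expr            -- expr ;
  | If Expr Stmt Stmt        -- if ( expr ) stmt else stmt
  | Return Expr              -- return expr ;
  | While Expr Stmt          -- while ( expr ) stmt

-- stmts ::= epsilon | stmt stmts corresponds to a list of statements, [Stmt]

data Expr
  = Number Int               -- Number
  | Var Ident                -- Ident
  | Plus Expr Expr           -- expr1 + expr2
  | Equals Ident Expr        -- Ident == expr
  | Call Ident Expr          -- Ident ( expr )
```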
Natural Deduction
Formalisation
- The process of giving a language a formal mathematical description
- For programming languages, the language is defined using another language called the meta language. Formalisations themselves are described using a minimal logic called the meta logic
- A common meta logic to use is based on natural deduction and inductive inference rules
Judgements
- A statement asserting a certain property for an object
- To prove a judgement we can use inference rules, written as \(\dfrac{J_1 \quad J_2 \quad \dots{} \quad J_n}{J}\)
- The above inference rule states that in order to prove judgement $J$ (the conclusion), it suffices to prove all judgements $J_1$ through to $J_n$ (the premises).
- Rules with no premises are called axioms, and their conclusions always hold (are true).
- For example, the natural numbers $\mathbb{N}$ \(\dfrac{}{0 \ Nat} N_1 \quad \dfrac{n \ Nat}{(S \ n) \ Nat} N_2\) The above inference rules state that $0$ is a natural number (axiom), and that if $n$ is a natural number, then $S \ n$ (the successor of $n$) is a natural number.
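As a sketch, these two rules correspond directly to a simple datatype definition; the Haskell names `Nat`, `Zero` and `Suc` below are assumptions.

```haskell
-- N1 and N2 as a Haskell datatype: Zero is the axiom,
-- Suc builds a new natural number from an existing one.
data Nat = Zero | Suc Nat

-- The judgement "n Nat" holds exactly for values built from these
-- constructors, e.g. the number 2:
two :: Nat
two = Suc (Suc Zero)
```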
Derivability
- If we add an inference rule that is a composition of existing rules, the new rule is called derivable
- If a rule can be added to a language without changing the set of derivable judgements, the rule is admissible in the language
Hypothetical Derivations
- By writing rules in horizontal format ($\frac{A}{B} \equiv A \vdash B $), we can make rules premises of other rules
- For example $\frac{A \ \vdash\ B}{C}$ is read as: if assuming $A$ we can derive $B$, then we can derive $C$
Rule Induction
The process of using existing inference rules to prove a property in the language. The assumptions are called inductive hypotheses.
In general, given a set of rules $R$, we may prove a property $P$ inductively for all judgements that can be inferred with $R$ by showing, for each rule of the form $\frac{J_1 \quad J_2 \quad \dots{} \quad J_n}{J}$, that if $P$ holds for each of $J_1 \dots J_n$, then $P$ holds for $J$
The base cases of induction are axioms, all other rules are inductive cases and the premises of each rule give rise to inductive hypotheses
Mathematical Induction Steps (revision)
- When doing induction there are a few steps to remember in the process
- State the proposition $P(n)$
- Show that the base case $P(\text{base case})$ is true
- State the inductive hypothesis that $P(n)$ is true
- Prove that $P(n) \implies P(n + 1)$, this is the inductive step
- Invoke induction by writing Q.E.D, or draw a $\Box$
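As a short illustration (not from the original notes), applying these steps to $P(n) : 0 + 1 + \dots + n = \frac{n(n+1)}{2}$:

\[\begin{array}{ll} \text{Base case:} & P(0): \ 0 = \frac{0(0+1)}{2} \ \checkmark \\ \text{Inductive hypothesis:} & 0 + 1 + \dots + n = \frac{n(n+1)}{2} \\ \text{Inductive step:} & 0 + 1 + \dots + n + (n+1) = \frac{n(n+1)}{2} + (n+1) = \frac{(n+1)(n+2)}{2} \ \Box \end{array}\]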
Ambiguity
- A language is ambiguous when there are multiple ways to derive the same judgement
- To remove ambiguity, we can specify associativity
- Left-Associativity $A \circ B \circ C = (A \circ B) \circ C$
- Right-Associativity $A \circ B \circ C = A \circ (B \circ C)$
Syntax
Concrete syntax is described by judgements on strings, which describe the actual text input by the programmer. It is ambiguous and not suitable for compiler implementation and proofs
Abstract syntax is a representation of a program where extraneous information and ambiguity is removed. It is usually represented as a tree, known as an abstract syntax tree
Parsing
- The process of converting concrete to abstract syntax is called parsing
- The parsing relation $\longleftrightarrow$ is unambiguous
- The process of parsing, or determining the abstract syntax corresponding to a particular concrete syntax is as follows
- Derive the left hand side of the $\longleftrightarrow$ bottom up until reaching axioms
- Fill in the right hand side of the $\longleftrightarrow$ top down, starting at the axioms
- The inverse of parsing known as pretty-printing is the process of starting with the abstract syntax on the right hand side of $\longleftrightarrow$ and attempting to synthesise a concrete syntax on the left. This is non-deterministic as there are many ways to convert abstract to concrete syntax
Bindings
A program usually has variables, and where these variables are declared is known as the binding occurrence of the variable. The scope of the variable is wherever the variable has a value and is defined. For example:
let x = 5 in -- binding occurrence of x
let y = 2 in -- scope of x
x + y -- usage occurrence of x
end -- end scope of x
end
The process of finding the binding occurrence of each used variable is called scope resolution.
Shadowing
- Shadowing occurs when a variable is declared inside another variable’s scope, and both have the same name, for example:

let x = 5 in
  let x = 2 in -- x is shadowed here
    x + x      -- x + x = 4
  end
end
$\alpha$-equivalence
- Expressions that are semantically identical but differ in bound variable names are $\alpha$-equivalent, for example:

let x = 5 in
  let x = 2 in
    x + x
  end
end
-- the above expression is alpha equivalent to the one below
let a = 5 in
  let y = 2 in
    y + y
  end
end
Substitution
- A variable $x$ is free in an expression $e$ if $x$ occurs in $e$ but is not bound in $e$
- A substitution, written $e[x := t]$, is the replacement of all free occurrences of $x$ in $e$ with the term $t$
Variable Capture
- Capture can occur for a substitution $e[x := t]$ whenever there is a bound variable in the expression $e$ with the same name as a free variable occurring in $t$
- To avoid capture we can
- $\alpha$-rename the offending bound variable to an unused name, or
- If you have access to the free variable’s definition, rename the free variable, or
- Use De Bruijn indices
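A minimal Haskell sketch of capture-avoiding substitution for a small let-language; the `Expr` datatype and helper names here are illustrative assumptions, not the course's definitions.

```haskell
import Data.List (union, delete)

-- A tiny expression language with Let binders (names are illustrative)
data Expr
  = Var String
  | Num Int
  | Plus Expr Expr
  | Let String Expr Expr        -- let x = e1 in e2
  deriving Show

-- Free variables of an expression
freeVars :: Expr -> [String]
freeVars (Var x)       = [x]
freeVars (Num _)       = []
freeVars (Plus a b)    = freeVars a `union` freeVars b
freeVars (Let x e1 e2) = freeVars e1 `union` delete x (freeVars e2)

-- e [x := t], alpha-renaming the bound variable when it would capture
-- a free variable of t
subst :: Expr -> String -> Expr -> Expr
subst e@(Var y) x t
  | y == x    = t
  | otherwise = e
subst e@(Num _) _ _  = e
subst (Plus a b) x t = Plus (subst a x t) (subst b x t)
subst (Let y e1 e2) x t
  | y == x              = Let y (subst e1 x t) e2            -- x is shadowed in e2
  | y `elem` freeVars t = Let y' (subst e1 x t) (subst e2' x t)
  | otherwise           = Let y (subst e1 x t) (subst e2 x t)
  where
    y'  = freshName y (x : freeVars t ++ freeVars e2)
    e2' = subst e2 y (Var y')

-- Pick a name not occurring in the given list
freshName :: String -> [String] -> String
freshName y avoid
  | y `elem` avoid = freshName (y ++ "'") avoid
  | otherwise      = y
```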
First Order Abstract Syntax
De Bruijn Indices
- De Bruijn indices allow us to address the issue of capture in substitution by
  - Removing all identifiers from binding expressions (e.g. `Let`)
  - Replacing the identifier in a `Var` with a number indicating how many binders we must skip in order to find the binder for that variable

(Let "a" (Num 5) (Let "y" (Num 2) (Plus (Var "a") (Var "y"))))
-- the above expression can be represented using De Bruijn indices as below
(Let (Num 5) (Let (Num 2) (Plus (Var 1) (Var 0))))

- To convert a piece of first order abstract syntax to De Bruijn indices, we can keep a stack of variable names, pushing onto the stack at each `Let` and popping after the variable goes out of scope. When a usage is encountered, replace the variable name with the position of its first occurrence in the stack, counting from the top
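A possible Haskell sketch of this conversion, using a list as the stack of binders; the datatypes follow the example above, but their names are assumptions.

```haskell
import Data.List (elemIndex)
import Data.Maybe (fromMaybe)

-- Named terms, as in the example above
data Expr   = Num Int  | Var String | Plus Expr Expr     | Let String Expr Expr

-- Nameless (De Bruijn) terms: Let no longer carries an identifier
data DBExpr = DNum Int | DVar Int   | DPlus DBExpr DBExpr | DLet DBExpr DBExpr

-- Convert using a stack of bound names; the index of a variable is the
-- number of binders between its use and its binder.
toDB :: [String] -> Expr -> DBExpr
toDB _     (Num n)       = DNum n
toDB stack (Var x)       = DVar (fromMaybe (error ("unbound: " ++ x))
                                           (elemIndex x stack))
toDB stack (Plus a b)    = DPlus (toDB stack a) (toDB stack b)
toDB stack (Let x e1 e2) = DLet (toDB stack e1) (toDB (x : stack) e2)

-- toDB [] (Let "a" (Num 5) (Let "y" (Num 2) (Plus (Var "a") (Var "y"))))
--   == DLet (DNum 5) (DLet (DNum 2) (DPlus (DVar 1) (DVar 0)))
```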
Higher Order Abstract Syntax
Higher order abstract syntax relies on the meta language to facilitate binding of variables. Because of this, many problems such as malformed syntax, capture and $\alpha$-equivalence are solved, since they are handled by the meta language
For example, a term language $t$ can be defined as
t ::= Symbol -- symbols
| x -- variables
| t1 t2 -- application
| x. t -- binding or abstraction
$\lambda$-Calculus
Lambda calculus is a system using mathematical logic for expressing computation based on function abstraction and application using variable binding and substitution
The term language $t$ above can be slightly changed:
t ::= Symbol -- symbols
| x -- variables
| t1 t2 -- application
| λx. t -- lambda abstraction
$\beta$-reduction
- Beta reduction is used to evaluate terms, for example the function $(\lambda x . t)$ below denotes a function that given an argument for $x$, will return $t$ \((\lambda x . t) \ u \quad \mapsto_\beta \quad t[x := u]\)
- $\beta$-reduction is a congruence, meaning that we can pick any reducible subexpression (called a redex) and perform a $\beta$-reduction. For example \(\begin{array}{ll} (\lambda x . \lambda y . f (y \ x)) \ 5 \ (\lambda x . x) & \mapsto_\beta \quad (\lambda y . f \ (y \ 5)) \ (\lambda x . x) \\ & \mapsto_\beta \quad f \ ((\lambda x. x) \ 5) \\ & \mapsto_\beta \quad f \ 5 \end{array}\)
$\eta$-conversion
- If we have a term $f$ and apply it to a variable, and then abstract over that same variable, we end up with the original term
- For example $\lambda x . (f \ x) \mapsto_{\beta \eta} f$, or in Haskell, `(\x -> abs x)` is the same as `abs`
Confluence
- Suppose we arrive via one reduction path to an expression that cannot be reduced further (known as normal form), then any other reduction path will result in the same normal form
Equivalence
- Two terms $s$ and $t$ are $\alpha \beta$-equivalent, written $s \equiv_{\alpha \beta} t$, if they $\beta$-reduce to $\alpha$-equivalent normal forms
Semantics
Static Semantics
- Aspects of a program that can be determined by a compiler or external tool (linter) without running the program
- Two main aspects include types and scope
Dynamic Semantics
- Can be specified in many ways:
- Denotational semantics is the compositional construction of a mathematical object for each form of syntax
- Axiomatic semantics is the construction of a proof calculus to allow correctness of a program to be verified
- Operational semantics is used in this course and is the construction of a program-evaluating state machine or transition system
- Two main aspects of dynamic semantics include behaviour and cost
- Operational semantics has two main kinds (a Haskell sketch of both appears after this list)
- Small Step
- A judgement that specifies transitions between states: $e \mapsto e'$
- We need a set of states $\Sigma$, a set of initial states $I \sube \Sigma$, a set of final states $F \sube \Sigma$ and a relation $\mapsto \sube \Sigma \times \Sigma$ which specifies only “one step” of the execution
- For example, where $\Sigma$ and $I$ are the set of all closed expressions ${ e \ | \ e \ \bold{ok} }$ and $F$ is the set of evaluated expressions ${ (Num \ n) \ | \ n \in \mathbb{Z} }$ \(\dfrac{e_1 \mapsto e'_1}{(Plus \ e_1 \ e_2) \mapsto (Plus \ e'_1 \ e_2)} \\ \dfrac{e_2 \mapsto e'_2}{(Plus \ (Num \ n) \ e_2) \mapsto (Plus \ (Num \ n) \ e'_2)} \\ \dfrac{}{(Plus \ (Num \ n) \ (Num \ m)) \mapsto (Num \ (n + m))}\)
- Big Step
- One big judgement relating expressions to their values: $e \Downarrow v$
- We need a set of evaluable expressions $E$, a set of values $V$ and a relation $\Downarrow \sube E \times V$
- For example, where $E$ is the set of all closed expressions ${e \ | \ e \ \bold{ok} }$, and $V$ is the set of integers $\mathbb{Z}$ \(\begin{array}{cc} \dfrac{}{(Num \ n) \Downarrow n} & \dfrac{e_1 \Downarrow v_1 \quad e_2[x := (Num \ v_1)] \Downarrow v_2}{(Let \ e_1 (x. e_2)) \Downarrow v_2} \\ \\ \dfrac{e_1 \Downarrow v_1 \quad e_2 \Downarrow v_2}{(Plus \ e_1 \ e_2) \Downarrow (v_1 + v_2)} & \dfrac{e_1 \Downarrow v_1 \quad e_2 \Downarrow v_2}{(Times \ e_1 \ e_2) \Downarrow (v_1 \times v_2)} \end{array}\)
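A minimal Haskell sketch of both styles for the Num/Plus fragment shown above; the function names are assumptions.

```haskell
data Expr = Num Int | Plus Expr Expr
  deriving Show

-- Small step: one transition e |-> e', or Nothing if e is a final state
step :: Expr -> Maybe Expr
step (Num _)                = Nothing                          -- final state
step (Plus (Num n) (Num m)) = Just (Num (n + m))
step (Plus (Num n) e2)      = Plus (Num n) <$> step e2         -- reduce right
step (Plus e1 e2)           = (\e1' -> Plus e1' e2) <$> step e1 -- reduce left

-- Reflexive, transitive closure: keep stepping until a final state
evalSmall :: Expr -> Expr
evalSmall e = maybe e evalSmall (step e)

-- Big step: relate an expression directly to its value
evalBig :: Expr -> Int
evalBig (Num n)      = n
evalBig (Plus e1 e2) = evalBig e1 + evalBig e2

-- evalSmall (Plus (Num 2) (Num 3))  ==  Num 5
-- evalBig   (Plus (Num 2) (Num 3))  ==  5
```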
Evaluation Strategies
- Call-by-value or strict evaluation, shown above, is where an expression is evaluated and bound to a corresponding variable
- Call-by-name is when the unevaluated expression is substituted for the variable, and only evaluated when it is needed
- Call-by-need or lazy evaluation is like call-by-name, except an expression is computed at most once, only if its value is needed, and the result is shared
Equivalence
- Small step semantics are lower level, they clearly specify the order of evaluation. Big step semantics gives us a result without telling us explicitly how it was computed
- Small and big step semantics can be proven to be equivalent, that is if there exists a trace $e \mapsto \dots{} \mapsto (Num \ n)$, then $e \Downarrow n$, and vice versa
- Let $\overset{*}{\mapsto}$ be the reflexive, transitive closure of $\mapsto$, then \(\begin{array}{ll} \dfrac{}{e \overset{*}{\mapsto} e} & \dfrac{e_1 \mapsto e_2 \ \ \ \ \ e_2 \overset{*}{\mapsto} e_n}{e_1 \overset{*}{\mapsto} e_n} \end{array}\) Now we can formally state the property of equivalence as $e \overset{*}{\mapsto} (Num \ n) \iff e \Downarrow n$
Functional Programming
A programming language derived from or inspired by the $\lambda$-calculus, or derived from or inspired by another functional programming language
Purely Functional
- A programming language is purely functional if $\beta$-reduction is confluent
- In other words, functions have to be mathematical functions, and free of side effects
Cost Models
Operational Cost Models
- First, we define a program-evaluating abstract machine. We can determine the time cost by counting the number of steps taken by the abstract machine
- One step in the abstract machine should always be $\mathcal{O} (1)$
Control Flow
To define an abstract machine where all the rules are axioms, we use a stack.
Stack
\[\begin{array}{l} \dfrac{}{\circ \ \textbf{Stack}} \quad \dfrac{f \ \textbf{Frame} \quad s \ \textbf{Stack}}{f \triangleright s \ \textbf{Stack}} \end{array}\]

- Initial states are those that start evaluating an expression from an empty stack, i.e. $\circ \succ e$
- Final states are those that return a value to the empty stack, i.e. $\circ \prec v$
- Stack frames are expressions with a hole in place of a value, e.g. $\text{Plus} \ v_1 \ \square$
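A small Haskell sketch of such a frame-stack machine for the Num/Plus language; the `State`, `Frame` and `step` names are assumptions.

```haskell
data Expr  = Num Int | Plus Expr Expr

data Frame = PlusL Expr   -- (Plus [] e2): e2 still to be evaluated
           | PlusR Int    -- (Plus v1 []): v1 already evaluated
type Stack = [Frame]

-- A state either evaluates an expression (s > e) or returns a value (s < v)
data State = Eval Stack Expr | Ret Stack Int

-- Every rule of the machine is an axiom, so each step is O(1)
step :: State -> Maybe State
step (Eval s (Num n))        = Just (Ret s n)
step (Eval s (Plus e1 e2))   = Just (Eval (PlusL e2 : s) e1)
step (Ret (PlusL e2 : s) v1) = Just (Eval (PlusR v1 : s) e2)
step (Ret (PlusR v1 : s) v2) = Just (Ret s (v1 + v2))
step (Ret [] _)              = Nothing    -- value returned to the empty stack

run :: Expr -> Int
run e = go (Eval [] e)
  where
    go (Ret [] v) = v                     -- final state
    go st         = maybe (error "stuck") go (step st)

-- run (Plus (Plus (Num 2) (Num 3)) (Num 4)) == 9, mirroring the trace below
```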
Example
\[\begin{array}{ll} & \circ \succ (\text{Plus (Plus (Num 2) (Num 3)) (Num 4)}) \\ \mapsto & \text{(Plus} \ \square \ \text{(Num 4))} \triangleright \circ \succ \text{(Plus (Num 2) (Num 3))} \\ \mapsto & \text{(Plus} \ \square \ \text{(Num 3))} \triangleright \text{(Plus} \ \square \ \text{(Num 4))} \triangleright \circ \succ \text{(Num 2)} \\ \mapsto & \text{(Plus} \ \square \ \text{(Num 3))} \triangleright \text{(Plus} \ \square \ \text{(Num 4))} \triangleright \circ \prec 2 \\ \mapsto & \text{(Plus 2} \ \square \ \text{)} \triangleright \text{(Plus} \ \square \ \text{(Num 4))} \triangleright \circ \succ \text{(Num 3)} \\ \mapsto & \text{(Plus 2} \ \square \ \text{)} \triangleright \text{(Plus} \ \square \ \text{(Num 4))} \triangleright \circ \prec 3 \\ \mapsto & \text{(Plus} \ \square \ \text{(Num 4))} \triangleright \circ \prec 5 \\ \mapsto & \text{(Plus 5} \ \square \ \text{)} \triangleright \circ \succ \text{(Num 4)} \\ \mapsto & \text{(Plus 5} \ \square \ \text{)} \triangleright \circ \prec 4 \\ \mapsto & \circ \prec 9 \\ \end{array}\]

Environments
- An environment is a context containing equality assumptions about variables
- It can be viewed like a state that maps variables to their values, except that a variable’s value does not change over time
Safety and Liveness
Safety
A safety property states that something bad does not happen. For example:
I will never run out of money
Liveness
A liveness property states that something good will happen. For example
If I start saving money now, eventually I will be rich
Type Safety
A property that states that well-typed programs do not go wrong
Progress and Preservation
To prove that a well-typed program either goes on forever or reaches a final state, we use the following two lemmas
- Progress
- Well-typed states are not stuck states. That is, if an expression $e : \tau$ then either $e$ is a final state or there exists a state $e'$ such that $e \mapsto e'$
- Preservation
- Evaluating one step preserves types. That is, if an expression $e : \tau$ and $e \mapsto e'$, then $e' : \tau$
For example:
\[e_1 : \tau \mapsto \underbrace{e_2: \tau}_{\text{preservation}} \overbrace{\mapsto e_3 }^{\text{progress}} : \tau \mapsto \dots{}\]

Composite Data Types
Unit Types - $()$
A type that has just one value - $\textbf{1}$
Product Types - $\tau_1 \times \tau_2$
Data that is stored in one value as a pair. “$\tau_1$ and $\tau_2$”
\[\dfrac{e : \tau_1 \times \tau_2}{\text{fst} \ e : \tau_1} \quad \dfrac{e : \tau_1 \times \tau_2}{\text{snd} \ e : \tau_2}\]

Sum Types - $\tau_1 + \tau_2$
Data that may be one of two forms. “$\tau_1$ or $\tau_2$”
\[\dfrac{e : \tau_1}{\text{InL} \ e : \tau_1 + \tau_2} \quad \dfrac{e : \tau_2}{\text{InR} \ e : \tau_1 + \tau_2} \quad \dfrac{e : \tau_1 + \tau_2 \qquad x : \tau_1, e_1 : \tau \qquad y : \tau_2, e_2 : \tau}{(\textbf{case} \ e \ \textbf{of} \ \text{InL} \ x \to e_1; \ \text{InR} \ y \to e_2) : \tau}\]

Empty Type - $0$
A type that cannot be constructed, and will never be evaluated
Recursive Types - $\textbf{rec} \ t. \tau$
A type that allows reference to the entire type recursively
\[\dfrac{e : \tau [t := \textbf{rec} \ t. \tau]}{\text{roll} \ e : \textbf{rec} \ t. \tau} \quad \dfrac{e : \textbf{rec} \ t. \tau}{\text{unroll} \ e : \tau [t := \textbf{rec} \ t. \tau]}\]
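For illustration, integer lists can be viewed as $\textbf{rec} \ t. \ 1 + (\text{Int} \times t)$. A Haskell sketch making roll and unroll explicit with a newtype follows; the names are assumptions, and since Haskell types are already recursive the newtype only makes the rolling visible.

```haskell
-- rec t. 1 + (Int * t), i.e. lists of Int, with explicit roll/unroll.
-- Either () (Int, IntList) is the one-step unrolling of the recursive type.
newtype IntList = Roll { unroll :: Either () (Int, IntList) }

nil :: IntList
nil = Roll (Left ())

cons :: Int -> IntList -> IntList
cons x xs = Roll (Right (x, xs))

-- Consuming a value requires an unroll before pattern matching
total :: IntList -> Int
total xs = case unroll xs of
  Left ()       -> 0
  Right (x, ys) -> x + total ys

-- total (cons 1 (cons 2 nil)) == 3
```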
Curry-Howard Correspondence

\[\begin{array}{ll} \times & \land \\ \text{+} & \lor \\ \to & \implies \\ 1 & \top \\ 0 & \bot \\ \forall & \forall \end{array}\]
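As an illustrative sketch of the correspondence, a program of a given type is a proof of the matching proposition; the hypothetical function below proves $(A \land B) \lor (A \land C) \implies A \land (B \lor C)$.

```haskell
-- (A * B) + (A * C)  ->  A * (B + C)
-- read as: (A and B) or (A and C) implies A and (B or C)
distribute :: Either (a, b) (a, c) -> (a, Either b c)
distribute (Left  (a, b)) = (a, Left b)
distribute (Right (a, c)) = (a, Right c)
```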
Polymorphism

A way to specify that we don’t care what data type is used (also known as generics)
Type Abstraction
The ability to define functions regardless of specific types.
swap = type a. type b.
recfun swap :: (a, b) -> (b, a) x = (snd x, fst x)
Type Application
The ability to instantiate polymorphic functions to specific types.
swap@Int@Bool (3, True)
Generality
- A type $\tau$ is more general than a type $\rho$, often written $\rho \sqsubseteq \tau$, if type variables in $\tau$ can be instantiated to the given type $\rho$
- For example: $\text{Int} \to \text{Int} \sqsubseteq \forall z. z \to z \sqsubseteq \forall x \ y. x \to y \sqsubseteq \forall a. a$
Implementation
Template Instantiation
Automatically generate a monomorphic copy of each polymorphic function based on the types applied to it.
For example for a type application like swap@Int@Bool
swap = type a. type b.
recfun swap :: (a, b) -> (b, a) x = (snd x, fst x)
would be replaced statically by
swap = recfun swap :: (Int, Bool) -> (Bool, Int) x = (snd x, fst x)
Advantages | Disadvantages |
---|---|
Little to no run-time cost | Large binary size if many instantiations are used |
Easy to implement | Can lead to long compilation times |
Boxing
All data types are represented as a pointer to a data structure on the heap, where all pointers are exactly 32 or 64 bits large.
Parametricity
The principle of parametricity states that the result of polymorphic functions cannot depend on values of an abstracted type.
More formally, given a polymorphic function $g$ that takes a type parameter: if a function $f : \tau \to \tau$ is run on some values of type $\tau$ and then $g\text{@}\tau$ is run on the result, this gives the same result as running $g\text{@}\tau$ first and then $f$.
Examples
head :: [a] -> a
-- parametric example
f (head list) = head (map f list)
concat :: [[a]] -> [a]
-- parametric example
map f (concat list) = concat (map (map f) list)
Unification
A substitution $S$ of types for unification variables is a unifier of two types $\tau$ and $\rho$ if $S\tau = S\rho$.
Furthermore, it is the most general unifier of $\tau$ and $\rho$ if there is no other unifier $S'$ where $S\tau \sqsubseteq S'\tau$.
We write $\tau \stackrel{U}{\sim} \rho$ if $U$ is the most general unifier of $\tau$ and $\rho$.
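A small Haskell sketch of first-order unification for a type language like the one in the examples below; the `Type` representation, function names and error handling are assumptions.

```haskell
-- Unification variables are strings; Prod, Sum and Arrow are type constructors
data Type
  = TVar String
  | TInt
  | TBool
  | Prod  Type Type
  | Sum   Type Type
  | Arrow Type Type
  deriving (Eq, Show)

type Subst = [(String, Type)]

-- Apply a substitution, following chains of variable bindings
apply :: Subst -> Type -> Type
apply s (TVar v)    = maybe (TVar v) (apply s) (lookup v s)
apply s (Prod a b)  = Prod  (apply s a) (apply s b)
apply s (Sum a b)   = Sum   (apply s a) (apply s b)
apply s (Arrow a b) = Arrow (apply s a) (apply s b)
apply _ t           = t

-- Does variable v occur in type t? (used for the occurs check)
occurs :: String -> Type -> Bool
occurs v (TVar x)    = v == x
occurs v (Prod a b)  = occurs v a || occurs v b
occurs v (Sum a b)   = occurs v a || occurs v b
occurs v (Arrow a b) = occurs v a || occurs v b
occurs _ _           = False

-- Most general unifier of two types, or Nothing if none exists
unify :: Type -> Type -> Maybe Subst
unify (TVar v) t = bind v t
unify t (TVar v) = bind v t
unify TInt  TInt  = Just []
unify TBool TBool = Just []
unify (Prod a b)  (Prod c d)  = unifyBoth a b c d
unify (Sum a b)   (Sum c d)   = unifyBoth a b c d
unify (Arrow a b) (Arrow c d) = unifyBoth a b c d
unify _ _ = Nothing                  -- constructor clash, as in example 3

unifyBoth :: Type -> Type -> Type -> Type -> Maybe Subst
unifyBoth a b c d = do
  s1 <- unify a c
  s2 <- unify (apply s1 b) (apply s1 d)
  Just (s2 ++ s1)

bind :: String -> Type -> Maybe Subst
bind v t
  | t == TVar v = Just []
  | occurs v t  = Nothing            -- occurs check fails, as in example 4
  | otherwise   = Just [(v, t)]
```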
Examples
\[\begin{array}{lll} 1) & \alpha \times (\alpha \times \alpha) \sim \beta \times \gamma & \alpha = \beta, \gamma = (\alpha \times \alpha) \\ & \alpha \times (\alpha \times \alpha) \sim \alpha \times (\alpha \times \alpha) \\ \\ 2) & (\alpha \times \alpha) \times \beta \sim \beta \times \gamma & \beta = (\alpha \times \alpha), \gamma = \beta \\ & (\alpha \times \alpha) \times (\alpha \times \alpha) \sim (\alpha \times \alpha) \times (\alpha \times \alpha) \\ \\ 3) & \text{Int} + \alpha \sim \alpha + \text{Bool} \\ & \text{TypeError} \\ \\ 4) & (\alpha \times \alpha) \times \alpha \sim \alpha \times (\alpha \times \alpha) \\ & \text{No Unifier} \\ \end{array}\]

Linear Types
$A \multimap B \quad$ $A$ can be transformed into $B$
$A \otimes B \quad$ You’ve got both $A$ and $B$
$A \oplus B \quad$ You’ve got either $A$ or $B$
$A \text{ \& } B \quad$ You can pick from $A$ or $B$
$!A \qquad$ You’ve got an unlimited amount of $A$
Abstract Data Types
A type defined not by its internal representation, but by the operations that can be performed on it.
Existential Types
$\forall a. \tau$ | $\exists a . \tau$ |
---|---|
When producing a value, $a$ is an arbitrary, unknown type. When consuming a value, $a$ may be instantiated to any desired type. | When consuming a value, $a$ is an arbitrary, unknown type. When producing a value, $a$ may be instantiated to any desired type. |
Bag ADT Example
- Specified by three operations
  - `emptyBag` - $\mathcal{B}$
  - `addToBag` - $(\mathcal{B} \to \text{Int} \to \mathcal{B})$
  - `average` - $(\mathcal{B} \to \text{Int})$
- The type for this ADT is $\exists \mathcal{B} . \mathcal{B} \times (\mathcal{B} \to \text{Int} \to \mathcal{B}) \times (\mathcal{B} \to \text{Int})$
Overloading
Type Classes
- A set of types for which implementations have been provided for various functions, called methods
- For example, in Haskell the types `Int`, `Float` etc. are all instances of the type class `Num`, which has methods such as `+`, `negate` etc.
- We write $f :: \forall a . P \Rightarrow \tau$ to indicate that $f$ has the type $\tau$ where $a$ can be instantiated to any type under the condition that the constraint $P$ is satisfied. Typically, $P$ is a list of instance constraints, such as `Num a` or `Eq b`
- For example, $(+) :: \forall a . (\text{Num} \ a) \Rightarrow a \to a \to a$
Dictionaries and Resolution
- The type checker will convert ad-hoc polymorphism (type classes) to parametric polymorphism
- Type classes are converted to types

class Eq a where
  (==) : a -> a -> Bool
  (/=) : a -> a -> Bool

-- becomes

type EqDict a = (a -> a -> Bool, a -> a -> Bool)

- Instances become values of the dictionary type

instance Eq Bool where
  True  == True  = True
  False == False = True
  _     == _     = False
  a /= b = not (a == b)

-- becomes

True  ==(Bool) True  = True
False ==(Bool) False = True
_     ==(Bool) _     = False
a /=(Bool) b = not (a ==(Bool) b)

eqBoolDict = ((==(Bool)), (/=(Bool)))

- Programs that rely on overloading now take dictionaries as parameters
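For instance, a hypothetical overloaded function `same x y z = x == y && y == z` would, after dictionary translation, take the dictionary explicitly; this is a sketch, not the course's exact translation.

```haskell
type EqDict a = (a -> a -> Bool, a -> a -> Bool)

-- same :: Eq a => a -> a -> a -> Bool
-- same x y z = x == y && y == z
-- after dictionary translation becomes:
same :: EqDict a -> a -> a -> a -> Bool
same (eq, _) x y z = eq x y && eq y z

-- a call site supplies the concrete dictionary, e.g.
-- same eqBoolDict True True False == False
```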
Subtyping
To add subtyping to a language, we define a partial order on types $\tau \le \rho$ and a rule of subsumption
\[\dfrac{e : \tau \qquad \tau \le \rho}{e : \rho}\]

What this partial order means is up to the language. There are two main approaches
- Most common: where upcasts do not have dynamic behaviour
- Uncommon: where upcasts cause a coercion to occur, actually converting the value from $\tau$ to $\rho$ at runtime
Product Types
$\dfrac{\tau_1 \le \rho_1 \qquad \tau_2 \le \rho_2}{(\tau_1 \times \tau_2) \le (\rho_1 \times \rho_2)}$
Example: $(\text{Int} \times \text{Int}) \le (\text{Float} \times \text{Int})$
Sum Types
$\dfrac{\tau_1 \le \rho_1 \qquad \tau_2 \le \rho_2}{(\tau_1 + \tau_2) \le (\rho_1 + \rho_2)}$
Example: $(\text{Int} + \text{Float}) \le (\text{Float} + \text{Float})$
Functions
$\dfrac{\rho_1 \le \tau_1 \qquad \tau_2 \le \rho_2}{(\tau_1 \to \tau_2) \le (\rho_1 \to \rho_2)}$
Example: $(\text{Float} \to \text{Int}) \le (\text{Int} \to \text{Float})$
Variance
- The way a type constructor ($+, \times, \to$) interacts with subtyping is called its variance
- For a type constructor $C$ and $\tau \le \rho$
- If $C \tau \le C \rho$ then $C$ is covariant
- If $C \rho \le C \tau$ then $C$ is contravariant
- If it is neither covariant nor contravariant, then it is called invariant
Parallelism
Parallelism is the simultaneous execution of code on multiple processors or cores for the purposes of improved performance. E.g. simulations, neural networks.
Concurrency is an abstraction for the programmer, allowing programs to be structured as multiple threads of control, called processes. E.g. servers, GUI applications, kernels.
Types
- Data Parallelism
- Also known as single instruction, multiple data (SIMD)
- The same code is executed on multiple processing units, working on different parts of a large data set
- Examples include GPUs
- Control Parallelism
- Also known as task parallelism or multiple instruction, multiple data (MIMD)
- Different programs are executed on different processing units
- Examples include CPUs
Cost Models
- Work cost is the overall number of steps required to execute the program. It corresponds to time cost on a processor with just one core
- Depth cost is the minimal number of parallel steps required to execute the program. It corresponds to the time cost on a processor with unlimited cores
- $\llbracket e \rrbracket_\mathcal{W}$ denotes work cost and $\llbracket e \rrbracket_\mathcal{D}$ denotes depth cost for an expression $e$
- For sequential operations the two measures coincide. For example:
- $\llbracket \text{len} \ e \rrbracket_\mathcal{W} = \llbracket e \rrbracket_\mathcal{W} + 1$
- $\llbracket \text{len} \ e \rrbracket_\mathcal{D} = \llbracket e \rrbracket_\mathcal{D} + 1$
- For parallel operations, we get different results for each measure. For example:
- $\llbracket e_r | x_1 \leftarrow e_1 | \dots{} | x_n \leftarrow e_n \rrbracket_\mathcal{W} = \llbracket e_r \rrbracket_\mathcal{W} + \sum_i \llbracket e_i \rrbracket_\mathcal{W}$
- $\llbracket e_r | x_1 \leftarrow e_1 | \dots{} | x_n \leftarrow e_n \rrbracket_\mathcal{D} = \llbracket e_r \rrbracket_\mathcal{D} + \text{max}_i \llbracket e_i \rrbracket_\mathcal{D}$
Critical Sections
A critical section is a group of multiple steps that we want to execute as one single atomic step.
We want to ensure two main properties:
- Mutual exclusion: no two processes are in their critical section at the same time
- Eventual entry: once it enters its pre-protocol, a process will eventually be able to execute its critical section
Locks
A lock is a common means of concurrency control. It is typically abstracted into two operations
- Taking the lock
- Releasing the lock
Software Transactional Memory
Each transaction is executed optimistically, rather than pessimistically (as with locks).
Each transaction has an associated log, which contains
- The values written to any `TVar`s with `writeTVar`
- The values read from any `TVar`s with `readTVar`, consulting earlier log entries first
First the log is validated, and, if validation succeeds, changes are committed. Validation and commit are one atomic step. If validation fails, the transaction is re-run.
Example
withdraw :: Account -> Int -> STM ()
withdraw acc amt = do
  balance <- readTVar acc
  if amt > 0 && amt > balance
    then retry
    else writeTVar acc (balance - amt)
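A sketch of how such transactions compose and are run, assuming `Account` is a `TVar Int` (as suggested by the use of `readTVar` above); `deposit`, `transfer` and the account setup are illustrative assumptions.

```haskell
import Control.Concurrent.STM

type Account = TVar Int

-- withdraw as defined above
withdraw :: Account -> Int -> STM ()
withdraw acc amt = do
  balance <- readTVar acc
  if amt > 0 && amt > balance
    then retry
    else writeTVar acc (balance - amt)

deposit :: Account -> Int -> STM ()
deposit acc amt = do
  balance <- readTVar acc
  writeTVar acc (balance + amt)

-- Two transactions compose into one larger transaction:
-- the transfer commits as a whole or not at all
transfer :: Account -> Account -> Int -> STM ()
transfer from to amt = do
  withdraw from amt
  deposit  to   amt

main :: IO ()
main = do
  a <- atomically (newTVar 100)
  b <- atomically (newTVar 0)
  atomically (transfer a b 30)   -- validation and commit are one atomic step
  readTVarIO a >>= print         -- 70
  readTVarIO b >>= print         -- 30
```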
Progress
- Traditional deadlock scenarios are impossible, as is cyclic restarting where two transactions constantly cancel each other out
- Starvation is possible, however uncommon in practice
Database Guarantees
- Atomicity: each transaction should be ‘all or nothing’
- Consistency: each transaction in the future sees the effects of transactions in the past
- Isolation: the transaction’s effect on the state cannot be affected by other transactions
- Durability: the transaction’s effect on the state survives power outages and crashes