Written by Luka Kerr on November 30, 2020


Theorem Provers

A theorem prover is the implementation of a formal logic on a computer. It can be:

It also:

Theorem proving allows us to analyse systems and programs thoroughly, find design and specification errors early and give assurance that programs are correct.

This course will use Isabelle – a generic interactive proof assistant. This means it is:

Although, just because you prove something on a computer, does not make it necessarily correct. This is because:

Meta Logic

The meta logic is a logic used to formalize another logic. Isabelle has a meta logic, specifically:


The $\lambda$-calculus was created by Alonzo Church. It was originally meant as foundation of mathematics, is important applications in theoretical computer science and is the foundation of computability and functional programming.

Untyped $\lambda$-Calculus

The untyped $\lambda$-calculus is a turing complete model of computation that provides a simple way of writing down functions.

For example, instead of writing f(x) = x + 5, we write $f = \lambda x. \ x + 5$ which is a term and a nameless function that adds 5 to its parameter.

Function Application

To apply arguments to functions, we write $f \ a$, for example $(\lambda x. \ x + 5) \ a$. To evaluate this term, we replace $x$ by $a$.

Formal Specification

Formally, the untyped lambda calculus can be defined as:

\[\begin{array}{lclr} t & ::= & v \ | \ c \ | \ (t \ t) \ | \ (\lambda x. \ t) & (\text{terms}) \\ v, x & \in & V & (\text{variables}) \\ c & \in & C & (\text{constants}) \\ \end{array}\]




The process of $\beta$-reduction is simply to continuously replace the parameter by argument until you reach a point where you cannot anymore, known as the normal form.

$\beta$-conversion is defined as $s =_\beta t$ iff $\exists n. s \longrightarrow_\beta^\ast n \land t \longrightarrow_\beta^\ast n$, where $\longrightarrow_\beta^\ast$ is the transitive, reflexive closure of $\longrightarrow_\beta$.

A term $t$ is reducible if there is an $s$ such that $t \longrightarrow_\beta s$.


$s =_\eta t$ iff $\exists n. s \longrightarrow_\eta^\ast n \land t \longrightarrow_\eta^\ast n$, where $\longrightarrow_\beta^\ast$ is the transitive, reflexive closure of $\longrightarrow_\beta$.

Free Variables

In $(\lambda x. \ t)$, $x$ is a bound variable.

The free variables of a term are:

A term $t$ is called closed if $FV(t) = \{ \}$.


When substituting, if there are name collisions we simply rename bound variables.


$s =_\alpha t$ means $s = t$ up to renaming of bound variables.


Also, $s =_\alpha t$ iff $s \longrightarrow_\alpha^\ast t$, where $\longrightarrow_\alpha^\ast$ is the transitive, reflexive closure of $\longrightarrow_\alpha$.

Typed $\lambda$-Calculus


\[\begin{array}{lclr} t & ::= & v \ | \ c \ | \ (t \ t) \ | \ (\lambda x. \ t) & (\text{terms}) \\ v, x & \in & V & (\text{variables}) \\ c & \in & C & (\text{constants}) \\ \end{array}\]


\[\begin{array}{lclr} \tau & ::= & b \ | \ v \ | \ \tau \Rightarrow \tau & (\text{types}) \\ b & \in & \{ \texttt{bool}, \texttt{int}, \dots \} & (\text{base types}) \\ v & \in & \{ \alpha, \beta, \dots \} & (\text{type variables}) \\ \end{array}\]

Context $\Gamma$: a function from variable and constant names to types.

A term $t$ has a type $\tau$ in context $\Gamma$: $\Gamma \vdash t :: \tau$. Such a term is well typed if there are $\Gamma$ and $\tau$ such that $\Gamma \vdash t :: \tau$ holds.

Type Checking Rules


$\dfrac{}{\Gamma \vdash x :: \Gamma(x)}$


$\dfrac{\Gamma \vdash t_1 :: \tau_2 \Rightarrow \tau \quad \Gamma \vdash t_2 :: \tau_2}{\Gamma \vdash (t_1 \ t_2) :: \tau}$


$\dfrac{\Gamma[x \gets \tau_x] \vdash t :: \tau}{\Gamma \vdash (\lambda x. \ t) :: \tau_x \Rightarrow \tau}$

Most General Types

Each type correct term has a most general type. We have

\[\Gamma \vdash t :: \tau \Longrightarrow \exists \sigma . \Gamma \vdash t :: \sigma \land (\forall \sigma^\prime. \Gamma \vdash t :: \sigma^\prime \Longrightarrow \sigma^\prime \lesssim \sigma)\]

Subject Reduction

Well typed terms stay well typed during $\beta$-reduction:

\[\Gamma \vdash s :: \tau \land s \longrightarrow_\beta t \Longrightarrow \Gamma \vdash t :: \tau\]

Natural Deduction


\[\begin{array}{ccc} \rule[-4ex]{0pt}{10ex} \dfrac{A \quad B}{A \land B} \texttt{conjI} & \dfrac{A}{A \lor B} \texttt{disjI1} & \dfrac{B}{A \lor B} \texttt{disjI2} \\ \rule[-4ex]{0pt}{10ex} \dfrac{A \land B \quad \llbracket A; B \rrbracket \implies C}{C} \texttt{conjE} & \dfrac{A \implies B}{A \longrightarrow B} \texttt{impI} & \dfrac{A \implies False}{\neg A} \texttt{notI} \\ \rule[-4ex]{0pt}{10ex} \dfrac{A \lor B \quad A \implies C \quad B \implies C}{C} \texttt{disjE} & \dfrac{A \longrightarrow B \quad A \quad B \implies C}{C} \texttt{impE} & \dfrac{\neg A \quad A}{P} \texttt{notE} \\ \rule[-4ex]{0pt}{10ex} \dfrac{A \implies B \quad B \implies A}{A = B} \texttt{iffI} & \dfrac{A = B}{A \implies B} \texttt{iffD1} & \dfrac{A = B}{B \implies A} \texttt{iffD2} \\ \rule[-4ex]{0pt}{10ex} \dfrac{A = B \quad \llbracket A \longrightarrow B; B \longrightarrow A \rrbracket \implies C}{C} \texttt{iffE} & \dfrac{}{True} \texttt{TrueI} & \dfrac{False}{P} \texttt{FalseE} \\ \rule[-4ex]{0pt}{10ex} \dfrac{}{t = t} \texttt{refl} & \dfrac{s = t}{t = s} \texttt{sym} & \dfrac{r = s \quad s = t}{r = t} \texttt{trans} \\ \rule[-4ex]{0pt}{10ex} \dfrac{s = t \quad P \ s}{P \ t} \texttt{subst} & \dfrac{\neg A \implies False}{A} \texttt{ccontr} & \dfrac{\neg A \implies A}{A} \texttt{classical} \\ \rule[-4ex]{0pt}{10ex} \dfrac{}{P = True \lor P = False} \texttt{T-or-F} & \dfrac{}{P \log \neg P} \texttt{excluded-middle} & \dfrac{\bigwedge x. \ P \ x}{\forall x. \ P \ x} \texttt{allI} \\ \rule[-4ex]{0pt}{10ex} \dfrac{\forall x. \ P \ x \quad P \ ?x \implies R}{R} \texttt{allE} & \dfrac{\exists x. \ P \ x \quad \bigwedge x. \ P \ x \implies R}{R} \texttt{exE} & \dfrac{P \ ?x}{\exists x. \ P \ x} \texttt{exI} \\ \rule[-4ex]{0pt}{10ex} \dfrac{P \land Q}{P} \texttt{conjunct1} & \dfrac{P \land Q}{Q} \texttt{conjunct2} & \dfrac{P \longrightarrow Q \quad P}{Q} \texttt{mp} \\ \rule[-4ex]{0pt}{10ex} \dfrac{\forall x. \ P \ x}{P \ ?x} \texttt{spec} & & \end{array}\]

Safe & Unsafe Rules

Safe rules preserve provability: conjI, impI, notI, iffI, refl, ccontr, classical, conjE, disjE, allI, exE.

Unsafe rules can turn a provable goal into an unprovable one: disjI1, disjI2, impE, iffD1, iffD2, notE, allE, exI.

Safe rules should always be applied before unsafe ones.

Higher Order Logic

Default Types:

Default Constants:



\[\begin{array}{ccc} \rule[-4ex]{0pt}{10ex} \dfrac{}{P = True \lor P = False} \texttt{T-or-F} & \dfrac{}{t = t} \texttt{refl} & \dfrac{s = t \quad P \ s}{P \ t} \texttt{subst} \\ \rule[-4ex]{0pt}{10ex} \dfrac{}{(P \longrightarrow Q) \longrightarrow (Q \longrightarrow P) \longrightarrow (P = Q)} \texttt{iff} & \dfrac{P \longrightarrow Q \quad P}{Q} \texttt{mp} & \dfrac{P \implies Q}{P \longrightarrow Q} \texttt{impI} \\ \rule[-4ex]{0pt}{10ex} \dfrac{}{\exists f :: \texttt{ind} \Rightarrow \texttt{ind}. \ \text{inj} \ f \land \neg \text{surj} \ f} \texttt{infty} & \dfrac{\bigwedge x. \ f \ x = g \ x}{(\lambda x. \ f \ x) = (\lambda x. \ g \ x)} \texttt{ext} & \dfrac{P \ ?x}{P (\texttt{SOME} \ x. \ P \ x)} \texttt{someI} \end{array}\]


A typical Isar proof looks like:

  assume formula0
  have formula1 by simp
  have formulan by blast
  show formula(n + 1) by ...

which proves formula0 implies formula(n + 1).



Backward & Forward

General case: from A1 ... An have R proof

Fix & Obtain


Moreover & Ultimately

Term Rewriting

Given a set of equations $l_1 = r_1$, $l_2 = r_2$, …, $l_n = r_n$, does equation $l = r$ hold?

The idea of term rewriting is to use equations as reduction rules, i.e. $l_1 \longrightarrow r_1$, $l_2 \longrightarrow r_2$, …, $l_n \longrightarrow r_n$ to help decide whether $l = r$ by deciding $l \overset{\ast}{\longleftrightarrow} r$.

Arrow Definitions

Deciding $l \overset{\ast}{\longleftrightarrow} r$

To decide whether $l \overset{\ast}{\longleftrightarrow} r$ holds, we look for an $n$ such that $l \overset{\ast}{\longrightarrow} n$ and $r \overset{\ast}{\longrightarrow} n$.

Although, there may not be a suitable $n$. This decision only works for systems where the Church-Rosser property holds:

\[l \overset{\ast}{\longleftrightarrow} r \implies \exists n. \ l \overset{\ast}{\longrightarrow} n \land r \overset{\ast}{\longrightarrow} n\]


If terms in a system can be rewritten in more than one way to yield the same result, then the system is confluent.


When each rule application makes terms simpler in some way, then $\longrightarrow$ is terminating. More formally, $\longrightarrow$ is terminating when there is a well founded order $<$ on terms for which $s < t$ whenever $t \longrightarrow s$. An order is well founded if there is no infinite decreasing chains $a_1 > a_2 > \dots$.

In practice, it is often easier to consider just the rewrite rules by themselves, rather than their application to an arbitrary term $t$. To do this, we show for each rule $l_i = r_i$, that $r_i < l_i$. This requires a term $u$ to become smaller whenever any subterm of $u$ is made smaller. Formally, this means that $<$ is required to be monotonic with respect to the structure of terms: $s < t \longrightarrow u[s] < u[t]$. This is true for most orders that don’t treat certain parts of terms as special cases.

Term Rewriting In Isabelle

The term rewriting engine in Isabelle is called the simplifier. To use it, we write apply simp. The simplifier uses simplification rules and applies them from left to right until no rule is applicable. Both termination and confluence is not guaranteed.

Equations can be turned into simplication rules by adding the [simp] attribute, or they can be added/deleted locally using apply (simp add: <rules>) or apply (simp del: <rules>). Certain rules can be applied alone using apply (simp only: <rules>).

Applying A Rewrite Rule

A rule $l \longrightarrow r$ is applicable to a term $t[s]$ if there is a substitution $\sigma$ such that $\sigma \ l = s$. The resulting term is $t[\sigma \ r]$, and we have $t[s] = t[\sigma \ r]$.

Conditional Term Rewriting

Rewrite rules can be conditional, where $\llbracket P_1, \dots, P_n \rrbracket \implies l = r$ is applicable to a term $t[s]$ with $\sigma$ if $\sigma \ l = s$ and $\sigma \ P_1, \dots, \sigma \ P_n$ are provable by rewriting.

Rewriting With Assumptions

Isabelle uses assumptions in rewriting, this can lead to non-termination. There are options we can pass to simp to disable this:

Orthogonal Rewriting Systems

A system is orthogonal if it is left-linear and has no critical pairs. Orthogonal rewrite systems are confluent.

Specification Techniques


Type 'a set sets over type 'a.

Introducing Theorems

Introducing Types

Inductive Definitions

The rules

\[\dfrac{a_1 \in X \quad \dots \quad a_n \in X}{a \in X} \ \text{with} \ a_1, \dots, a_n, a \in A\]

define a set $X \subseteq A$.

Given a set of rules $R \subseteq A \ \text{set} \times A$, we can apply $R$ to a set $B$: $\hat{R} \ B \equiv \{ x . \ \exists H. \ (H, x) \in R \land H \subseteq B \}$.

The set $B$ is $R$-closed iff $\hat{R} \ B \subseteq B$, and $X$ is the least $R$-closed subset of A.

Rule Induction

\[\dfrac{\forall(\{ a_1, \dots, a_n \}, a) \in R. \ P \ a_1 \land \dots \land P \ a_n \implies P \ a}{\forall x \in X. \ P \ x}\]


\[\begin{array}{lcl} \mathbf{datatype} \ (\alpha_1, \dots, \alpha_n) \ \tau & = & C_1 \ \tau_{1,1} \dots \tau_{1,n_1} \\ & | & \dots \\ & | & C_k \ \tau_{k,1} \dots \tau_{k,n_k} \end{array}\]

Datatypes must be definable as a set, and can’t have nested recursion.

Every datatype introduces a case construct, where in general there is one case per constructor. The syntax apply (case_tac t) creates $k$ subgoals, one for each constructor.


In HOL, all functions must be total. Primitive recursion functions in Isabelle guarantee termination structurally. A primitive recursive function can be introduced with primrec.


Structural Induction

$P \ xs$ holds for all lists $xs$ if

A general proof method for induction is (induct x), where x must be a free variable in the first subgoal, and the type of x must be a datatype.

Theorems about recursive functions are proved by induction.


Defining a recursive function using fun allows for high expressiveness, but a termination proof may fail. In genera, fun is

Each fun definition induces an induction principle. For each equation, show $P$ holds for the left hand side, provided $P$ holds for each recursive call on the right hand size.

Isabelle tries to prove termination automatically, which works in most cases but may fail. In this case, termination can be proved seperately.




We don’t want to trust the external provers. Need to check/reconstruct proof.


Testing cannot prove theorems, but it can refute conjectures. Usage with quickcheck


Lightweight validation by testing.

Covers a number of testing approaches:


Finite model finder based on SAT. Soundly approximates infinite types. Usage with nitpick.

Hoare Logic

The idea of Hoare Logic is to describe the meaning of program by pre/post conditions. Together, a program with these pre and post conditions is known as a Hoare Triple:

\[\{ P \} \quad C \quad \{ Q \}\]

To prove properties about programs, we have rules that directly work on such triples.

Partial correctness: $\models \{ P \} \ C \ \{ Q \} \equiv \forall \sigma \ \sigma^\prime. \ P \ \sigma \land \langle c, \sigma \rangle \to \sigma^\prime \longrightarrow Q \ \sigma^\prime$.

Total correctness: $\models \{ P \} \ C \ \{ Q \} \equiv (\forall \sigma \ \sigma^\prime. \ P \ \sigma \land \langle c, \sigma \rangle \to \sigma^\prime \longrightarrow Q \ \sigma^\prime) \land (\forall \sigma. \ P \ \sigma \longrightarrow \exists \sigma^\prime. \ \langle c, \sigma \rangle \to \sigma^\prime)$

Soundness: $\vdash \{ P \} \ C \ \{ Q \} \implies \models \{ P \} \ C \ \{ Q \}$.

Soundness can be proved by rule induction on $\vdash \{ P \} \ C \ \{ Q \}$.


For programs with loops, we need to find a correct invariant $P$, with which we annotate the program with so that Hoare rules can be applied automatically.

Weakest Preconditions

\[\mathbf{pre} \ C \ Q = \mathbf{weakest} \ P \ \mathbf{such \ that} \ \{ P \} \ C \ \{ Q \}\]

Verification Conditions

The triple $\{ \mathbf{pre} \ C \ Q \} \ C \ \{ Q \}$ is only true under certain conditions, called verification conditions $vc \ C \ Q$. We also have $vc \ C \ Q \land (P \implies \mathbf{pre} \ C \ Q) \implies \{ P \} \ C \ \{ Q \}$.

Proofs About Programs

Deep Embeddings

A deep embedding is a seperate representation of language terms and their semantics.

This allows for proof of general theorems about the language, not just of programs, usually by structural induction over the syntax type.

A disadvantage is that semantically equivalent programs are not obviously equal, and many concepts already present in the logic must be reinvented.

Shallow Embeddings

A shallow embedding represents only the semantics, directly in the logic. It includes a definition for each language construct, giving its semantics. Programs are represented as instances of these definitions.

A shallow embedding is suitable for (a useful fragment of) C.

State Monad

The state monad can express lots of C ideas:

A tool called AutoCorres produces a verified translation from deeply embedded C to a monadic representation specifically designed for humans to do proofs over.

The state monad models the semantics of a deterministic computation as a function: $s \Rightarrow (\alpha \times s)$. The computation operates over a state of type $s$ which includes all global variables, external devices etc. The computation also yields a return value of type $\alpha$ which models exit statuses, return values etc.

A function return is provided that represents the computation that leaves the state unchanged and returns its argument: $\texttt{return} \ x \equiv \lambda s. \ (x, s)$.

Basic Operations


Formally, a monad $M$ is a type constructor with two operations: $\texttt{return} :: \alpha \Rightarrow M \ \alpha$ and $\texttt{bind} :: M \ \alpha \Rightarrow (\alpha \Rightarrow M \ \beta) \Rightarrow M \ \beta$.

Also, a >>= b is infix notation for bind a b, and a >>= (\x. b x) is often written as do { x <- a; b x }.

The laws are:

State Monad With Failure

$s \Rightarrow ((\alpha \times s) \times \texttt{bool})$

Nondeterministic State Monad With Failure

$s \Rightarrow ((\alpha \times s) \ \texttt{set} \times \texttt{bool})$

Where computations return a set of possible results.

C Verification


The syntax apply (wp <extra-wp-rules>) is a tactic for automatic application of weakest precondition rules. When used with AutoCorres, it allows for automated reasoning about C programs.


AutoCorres reduces the pain in reasoning about C code. It is self-certifying, in that it produces Isabelle theorems proving its own correctness.

Word Abstraction

Word abstraction converts a C integer into an Isabelle integer, and a C unsigned into an Isabelle natural number. Guards are inserted to ensure absence of unsigned underflow and overflow.


General Schema

lemma name: "<goal>"
  apply <method>
  apply <method>

Sequential application of methods until all subgoals are solved.


theory MyTh
import ImpThy1 ... ImpThyn


Intro Rules

Decompose formulae to the right of $\implies$.

Elimination Rules

Decompose formulae on the left of $\implies$.


Instantiating Rules

Renaming Parameters

Forward Proofs