Written by Luka Kerr on November 23, 2019


Dafny is a Microsoft research project that acts as a static verifier. It translates to Boogie, the backend, which interacts with Z3, a SMT solver.

method Main() {
  assert 1 == 1;
  print "hello, world\n";


Dafny can prove your code is correct, meaning it can:


Dafny has various inbuild types:


Methods in Dafny are like functions or procedures in conventional languages. Input parameters and output arguments are typed as follows:

method Add(x: int, y: int) returns (sum: int)
  sum := x + y;

Input parameters are immutable. A return statement is not necessary in a method, but if there is one, it need not have arguments or it must have all the return arguments of the method.


Functions in Dafny are only used in verification. They are side effect free and consist of a single expression that yields a value of some type.

function One(): int

Function Methods

Function methods combine functions and methods, allowing functions to be used in verification and implementation.

function method One(): int


A predicate is used for verification only. It is a special type of function in that it returns a bool only.

predicate Sorted(a: array<int>)
reads a
  forall j, k :: 0 <= j < k < a.Length ==> a[j] <= a[k]


The main method takes no input parameters and is the entry point of the program.



Dafny has various built in operations, including:

method Square(x: int) returns (y: int)
requires x >= 0
ensures y == x*x
  y := x*x;
  assert y > x;


In Dafny, arrays are mutable data structures and are stored on the heap. The length of an array a can be accessed with a.Length, and array indexes range from 0 to a.Length - 1.

Arrays are created with the new keyword: var a: array<char> := new char[5];.


To minimise the verifier’s work, a modifies keyword can be used to tell the verifier that an array will be modified.


If a function has an array input argument, then a reads statement is required.


A set set<T> is an orderless collection of elements that is immutable.

There are various operations that can be performed on sets

var s1: set<int> := {4, 5, 6};
var s2: set<int> := {1, 2, 3};

s1 != s2   // not equal to comparison
s1 == s2   // equal to comparison
s1 + s2    // set union
s1 * s2    // set intersection
s1 - s2    // set difference
s1 <= s2   // s1 subset of s2
s1 < s2    // s2 proper subset of s2
5 in s1    // in membership
5 !in s2   // not in membership
|s1|       // size of s1


A sequence seq<T> is an ordered set of elements that is immutable.

There are various operations that can be performed on sequences

var s1: seq<int> := [4, 5, 6];
var s2: seq<int> := [1, 2, 3];

s1 != s2    // not equal to comparison
s1 == s2    // equal to comparison
s1 + s2     // sequence union
s1[0..n]    // slice of s1 from 0 to n
5 in s1     // in membership
5 !in s2    // not in membership
|s1|        // size of s1
s1[i := v]  // update s1 with s1[i] := v



The quantifier forall is generally used on finite data structures such as arrays.

The syntax is forall k :: 0 <= k < a.Length ==> a[k] != 7. It is comparable to $\forall$ in logic.


The quantifier exists is similar to forall.

The syntax is exists k :: 0 <= k < a.Length && a[k] != 7. It is comparable to $\exists$ in logic.


The Hoare triple $\{ P \} \ LOOP \ \{ Q \}$ represents a loop. An invariant $I$ is a predicate expression that defines what does not change in a loop.

assert P;    // P is true on entry
while B
invariant I
assert Q;    // Q is true on exit

We see that


Arrays are mutable and may change in a method. In a post-condition or assert, you often want to specify this change to the array.

The method old() takes an expression and returns the value that expression had on entry to the method. This can be used to check whether values have changed.


Dafny has the notion of a frame. A method’s frame is the external data that the method is allowed to modify. A function’s frame is the external data that the function is allowed to read.

A fresh() clause in the postcondition tells calling methods that an object is new (and hence can be trusted).


The assume statement allows for a logical proposition to be specified that Dafny may assume to be true without proof. If in fact the proposition is not true this may lead to invalid conclusions.

Using an assume statement disables compilation. Assuming true something that Dafny knows is true, does not affect verification, but assuming true something that Dafny knows is false, makes verification meaningless.


Dafny can prove a program will terminate. To prove terminate, Dafny needs a variant, which lets Dafny prove progress and boundedness. The variant must decrease towards a bound.


A variant is given by the keyword decreases.

while i < n
decreases n - i
invariant 0 <= i <= n
  i := i + 1;
function func(i: int): int
decreases i
  func(i - 1);


Classes are needed to define complex models. They define dynamically allocatable, hidden, mutable data structures.


Dafny offers many class/method attributes. Many are directives to the underlying theorem prover, some affect the performance of Dafny directly.


Classes have both static and instance methods.

A static method

An instance method


Class constructors are optional in Dafny.

class SomeClass {


Sometimes there are steps of logic required to prove a program correct, but they are too complex for Dafny to discover and use on its own. When this happens, we can often give Dafny assistance by providing a lemma. This is done by declaring a method with the lemma keyword. Lemmas are implicitly ghost methods and the ghost keyword cannot be applied to them.

The ensures clause is used to express a property. Lemmas cannot change state but can use local variables, and cannot have side effects or return parameters.

The general form of a lemma is

lemma SomeName(x: T)
requires P(x)
ensures Q(x)

If the lemma verifies then we must have forall x: T :: P(x) ==> Q(x).


Ghosts are entities that are used in verification only.


Classes can have class invariants, which are properties that are maintained throughout the lifecycle of the class.

Conventionally a class invariant is named Valid() where


When using the reads and modifies keywords, we are determining what data is ‘visible’, i.e. readable and writeable. These visible variable are referred to as a frame.

The reads/modifies clauses must be dynamic, that is they must refer to the same data (objects) that are created at runtime.

Dynamic Frames

Dafny uses a idiom that uses sets to make reads/modifies anonymous. A dynamic frame is a ghost variable commonly called Repr, and has the type set<object>.

Dynamic frames are used in reads/modifies clauses to declare objects that can be read/modified, but without naming them specifically.

Inductive Datatypes

Simple datatypes like integers are limited. They do not provide types for complex recursively-defined, data structures, and don’t give users the ability create their own datatype.

An inductive datatype is a type whose values are created using type constructors.

datatype Colour = Red | White | Blue
datatype Tree = Empty | Node(int, Tree, Tree)

Generic inductive datatypes include a type parameter as well.

datatype Tree<T> = Empty | Node(T, Tree<T>, Tree<T>)

Parameters can also be named.

datatype Tree = Empty | Node(value: int, left: Tree, right: Tree)

To access the elements (constructors) of an inductive datatypes, you need to use a match statement. A match statement evaluates an expression whose type is an inductive datatype, and executes the case corresponding to the constructor that was used to create that expression.

datatype Tree = Leaf | Node(int, Tree, Tree)

predicate Search(t: Tree, key: int)
  match t
    case Leaf => false
    case Node(v, l, r) => (v == key || Search(l, key) || Search(r, key))

High Order Data Structures

There are two kinds of verification of high order data structures

Executable forall

An executable forall is comparable to a for loop in other programming languages. The general syntax is

forall x: T | R(x, y) ensures P(x, y) {

where x is bound here, and y is not. It can also have more bound variables

forall x: T, y: U | R(x, y, z) ensures P(x, y, z) {

An executable forall is actually a generalisation of universal quantification. The body is used to prove the post-condition P(x) and it must be finite.


method Search(a: array<int>, key: int) returns (i: int)
ensures 0 <= i ==> i < a.Length && a[i] == key
ensures i < 0 ==> forall k :: 0 <= k < a.Length ==> a[k] != key
  i := 0;
  while i < a.Length
  invariant 0 <= i <= a.Length
  invariant forall k :: 0 <= k < i ==> a[k] != key
    if a[i] == key {

    i := i + 1;
  i := -1;
method Search(a: array<int>, key: int) returns (i: int)
requires a != null && Sorted(a)
ensures 0 <= i ==> i < a.Length && a[i] == key
ensures i < 0 ==> forall k :: 0 <= k < a.Length ==> a[k] != key
  var low, high := 0, a.Length;
  while low < high
  invariant 0 <= low <= high <= a.Length
  invariant forall k :: 0 <= k < a.Length && !(low <= k < high) ==> a[k] != key
    var mid := (low+high)/2;
    if key > a[mid] {
      low := mid + 1;
    } else if key < a[mid] {
      high := mid;
    } else {
      return mid;
  return -1;

Sort() Using Insertion Sort

method Sort(a: array<int>)
requires a != null
requires a.Length > 1
ensures Sorted(a, 0, a.Length)
ensures multiset(a[..]) == multiset(old(a[..]))
modifies a
  var up := 1;
  while up < a.Length
  invariant 1 <= up <= a.Length
  invariant Sorted(a, 0, up)
  invariant multiset(a[..]) == multiset(old(a[..]))
    var down := up;
    while down >= 1 && a[down-1] > a[down]
    invariant 0 <= down <= up
    invariant forall i, j :: 0 <= i < j <= up && j != down ==> a[i] <= a[j]
    invariant multiset(a[..]) == multiset(old(a[..]))
      a[down-1], a[down] := a[down], a[down-1];
      down := down - 1;
    up := up + 1;

Sort() Using Quicksort

method Sort(a: array<int>, low: int, high: int)
requires a.Length >= 1
requires 0 <= low <= high <= a.Length
requires Sandwich(a, low, high)
ensures Sandwich(a, low, high) 
ensures Sorted(a, low, high) 
decreases high - low
modifies a
  if high-low > 1 {
    var pivot := partition(a, low, high);
    Sort(a, low, pivot);
    Sort(a, pivot+1, high);

predicate Sorted(a: array<int>, lo: int, hi: int)
requires 0 <= lo <= hi <= a.Length
reads a
  forall j, k :: lo <= j < k < hi ==> a[j] <= a[k]

predicate Sandwich(a: array<int>, lo: int, hi: int)
reads a
  (0 < lo <= hi <= a.Length ==> forall j :: lo <= j < hi ==> a[lo-1] <= a[j]) &&
  (0 <= lo <= hi < a.Length ==> forall j :: lo <= j < hi ==> a[j] < a[hi])

predicate PivotOrder(a: array<int>, lo: int, pivi: int, hi: int)
requires 0 <= lo <= pivi < hi <= a.Length
reads a
  (forall j :: lo <= j < pivi ==> a[j] < a[pivi]) &&
  (forall j :: pivi < j < hi ==> a[pivi] <= a[j])

method Partition(a: array<int>, low: int, high: int) returns (pivi: int)
requires a.Length > 0
requires 0 <= low < high <= a.Length
requires Sandwich(a, low, high)
ensures Sandwich(a, low, high)
ensures 0 <= low <= pivi < high <= a.Length
ensures PivotOrder(a, low, pivi, high)
modifies a
  pivi := low;
  var i := low + 1;
  while i < high
  invariant low <= pivi < i <= high
  invariant PivotOrder(a, low, pivi, i)
  invariant Sandwich(a, low, high)
    if a[i] < a[pivi] {
      var stor := a[i];
      a[i] := a[i-1];
      var back := i - 1;

      while back > pivi
      invariant a[pivi] > stor
      invariant PivotOrder(a, low, pivi, i+1)
      invariant Sandwich(a, low, high)
        a[back+1] := a[back]; 
        back := back - 1;
      a[pivi+1] := a[pivi];
      a[pivi] := stor;
      pivi := pivi + 1;
    i := i + 1;


method Partition(a: array<int>) returns (pi: int)
requires a.Length > 1
ensures 0 <= i < a.Length
ensures Lower(a, pi, pi)
ensures Upper(a, pi, pi) 
ensures multiset(a[..]) == multiset(old(a[..]))
modifies a
  var last := a.Length - 1;
  var i := 0;
  var j := last;
  while i < j
  invariant 0 <= i <= j <= last
  invariant Lower(a, i, last) && Upper(a, j, last)
  invariant multiset(a[..]) == multiset(old(a[..]))
    while i < j && a[i] < a[last]
    invariant 0 <= i <= j
    invariant Lower(a, i, last)
      i := i + 1;
    while i < j && a[j] >= a[last]
    invariant i <= j <= last
    invariant Upper(a, j, last)
      j := j - 1;

    assert i <= j;
    if i < j {
      a[i], a[j] := a[j], a[i];
      i := i + 1;
  assert i == j;
  a[i], a[last] := a[last], a[i];
  return i;

predicate Lower(a: array<int>, i: int, pi: int)
reads a
requires 0 <= i < a.Length && 0 <= pi < a.Length
  forall k :: 0 <= k < i ==> a[k] < a[pi]

predicate Upper(a: array<int>, j: int, pi: int)
reads a
requires 0 <= j < a.Length && 0 <= pi < a.Length
  forall k :: j < k < a.Length - 1 ==> a[k] >= a[pi]

Software Engineering

The term software engineering was first used in a 1968 NATO conference to indicate a systematic, disciplined, quantifiable approach to the production and maintenance of software.

Problems With Software

There are 6 problems with software

  1. Complexity
  2. Conformity
  3. Flexibility
  4. Reliability
  5. Abstraction
  6. Adaptability


Software is complex because


In a world of people, computers and machines, software must conform. The environment is perceived to be fixed, software is supposed to be infinitely flexible. In a changed, new environment it can be difficult to determine what part of the software needs to change.


The perception of “being easily changed” leads to

Requests can come during or after development.


Software is considered ‘naturally’ bug-prone, unlike hardware.

The Pentium Bug

In 1995, a hardware bug in Intel Pentium processors was discovered. It affected the floating point unit (FPU) of the processor, and resulted in an incorrect result from dividing floating point numbers being returned occasionally.


You cannot see software, you see only its effects. It is impossible to completely understand large software systems, they are too complex.

Software is abstract and hard to visualise, the best you can do is use a flow chart, use case diagram or a scenario which are very narrow views of behaviour.


Continuing advances in hardware place increasing demands on software


Arianespace is the world’s leading satellite launch company, founded in 1980. It has launched over 600 satellites to date.

Ariane 5

In 1996, the Ariane 5 rocket shutdown its own guidance system 36.7 seconds into its launch.

The software in the rocket used a mix of 64 and 16 bit data. For integer data, conversion required that the input value be less than 32768, otherwise an exception would be generated.

The series of events that lead to the half a billion dollar crash are

  1. A computer in the guidance system generated large integer data that could not be converted to 16 bits, and halted
  2. A backup computer was activated but it ran the same software, had the same problem, and of course also halted
  3. The guidance system sent a diagnostic message and shut down
  4. The main computer expected to receive data, and interpreted the message text to mean the rocket was completely off course
  5. The main computer applied boosters to radically change direction
  6. The resulting forces caused the rocket to break up
  7. Observing this, the control tower sent a signal to the rocket to self destruct


Black Box Testing

White Box Testing

At a function level, white box testing can have an infinite number of test cases.

At a statement level, there are common testcase strategies that exercise different paths through the code:

In reality, statement coverage is the industry standard. All other forms of coverage are too tedious to implement.

White box testing does have some limitations, specifically:

Validation & Verification

Validation is the process of checking whether the software is consistent with the requirements.

Verification is the process of checking whether the software is correct with respect to a specification.


Black box and white box testing are examples of validation.


Code review, design analysis, static code analysis and proof of correctness are examples of verification.


The most important phases in software development are

Requirements elicitation may be more important than testing because if the product is wrong, then so is the code.

The requirements heirarchy is structured so that the highest level requirements (business needs) are the level under the root, and the lowest level requirements (programming tasks) are the leaves. The levels inbetween convert the “what” into the “how”.

Non-Functional Requirements

Non-functional requirements include performance, reliability and maintainability. They specify the operational characteristics of the system.

Generally non-functional requirements are difficult to measure or test. A good design will allow components that do not satisfy the desired operational characteristics be replaced by components that do, without changing the behaviour.


Adding priorities is very important. It means


Requirements prioritisation is often referred to as triage. It involves deciding which requirements really matter and those that can wait.

Priorities are denoted using MoSCoW:

Priority Criticality & Consistency

To determine what’s critical you need to talk to the client. There is a trade off between cost of implementation and benefit of the requirement to the business.

The priority of a requirement cannot be lower than the priority of every descendent.

If a requirement depends on another, then that other requirement cannot have a lower priority. Ideally requirements that are dependent should be placed into a single heirarchy.

Requirement Ordering

Requirements can be order by priority or by stakeholder order. Priority order is the most natural, but stakeholder order may be cleaner if the system involves stakeholders that have quite different functionality.

Use Cases

A use case is a written description of how users will perform tasks using your system. It outlines, from a user’s point of view, a system’s behavior as it responds to a request. Each use case is represented as a sequence of simple steps, beginning with a user’s goal and ending when that goal is fulfilled.

Use cases include


Depending on how much depth and complexity is required, a use case describes a combination of

Work Breakdown Structure

A Work Breakdown Structure (WBS) is a hierarchical decomposition of the work to be executed by the project team to

It is a project management technique to define and organise the total scope of a project.

It uses a hierarchical tree structure. The first level of the WBS defines a set of planned outcomes and should collectively represent 100% of the project scope. At each subsequent level, the children of a node collectively and exclusively represent 100% of the scope of their parent node. The lowest level breaks down to the task level, where a task is usually less than a days work.

Every WBS element consists of an element which contains

A well designed WBS should be more focused on planned outcomes than planned actions.

Level 1 of the WBS is the most important level. It usually determines the projects costs and schedule, and deliverables are normally outputs of the packages.


A WBS may be one or a mixture of any of the following main types

Project Schedule

A schedule is created from the WBS to ensure the project’s objective is accomplished in the time allotted. To create a schedule

Critical Path Method

The Critical Path Method (CPM) is the sequence of scheduled activities that determines the duration of the project. It is the longest sequence of tasks that must be completed on time to meet the project deadline.

If there is a delay in any task on the critical path, then the whole project will be delayed. Most projects have only one critical path, some have multiple critical paths.

The CPM constructs a model of the project that includes

The CPM calculates the