Written by Luka Kerr on November 23, 2019
Dafny
Dafny is a Microsoft research project that acts as a static verifier. It translates to Boogie, the backend, which interacts with Z3, a SMT solver.
method Main() {
assert 1 == 1;
print "hello, world\n";
}
Correctness
Dafny can prove your code is correct, meaning it can:
- Guarantee array indices stay in bounds
- Guarantee no null dereferences
- Guarantee no division by zero
- Guarantee the program conforms to a specification
Types
Dafny has various inbuild types:
bool
for booleanschar
for charactersreal
,int
,nat
for real, unbounded and natural numbersset<T>
,seq<T>
for immutable finite sets and sequencesarray<T>
for arraysarray2<T>
for 2D arraysobject
is a super-type of all reference types
Methods
Methods in Dafny are like functions or procedures in conventional languages. Input parameters and output arguments are typed as follows:
method Add(x: int, y: int) returns (sum: int)
{
sum := x + y;
}
Input parameters are immutable. A return
statement is not necessary in a method, but if there is one, it need not have arguments or it must have all the return arguments of the method.
Functions
Functions in Dafny are only used in verification. They are side effect free and consist of a single expression that yields a value of some type.
function One(): int
{
1
}
Function Methods
Function methods combine functions and methods, allowing functions to be used in verification and implementation.
function method One(): int
{
1
}
Predicates
A predicate is used for verification only. It is a special type of function in that it returns a bool
only.
predicate Sorted(a: array<int>)
reads a
{
forall j, k :: 0 <= j < k < a.Length ==> a[j] <= a[k]
}
Main()
The main method takes no input parameters and is the entry point of the program.
Constructs
if ... { ... } else { ... }
for conditionals//
or/* ... */
for comments:=
for assignment==
,<=
,>=
,!=
,<
,>
for boolean operations
Operations
Dafny has various built in operations, including:
assert
that evaluates a boolean expression during verificationrequires
that declares a pre-condition by evaluating a boolean expression during verificationensures
that declares a post-condition by evaluating a boolean expression during verification
method Square(x: int) returns (y: int)
requires x >= 0
ensures y == x*x
{
y := x*x;
assert y > x;
}
Arrays
In Dafny, arrays are mutable data structures and are stored on the heap.
The length of an array a
can be accessed with a.Length
, and array indexes range from 0
to a.Length - 1
.
Arrays are created with the new
keyword: var a: array<char> := new char[5];
.
modifies
To minimise the verifier’s work, a modifies
keyword can be used to tell the verifier that an array will be modified.
reads
If a function has an array input argument, then a reads
statement is required.
Sets
A set set<T>
is an orderless collection of elements that is immutable.
There are various operations that can be performed on sets
var s1: set<int> := {4, 5, 6};
var s2: set<int> := {1, 2, 3};
s1 != s2 // not equal to comparison
s1 == s2 // equal to comparison
s1 + s2 // set union
s1 * s2 // set intersection
s1 - s2 // set difference
s1 <= s2 // s1 subset of s2
s1 < s2 // s2 proper subset of s2
5 in s1 // in membership
5 !in s2 // not in membership
|s1| // size of s1
Sequences
A sequence seq<T>
is an ordered set of elements that is immutable.
There are various operations that can be performed on sequences
var s1: seq<int> := [4, 5, 6];
var s2: seq<int> := [1, 2, 3];
s1 != s2 // not equal to comparison
s1 == s2 // equal to comparison
s1 + s2 // sequence union
s1[0..n] // slice of s1 from 0 to n
5 in s1 // in membership
5 !in s2 // not in membership
|s1| // size of s1
s1[i := v] // update s1 with s1[i] := v
Quantifiers
forall
The quantifier forall
is generally used on finite data structures such as arrays.
The syntax is forall k :: 0 <= k < a.Length ==> a[k] != 7
. It is comparable to $\forall$ in logic.
exists
The quantifier exists
is similar to forall
.
The syntax is exists k :: 0 <= k < a.Length && a[k] != 7
. It is comparable to $\exists$ in logic.
Invariants
The Hoare triple $\{ P \} \ LOOP \ \{ Q \}$ represents a loop. An invariant $I$ is a predicate expression that defines what does not change in a loop.
assert P; // P is true on entry
while B
invariant I
{
s1;
s2;
...;
}
assert Q; // Q is true on exit
We see that
- On entry to the loop $P \implies I$
- Inside the loop $\{ I \land B \} \ s_1; s_2; \dots \ \{ I \}$
- On exit from the loop $I \land \neg B \implies Q$
old()
Arrays are mutable and may change in a method. In a post-condition or assert, you often want to specify this change to the array.
The method old()
takes an expression and returns the value that expression had on entry to the method. This can be used to check whether values have changed.
fresh()
Dafny has the notion of a frame. A method’s frame is the external data that the method is allowed to modify. A function’s frame is the external data that the function is allowed to read.
A fresh()
clause in the postcondition tells calling methods that an object is new (and hence can be trusted).
assume
The assume
statement allows for a logical proposition to be specified that Dafny may assume to be true without proof. If in fact the proposition is not true this may lead to invalid conclusions.
Using an assume statement disables compilation. Assuming true something that Dafny knows is true, does not affect verification, but assuming true something that Dafny knows is false, makes verification meaningless.
Termination
Dafny can prove a program will terminate. To prove terminate, Dafny needs a variant, which lets Dafny prove progress and boundedness. The variant must decrease towards a bound.
decreases
A variant is given by the keyword decreases
.
while i < n
decreases n - i
invariant 0 <= i <= n
{
...;
i := i + 1;
}
function func(i: int): int
decreases i
{
...;
func(i - 1);
}
Classes
Classes are needed to define complex models. They define dynamically allocatable, hidden, mutable data structures.
Attributes
Dafny offers many class/method attributes. Many are directives to the underlying theorem prover, some affect the performance of Dafny directly.
:autocontracts
- Fills much of the dynamic-frames boilerplate into a class
- Requires a predicate
Valid()
- Declares a (dynamic frame) set of objects called
Repr
- Automatically inserts
requires Valid(); ensures Valid();
- Automatically inserts required
reads
andmodifies
clauses
:fuel
- Specifies how much “fuel” a function should have, i.e. how many times Z3 is permitted to unfold it’s definition
:timelimit
and:timeLimitMultiplier
- The verification time limit
Methods
Classes have both static and instance methods.
A static method
- Can be called without creating an instance of the class
- Is declared using the keyword
static
- Is invoked by using a class reference
- Will exist as a single copy for its class
- Cannot access instance methods and instance variables directly
An instance method
- Requires an object to be created before it can be called
- Is invoked by using an object reference
- Can access static variables and static methods directly
Constructors
Class constructors are optional in Dafny.
class SomeClass {
constructor()
{
...
}
}
Lemmas
Sometimes there are steps of logic required to prove a program correct, but they are too complex for Dafny to discover and use on its own. When this happens, we can often give Dafny assistance by providing a lemma. This is done by declaring a method with the lemma
keyword. Lemmas are implicitly ghost methods and the ghost keyword cannot be applied to them.
The ensures clause is used to express a property. Lemmas cannot change state but can use local variables, and cannot have side effects or return parameters.
The general form of a lemma is
lemma SomeName(x: T)
requires P(x)
ensures Q(x)
{
body
}
If the lemma verifies then we must have forall x: T :: P(x) ==> Q(x)
.
Ghosts
Ghosts are entities that are used in verification only.
- Functions and predicates are ghosts by default
- Lemmas are ghost methods
- Variabls can be ghosts
Invariants
Classes can have class invariants, which are properties that are maintained throughout the lifecycle of the class.
Conventionally a class invariant is named Valid()
where
- Every method must maintain
Valid()
- It appears in the precondition and postcondition of every method
- A constructor must establish
Valid()
- It appears in the postcondition of the constructor
Frames
When using the reads
and modifies
keywords, we are determining what data is ‘visible’, i.e. readable and writeable. These visible variable are referred to as a frame.
The reads
/modifies
clauses must be dynamic, that is they must refer to the same data (objects) that are created at runtime.
Dynamic Frames
Dafny uses a idiom that uses sets to make reads
/modifies
anonymous. A dynamic frame is a ghost variable commonly called Repr
, and has the type set<object>
.
Dynamic frames are used in reads
/modifies
clauses to declare objects that can be read/modified, but without naming them specifically.
Inductive Datatypes
Simple datatypes like integers are limited. They do not provide types for complex recursively-defined, data structures, and don’t give users the ability create their own datatype.
An inductive datatype is a type whose values are created using type constructors.
datatype Colour = Red | White | Blue
datatype Tree = Empty | Node(int, Tree, Tree)
Generic inductive datatypes include a type parameter as well.
datatype Tree<T> = Empty | Node(T, Tree<T>, Tree<T>)
Parameters can also be named.
datatype Tree = Empty | Node(value: int, left: Tree, right: Tree)
To access the elements (constructors) of an inductive datatypes, you need to use a match
statement. A match statement evaluates an expression whose type is an inductive datatype, and executes the case corresponding to the constructor that was used to create that
expression.
datatype Tree = Leaf | Node(int, Tree, Tree)
predicate Search(t: Tree, key: int)
{
match t
{
case Leaf => false
case Node(v, l, r) => (v == key || Search(l, key) || Search(r, key))
}
}
High Order Data Structures
There are two kinds of verification of high order data structures
- Program verification on the data structure code
- Treats the data structure operations as program code
- Verify operations using pre and post-conditions
Valid()
defines ‘correct’ actual data
- Ghost verification on a ghost data structure, typically called
shadow
- A ghost data structure ‘shadows’ the implementation data structure
- Verify operations using pre and post-conditions on the
shadow
- As before,
Valid()
defines ‘correct’ actual data
Executable forall
An executable forall
is comparable to a for loop in other programming languages. The general syntax is
forall x: T | R(x, y) ensures P(x, y) {
body
}
where x
is bound here, and y
is not. It can also have more bound variables
forall x: T, y: U | R(x, y, z) ensures P(x, y, z) {
body
}
An executable forall
is actually a generalisation of universal quantification. The body is used to prove the post-condition P(x)
and it must be finite.
Algorithms
Search()
Using Linear Search
method Search(a: array<int>, key: int) returns (i: int)
ensures 0 <= i ==> i < a.Length && a[i] == key
ensures i < 0 ==> forall k :: 0 <= k < a.Length ==> a[k] != key
{
i := 0;
while i < a.Length
invariant 0 <= i <= a.Length
invariant forall k :: 0 <= k < i ==> a[k] != key
{
if a[i] == key {
return;
}
i := i + 1;
}
i := -1;
}
Search()
Using Binary Search
method Search(a: array<int>, key: int) returns (i: int)
requires a != null && Sorted(a)
ensures 0 <= i ==> i < a.Length && a[i] == key
ensures i < 0 ==> forall k :: 0 <= k < a.Length ==> a[k] != key
{
var low, high := 0, a.Length;
while low < high
invariant 0 <= low <= high <= a.Length
invariant forall k :: 0 <= k < a.Length && !(low <= k < high) ==> a[k] != key
{
var mid := (low+high)/2;
if key > a[mid] {
low := mid + 1;
} else if key < a[mid] {
high := mid;
} else {
return mid;
}
}
return -1;
}
Sort()
Using Insertion Sort
method Sort(a: array<int>)
requires a != null
requires a.Length > 1
ensures Sorted(a, 0, a.Length)
ensures multiset(a[..]) == multiset(old(a[..]))
modifies a
{
var up := 1;
while up < a.Length
invariant 1 <= up <= a.Length
invariant Sorted(a, 0, up)
invariant multiset(a[..]) == multiset(old(a[..]))
{
var down := up;
while down >= 1 && a[down-1] > a[down]
invariant 0 <= down <= up
invariant forall i, j :: 0 <= i < j <= up && j != down ==> a[i] <= a[j]
invariant multiset(a[..]) == multiset(old(a[..]))
{
a[down-1], a[down] := a[down], a[down-1];
down := down - 1;
}
up := up + 1;
}
}
Sort()
Using Quicksort
method Sort(a: array<int>, low: int, high: int)
requires a.Length >= 1
requires 0 <= low <= high <= a.Length
requires Sandwich(a, low, high)
ensures Sandwich(a, low, high)
ensures Sorted(a, low, high)
decreases high - low
modifies a
{
if high-low > 1 {
var pivot := partition(a, low, high);
Sort(a, low, pivot);
Sort(a, pivot+1, high);
}
}
predicate Sorted(a: array<int>, lo: int, hi: int)
requires 0 <= lo <= hi <= a.Length
reads a
{
forall j, k :: lo <= j < k < hi ==> a[j] <= a[k]
}
predicate Sandwich(a: array<int>, lo: int, hi: int)
reads a
{
(0 < lo <= hi <= a.Length ==> forall j :: lo <= j < hi ==> a[lo-1] <= a[j]) &&
(0 <= lo <= hi < a.Length ==> forall j :: lo <= j < hi ==> a[j] < a[hi])
}
predicate PivotOrder(a: array<int>, lo: int, pivi: int, hi: int)
requires 0 <= lo <= pivi < hi <= a.Length
reads a
{
(forall j :: lo <= j < pivi ==> a[j] < a[pivi]) &&
(forall j :: pivi < j < hi ==> a[pivi] <= a[j])
}
method Partition(a: array<int>, low: int, high: int) returns (pivi: int)
requires a.Length > 0
requires 0 <= low < high <= a.Length
requires Sandwich(a, low, high)
ensures Sandwich(a, low, high)
ensures 0 <= low <= pivi < high <= a.Length
ensures PivotOrder(a, low, pivi, high)
modifies a
{
pivi := low;
var i := low + 1;
while i < high
invariant low <= pivi < i <= high
invariant PivotOrder(a, low, pivi, i)
invariant Sandwich(a, low, high)
{
if a[i] < a[pivi] {
var stor := a[i];
a[i] := a[i-1];
var back := i - 1;
while back > pivi
invariant a[pivi] > stor
invariant PivotOrder(a, low, pivi, i+1)
invariant Sandwich(a, low, high)
{
a[back+1] := a[back];
back := back - 1;
}
a[pivi+1] := a[pivi];
a[pivi] := stor;
pivi := pivi + 1;
}
i := i + 1;
}
}
Partition()
method Partition(a: array<int>) returns (pi: int)
requires a.Length > 1
ensures 0 <= i < a.Length
ensures Lower(a, pi, pi)
ensures Upper(a, pi, pi)
ensures multiset(a[..]) == multiset(old(a[..]))
modifies a
{
var last := a.Length - 1;
var i := 0;
var j := last;
while i < j
invariant 0 <= i <= j <= last
invariant Lower(a, i, last) && Upper(a, j, last)
invariant multiset(a[..]) == multiset(old(a[..]))
{
while i < j && a[i] < a[last]
invariant 0 <= i <= j
invariant Lower(a, i, last)
{
i := i + 1;
}
while i < j && a[j] >= a[last]
invariant i <= j <= last
invariant Upper(a, j, last)
{
j := j - 1;
}
assert i <= j;
if i < j {
a[i], a[j] := a[j], a[i];
i := i + 1;
}
}
assert i == j;
a[i], a[last] := a[last], a[i];
return i;
}
predicate Lower(a: array<int>, i: int, pi: int)
reads a
requires 0 <= i < a.Length && 0 <= pi < a.Length
{
forall k :: 0 <= k < i ==> a[k] < a[pi]
}
predicate Upper(a: array<int>, j: int, pi: int)
reads a
requires 0 <= j < a.Length && 0 <= pi < a.Length
{
forall k :: j < k < a.Length - 1 ==> a[k] >= a[pi]
}
Software Engineering
The term software engineering was first used in a 1968 NATO conference to indicate a systematic, disciplined, quantifiable approach to the production and maintenance of software.
Problems With Software
There are 6 problems with software
- Complexity
- Conformity
- Flexibility
- Reliability
- Abstraction
- Adaptability
Complexity
Software is complex because
- Irregularity
- Infinite state space
- Discrete state space
- No physical laws of abstraction
- Very difficult to abstract behaviour of software as it is conceptual
Conformity
In a world of people, computers and machines, software must conform. The environment is perceived to be fixed, software is supposed to be infinitely flexible. In a changed, new environment it can be difficult to determine what part of the software needs to change.
Flexibility
The perception of “being easily changed” leads to
- Frequent request to add new features
- Frequent request to support new hardware
Requests can come during or after development.
Reliability
Software is considered ‘naturally’ bug-prone, unlike hardware.
The Pentium Bug
In 1995, a hardware bug in Intel Pentium processors was discovered. It affected the floating point unit (FPU) of the processor, and resulted in an incorrect result from dividing floating point numbers being returned occasionally.
Abstraction
You cannot see software, you see only its effects. It is impossible to completely understand large software systems, they are too complex.
Software is abstract and hard to visualise, the best you can do is use a flow chart, use case diagram or a scenario which are very narrow views of behaviour.
Adaptability
Continuing advances in hardware place increasing demands on software
Arianespace
Arianespace is the world’s leading satellite launch company, founded in 1980. It has launched over 600 satellites to date.
Ariane 5
In 1996, the Ariane 5 rocket shutdown its own guidance system 36.7 seconds into its launch.
The software in the rocket used a mix of 64 and 16 bit data. For integer data, conversion required that the input value be less than 32768, otherwise an exception would be generated.
The series of events that lead to the half a billion dollar crash are
- A computer in the guidance system generated large integer data that could not be converted to 16 bits, and halted
- A backup computer was activated but it ran the same software, had the same problem, and of course also halted
- The guidance system sent a diagnostic message and shut down
- The main computer expected to receive data, and interpreted the message text to mean the rocket was completely off course
- The main computer applied boosters to radically change direction
- The resulting forces caused the rocket to break up
- Observing this, the control tower sent a signal to the rocket to self destruct
Testing
Black Box Testing
- Tests the code against the requirements
- Tests the behaviour of the code
- Must always be done
White Box Testing
- Makes sure the code is correct (i.e. debugging, verifying)
- Should always be done
- Is never finished
- A hit and miss technique in searchnig for faults
- Very time consuming
- Inefficient, as changes to the code possible break all previous checks
At a function level, white box testing can have an infinite number of test cases.
At a statement level, there are common testcase strategies that exercise different paths through the code:
- Statement coverage
- Each statement is executed at least once
- Branch coverage (or decision coverage)
- Each branch in flow-of-control is executed at least once
- Predicate coverage (or condition coverage)
- Every truth value of every predicate is tested
In reality, statement coverage is the industry standard. All other forms of coverage are too tedious to implement.
White box testing does have some limitations, specifically:
- Correctness is defined by the output of testcases
- Individual input values must be well chosen
- It is data driven, the code is not really checked
- It is tedious
Validation & Verification
Validation is the process of checking whether the software is consistent with the requirements.
Verification is the process of checking whether the software is correct with respect to a specification.
Validation
- Always involves execution because it is about the product
- Test the product is doing the ‘right thing’
- Subjective at it deals with (human formulated) requirements
- A high level exercise
- Generally done by a dedicated test team
Black box and white box testing are examples of validation.
Verification
- Can verify a document, a design, code
- Does not involve execution of code
- Considered low level
- An objective, clinical process
- Generally done by a QA team
Code review, design analysis, static code analysis and proof of correctness are examples of verification.
Requirements
The most important phases in software development are
- Requirements elicitation
- Getting the right product
- Verification and testing
- Getting the code right
Requirements elicitation may be more important than testing because if the product is wrong, then so is the code.
The requirements heirarchy is structured so that the highest level requirements (business needs) are the level under the root, and the lowest level requirements (programming tasks) are the leaves. The levels inbetween convert the “what” into the “how”.
Non-Functional Requirements
Non-functional requirements include performance, reliability and maintainability. They specify the operational characteristics of the system.
Generally non-functional requirements are difficult to measure or test. A good design will allow components that do not satisfy the desired operational characteristics be replaced by components that do, without changing the behaviour.
Prioritisation
Adding priorities is very important. It means
- You think about the available time
- You think about layering and scoping
- You think about versioning
Triage
Requirements prioritisation is often referred to as triage. It involves deciding which requirements really matter and those that can wait.
Priorities are denoted using MoSCoW:
- Must have requirements (Priority 1)
- Critical; without which the project will fail
- Should have requirements (Priority 2)
- Important; implement if there is time, or wait until new version
- Could have requirements (Priority 3)
- Desirable; worth doing if there is time
- Won’t have (but would be nice) requirements (Priority 4)
- Unlikely to be included but may be in a future release
- A chance to show the direction the system could grow
Priority Criticality & Consistency
To determine what’s critical you need to talk to the client. There is a trade off between cost of implementation and benefit of the requirement to the business.
The priority of a requirement cannot be lower than the priority of every descendent.
If a requirement depends on another, then that other requirement cannot have a lower priority. Ideally requirements that are dependent should be placed into a single heirarchy.
Requirement Ordering
Requirements can be order by priority or by stakeholder order. Priority order is the most natural, but stakeholder order may be cleaner if the system involves stakeholders that have quite different functionality.
Use Cases
A use case is a written description of how users will perform tasks using your system. It outlines, from a user’s point of view, a system’s behavior as it responds to a request. Each use case is represented as a sequence of simple steps, beginning with a user’s goal and ending when that goal is fulfilled.
Use cases include
- Who is using the system
- What the user want to do
- The user’s goal
- The steps the user takes to accomplish a particular task
- How the system should respond to an action
Elements
Depending on how much depth and complexity is required, a use case describes a combination of
- Actor(s): anyone or anything that performs a behavior using the system
- Stakeholder: someone or something with vested interests in the behavior of the system
- Primary Actor: stakeholder who initiates an interaction with the system to achieve a goal
- Preconditions: what must be true or happen before the use case runs
- Postconditions: what must be true or happen after the use case runs
- Triggers: the event that causes the use case to be initiated
- Basic Flow: use case in which nothing goes wrong
- Alternative Flow(s): variations of the basic flow
Work Breakdown Structure
A Work Breakdown Structure (WBS) is a hierarchical decomposition of the work to be executed by the project team to
- Accomplish the project objectives
- Create the required deliverables
- Identify the work packages
- Show all the deliverables
- Track costs, progress and performance
- Defines responsibilities
- Identifies where coordination is required
It is a project management technique to define and organise the total scope of a project.
It uses a hierarchical tree structure. The first level of the WBS defines a set of planned outcomes and should collectively represent 100% of the project scope. At each subsequent level, the children of a node collectively and exclusively represent 100% of the scope of their parent node. The lowest level breaks down to the task level, where a task is usually less than a days work.
Every WBS element consists of an element which contains
- Scope of work, including any deliverables
- Budget
- Responsible person’s name
A well designed WBS should be more focused on planned outcomes than planned actions.
Level 1 of the WBS is the most important level. It usually determines the projects costs and schedule, and deliverables are normally outputs of the packages.
Types
A WBS may be one or a mixture of any of the following main types
- Deliverable oriented WBS
- Defines project work in terms of (the components that make up) deliverables
- Labels are objects/artifacts
- Phased WBS
- Breaks the project into major phases
- End of a phase usually indicated by a deliverable
- Organizational WBS
- Breaks the work into departments/areas of responsibility
- Task oriented WBS
- Breaks the project into the tasks/actions required to produce each deliverable
- Labels are actions
Project Schedule
A schedule is created from the WBS to ensure the project’s objective is accomplished in the time allotted. To create a schedule
- Determine the costs of all resources each activity requires
- Next, determine the critical path (the shortest period in which the project can be completed)
- Then, assign start and end dates to each work package
Critical Path Method
The Critical Path Method (CPM) is the sequence of scheduled activities that determines the duration of the project. It is the longest sequence of tasks that must be completed on time to meet the project deadline.
If there is a delay in any task on the critical path, then the whole project will be delayed. Most projects have only one critical path, some have multiple critical paths.
The CPM constructs a model of the project that includes
- A list of all activities required to complete the project (taken from the WBS)
- The time that each activity will take to complete
- The dependencies between the activities
The CPM calculates the
- Longest duration path of activities to completion
- Earliest and latest that each activity can start and finish without making the project longer
- ‘Critical’ activies (activities on the longest path)