Written by Luka Kerr on July 31, 2020

## Introduction

### Concurrency

**Concurrency** is an abstraction for the programmer, allowing programs to be structured as multiple *threads of control*, called *processes*. These processes may communicate in various ways.

Concurrency is **not** parallelism, which is a means to exploit multiprocessing hardware in order to improve performance.

We could consider a sequential program (a process or thread) as a sequence (or total order) of actions:

\[\bullet \to \bullet \to \bullet \to \bullet \to \bullet \to \bullet \to \cdots\]

A concurrent program is not a total order of actions but a *partial order*.

This means that there are now multiple possible interleavings of these actions: our program is non-deterministic, with the choice of interleaving left to the execution model.

#### Execution Models

- Multithreaded execution
  - Threads share resources (CPU, caches, memory)
- Parallel multiprocessor execution
  - Threads have their own CPU and caches, but share main memory via a bus
- Parallel distributed execution
  - Threads have their own resources but synchronise via the network

#### Synchronisation

All processes/threads need to communicate, regardless of their execution model. There are two main types of synchronisation:

- Shared variables
  - Typically used on single-computer execution models
- Message passing
  - Typically used on distributed execution models

## Reasoning & Semantics

### Reasoning

Sequential program reasoning is usually done with a proof calculus like Hoare Logic:

\[\{ \varphi \} \ P \ \{ \psi \}\]

This notation means that if the program $P$ starts in a state satisfying the pre-condition $\varphi$ and it terminates, it will end in a state satisfying the post-condition $\psi$.

### Semantics

A **behaviour** is an infinite sequence of states; the set of all behaviours is $\Sigma^\omega$.

If we consider a property to be a set of behaviours, then a program $P$ meets a specification property $S$ iff $\llbracket P \rrbracket \subseteq S$.

#### Properties

A linear temporal property is a set of behaviours.

A **safety** property states that something bad does not happen. These are properties that may be violated by a *finite prefix* of a behaviour.

A **liveness** property states that something good will happen. These are properties that can always be satisfied *eventually*.

#### Metric For Behaviours

The distance $d(\sigma, \rho) \in \mathbb{R}_{\ge 0}$ between two behaviours $\sigma$ and $\rho$ is defined as

\[d(\sigma, \rho) = 2^{-\sup \{ i \in \mathbb{N} \ \mid \ \sigma|_i = \rho|_i \}}\]

where $\sigma|_i$ is the first $i$ states of $\sigma$ and $2^{-\infty} = 0$. Two behaviours are considered to be close if there is a long prefix on which they agree.
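A sketch of this metric, computed on finite prefixes standing in for (infinite) behaviours (`distance` is a hypothetical helper, not from the notes):

```python
# d(sigma, rho) = 2^(-sup{ i : sigma|_i = rho|_i }), computed here on
# finite prefixes standing in for infinite behaviours.
def distance(sigma, rho):
    agree = 0   # length of the longest common prefix
    while agree < len(sigma) and agree < len(rho) and sigma[agree] == rho[agree]:
        agree += 1
    if agree == len(sigma) == len(rho):
        return 0.0                     # stands in for 2^(-infinity)
    return 2.0 ** (-agree)
```

Two behaviours that agree on a prefix of length 2 and then diverge are at distance $2^{-2}$.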

#### Topology

A set $S$ of subsets of $U$ is called a **topology** if it contains $\emptyset$ and $U$, and is closed under union and finite intersection. Elements of $S$ are called *open* sets, and complements of open sets are called *closed*.

#### Limits & Boundaries

Consider a sequence of behaviours $\sigma_0 \sigma_1 \sigma_2 \dots$. The behaviour $\sigma_\omega$ is called a limit of this sequence if the sequence converges to $\sigma_\omega$, i.e. for any positive $\epsilon$:

\[\exists n. \forall i \ge n. d(\sigma_i, \sigma_\omega) < \epsilon\]

The limit-closure (or closure) of a set $A$, written $\overline{A}$, is the set of all limits of sequences in $A$.

A set $A$ is **limit-closed** if $\overline{A} = A$.

A set $A$ is called **dense** if $\overline{A} = \Sigma^\omega$, i.e. the closure is the space of all behaviours.

Safety properties are limit-closed, and liveness properties are dense.

##### Alpern & Schneider’s Theorem

Every property is the intersection of a safety and a liveness property.

\[P = \underbrace{\overline{P}}_{\text{closed}} \ \cap \ \underbrace{\Sigma^\omega \setminus (\overline{P} \setminus P)}_{\text{dense}}\]

## Linear Temporal Logic

### Logic

A logic is a formal language designed to express logical reasoning. Like any formal language, logics have a syntax and semantics.

#### Semantics

Semantics are a mathematical representation of the meaning of a piece of syntax. There are many ways of giving a logic semantics, but we will use *models*.

##### Propositional Logic Semantics

A model for propositional logic is a *valuation* $\mathcal{V} \subseteq \mathcal{P}$, a set of “true” atomic propositions. We can extend a valuation over an entire formula, giving us a satisfaction relation:

\[\begin{array}{lcl} \mathcal{V} \models p & \Leftrightarrow & p \in \mathcal{V} \\ \mathcal{V} \models \varphi \land \psi & \Leftrightarrow & \mathcal{V} \models \varphi \ \text{and} \ \mathcal{V} \models \psi \\ \mathcal{V} \models \neg \varphi & \Leftrightarrow & \mathcal{V} \not \models \varphi \\ \end{array}\]

We read $\mathcal{V} \models \varphi$ as $\mathcal{V}$ “satisfies” $\varphi$.

### LTL

Linear temporal logic (LTL) is a logic designed to describe linear time properties.

#### Syntax

We have normal propositional operators:

- $p \in \mathcal{P}$ is an LTL formula
- If $\varphi, \psi$ are LTL formulae, then $\varphi \land \psi$ is an LTL formula
- If $\varphi$ is an LTL formula, then $\neg \varphi$ is an LTL formula

We also have modal or temporal operators:

- If $\varphi$ is an LTL formula, then $\circ \varphi$ is an LTL formula
- If $\varphi, \psi$ are LTL formulae, then $\varphi \ \mathcal{U} \ \psi$ is an LTL formula

#### Semantics

Let $\sigma_0 \sigma_1 \sigma_2 \sigma_3 \dots$ be a behaviour. Then define notation:

- $\sigma|_0 = \sigma$
- $\sigma|_1 = \sigma_1 \sigma_2 \sigma_3 \dots$
- $\sigma|_{n + 1} = (\sigma|_1)|_n$

The models of LTL are behaviours. For atomic propositions, we just look at the first state. We often identify states with the set of atomic propositions they satisfy.

\[\begin{array}{lcl} \sigma \models p & \Leftrightarrow & p \in \sigma_0 \\ \sigma \models \varphi \land \psi & \Leftrightarrow & \sigma \models \varphi \ \text{and} \ \sigma \models \psi \\ \sigma \models \neg \varphi & \Leftrightarrow & \sigma \not \models \varphi \\ \sigma \models \circ \varphi & \Leftrightarrow & \sigma|_1 \models \varphi \\ \sigma \models \varphi \ \mathcal{U} \ \psi & \Leftrightarrow & \exists i : \sigma|_i \models \psi \land \forall j < i : \sigma|_j \models \varphi \\ \end{array}\]

We say $P \models \varphi \Leftrightarrow \forall \sigma \in \llbracket P \rrbracket. \ \sigma \models \varphi$.
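These clauses can be executed directly on *ultimately periodic* behaviours $\sigma = u v^\omega$ (finite prefix $u$, repeated loop $v$). A sketch, with hypothetical helpers `suffix` and `holds` and formulae written as nested tuples:

```python
# Executing the semantic clauses above on ultimately periodic behaviours
# sigma = prefix . loop^omega, given as a pair (prefix, loop) of tuples
# whose entries are sets of atomic propositions (identifying a state with
# the propositions it satisfies). Every distinct suffix of such a
# behaviour occurs within the first len(prefix) + len(loop) steps.

def suffix(b, n):
    """sigma|_n: the behaviour with the first n states dropped."""
    prefix, loop = b
    if n < len(prefix):
        return (prefix[n:], loop)
    k = (n - len(prefix)) % len(loop)
    return ((), loop[k:] + loop[:k])

def holds(phi, b):
    """sigma |= phi, for formulae written as nested tuples:
    ('p', a), ('not', f), ('and', f, g), ('next', f), ('until', f, g)."""
    prefix, loop = b
    op = phi[0]
    if op == 'p':
        first = prefix[0] if prefix else loop[0]
        return phi[1] in first
    if op == 'not':
        return not holds(phi[1], b)
    if op == 'and':
        return holds(phi[1], b) and holds(phi[2], b)
    if op == 'next':
        return holds(phi[1], suffix(b, 1))
    if op == 'until':
        # exists i: psi holds at i and phi holds at all j < i.
        # Scan suffixes until psi holds (success) or phi fails (failure);
        # only len(prefix) + len(loop) suffixes are distinct.
        for i in range(len(prefix) + len(loop) + 1):
            s = suffix(b, i)
            if holds(phi[2], s):
                return True
            if not holds(phi[1], s):
                return False
        return False
    raise ValueError('unknown operator: %r' % op)
```

For example, on $\sigma = \{a\}\{a\}(\{b\})^\omega$ the formula $a \ \mathcal{U} \ b$ holds.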

#### Derived Operators

The operator $\Diamond \varphi$ (“finally” or “eventually”) says that $\varphi$ will be true at some point.

The operator $\Box \varphi$ (“globally” or “always”) says that $\varphi$ is always true from now on.
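Both are standardly derived from $\mathcal{U}$:

\[\begin{array}{lcl} \Diamond \varphi & \equiv & \top \ \mathcal{U} \ \varphi \\ \Box \varphi & \equiv & \neg \Diamond \neg \varphi \\ \end{array}\]

so that $\sigma \models \Diamond \varphi \Leftrightarrow \exists i : \sigma|_i \models \varphi$ and $\sigma \models \Box \varphi \Leftrightarrow \forall i : \sigma|_i \models \varphi$.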

## Promela

Promela in brief:

- A kind of weird hybrid of C and Guarded Command Language
- Models consist of multiple processes which may be non-deterministic, and may include guards
- Supports structured control using special `if` and `do` blocks, as well as `goto`
- Variables are either global or process-local; no other scopes exist
- Variables can be of several types: `bit`, `byte`, `int` and so on, as well as *channels*
- Enumerations can be approximated with the `mtype` keyword
- Correctness claims can be expressed in many different ways

SPIN can be used to randomly simulate Promela programs as well as model check them.

### Guards

```
if
:: (n % 2 != 0); n = 1;
:: (n >= 0); n = n - 2;
:: (n % 3 == 0); n = 3;
fi
```

A boolean expression by itself forms a guard. Execution can only progress past a guard if the boolean expression evaluates to true (non-zero).

If the entire system cannot make progress, that is called deadlock. SPIN can detect deadlock in Promela programs.

### `mtype` & Looping

```
mtype = {RED, YELLOW, GREEN};

active proctype TrafficLight() {
  mtype state = GREEN;
  do
  :: (state == GREEN)  -> state = YELLOW;
  :: (state == YELLOW) -> state = RED;
  :: (state == RED)    -> state = GREEN;
  od
}
```

Non-determinism can be avoided by making guards mutually exclusive. Loops can be exited with `break`.

## Critical Sections

A **critical section** is a code segment that accesses shared variables and has to be executed as an atomic action. That is, in a group of cooperating processes, at most one process may be executing its critical section at any given time.

### Desiderata

We want to ensure two main properties and two secondary ones:

- **Mutual Exclusion**: No two processes are in their critical section at the same time
- **Eventual Entry** (or starvation-freedom): Once it enters its pre-protocol, a process will eventually be able to execute its critical section
- **Absence of Deadlock**: The system will never reach a state where no actions can be taken from any process
- **Absence of Unnecessary Delay**: If only *one* process is attempting to enter its critical section, it is not prevented from doing so

Out of the above, Eventual Entry is a liveness property and the rest are safety properties.

### Fairness

The fairness assumption means that if a process *can* always make a move, it will *eventually* be scheduled to make that move.

Let $\text{enabled}(\pi)$ and $\text{taken}(\pi)$ be predicates that are true in a state iff the action $\pi$ is enabled or taken there, respectively.

Then,

- Weak fairness for action $\pi$ is expressible as $\Box (\Box \text{enabled}(\pi) \Rightarrow \Diamond \text{taken}(\pi))$
- Strong fairness for action $\pi$ is expressible as $\Box (\Box \Diamond \text{enabled}(\pi) \Rightarrow \Diamond \text{taken}(\pi))$

## Transition Diagrams

A transition diagram is a tuple $(L, T, s, t)$ where

- $L$ is a set of locations (program counter values)
- $T$ is a set of transitions
- $s \in L$ is an entry location
- $t \in L$ is an exit location

A transition is written as $\ell_i \xrightarrow{g;f} \ell_j$ where:

- $\ell_i$ and $\ell_j$ are locations
- $g$ is a guard $\Sigma \to \mathbb{B}$
- $f$ is a state update $\Sigma \to \Sigma$

### Floyd Verification

The definition of a Hoare triple for partial correctness is $\{ \varphi \} \ P \ \{ \psi \}$ which states that if the program $P$ successfully executes from a starting state satisfying $\varphi$, the result state will satisfy $\psi$. This is a *safety* property.

Given a transition diagram $(L, T, s, t)$, we can verify partial correctness by

- Associate with each location $\ell \in L$ an *assertion* $\mathcal{Q}(\ell) : \Sigma \to \mathbb{B}$
- Prove that this assertion network is *inductive*, that is, for each transition $\ell_i \xrightarrow{g;f} \ell_j$ in $T$, show that $\mathcal{Q}(\ell_i) \land g \Rightarrow \mathcal{Q}(\ell_j) \circ f$
- Show that $\varphi \Rightarrow \mathcal{Q}(s)$ and $\mathcal{Q}(t) \Rightarrow \psi$

### Parallel Composition

Given two processes $P$ and $Q$ with transition diagrams $(L_P, T_P, s_P, t_P)$ and $(L_Q, T_Q, s_Q, t_Q)$ respectively, the **parallel composition** of $P$ and $Q$, written $P\ \|\ Q$ is defined as $(L, T, s, t)$ where:

- $L = L_P \times L_Q$
- $s = s_P s_Q$
- $t = t_P t_Q$
- $p_i q_i \xrightarrow{g;f} p_j q_i \in T$ if $p_i \xrightarrow{g;f} p_j \in T_P$
- $p_i q_i \xrightarrow{g;f} p_i q_j \in T$ if $q_i \xrightarrow{g;f} q_j \in T_Q$

The problem with using the Floyd method is that the number of locations and transitions grows exponentially as the number of processes increases.

### Owicki-Gries Method

To show $\{ \varphi \} \ P\ \|\ Q \ \{ \psi \}$:

- Define local assertion networks $\mathcal{P}$ and $\mathcal{Q}$ for both processes and show that they’re inductive
- For each location $p \in L_P$, show that $\mathcal{P}(p)$ is not falsified by any transition of $Q$, that is, for each $q \xrightarrow{g;f} q^\prime \in T_Q : \mathcal{P}(p) \land \mathcal{Q}(q) \land g \Rightarrow \mathcal{P}(p) \circ f$
- Vice versa for $\mathcal{Q}$
- Show that $\varphi \Rightarrow \mathcal{P}(s_P) \land \mathcal{Q}(s_Q)$ and $\mathcal{P}(t_P) \land \mathcal{Q}(t_Q) \Rightarrow \psi$

The Owicki-Gries method generalises to $n$ processes just by requiring more interference freedom obligations.

#### Derived Assertion Network

The automatic assertion network we get for the parallel composition from the Owicki-Gries method is the conjunction of the local assertions at each of the component states.

### Invariants

If we have an assertion network where every assertion is the same, then we don’t need to prove interference freedom — the local verification conditions already show that the invariant is preserved. This is known as having an **invariant**.

## Critical Section Algorithms

**Linear waiting** is the property that says the number of times a process is “overtaken” (bypassed) in the preprotocol is bounded by $n$ (the number of processes).

### Peterson’s Algorithm

This is the $n$-process generalisation of Peterson's algorithm; the code shown is for process $i$.

```
integer array in[1..n] = [0, ..., 0]
integer array last[1..n] = [0, ..., 0]

loop forever
p1:   non-critical section
      for j from 1 to n - 1
p2:     in[i] = j
p3:     last[j] = i
        for all processes k != i
p4:       await in[k] < j or last[j] != i
p5:   critical section
p6:   in[i] = 0
```
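Mutual exclusion claims like this can be checked exhaustively. As a sketch of what a model checker does (this models the classic *two-process* Peterson variant with a `flag` array and a `turn` variable, not the $n$-process code above), enumerate all interleavings and check that both processes are never at the critical-section location at once:

```python
from collections import deque

# State: (program counters, flags, turn). Locations per process:
# 0 = non-critical, 1 = set turn, 2 = await guard, 3 = critical section.
def transitions(state, i):
    pc, flag, turn = list(state[0]), list(state[1]), state[2]
    j = 1 - i
    succs = []
    if pc[i] == 0:                    # raise own flag
        flag[i] = True
        pc[i] = 1
        succs.append((tuple(pc), tuple(flag), turn))
    elif pc[i] == 1:                  # defer to the other process
        pc[i] = 2
        succs.append((tuple(pc), tuple(flag), j))
    elif pc[i] == 2:                  # guarded entry to the CS
        if not flag[j] or turn == i:
            pc[i] = 3
            succs.append((tuple(pc), tuple(flag), turn))
    else:                             # pc[i] == 3: leave CS, lower flag
        flag[i] = False
        pc[i] = 0
        succs.append((tuple(pc), tuple(flag), turn))
    return succs

def check_mutex():
    """Breadth-first exploration of every interleaving."""
    init = ((0, 0), (False, False), 0)
    seen, work = {init}, deque([init])
    while work:
        s = work.popleft()
        if s[0] == (3, 3):            # both processes in their CS
            return False
        for i in (0, 1):
            for t in transitions(s, i):
                if t not in seen:
                    seen.add(t)
                    work.append(t)
    return True
```

Running `check_mutex()` explores the full (tiny) state space and confirms mutual exclusion.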

### Bakery Algorithm

```
boolean array choosing[1..n] = [false, ..., false]
integer array number[1..n] = [0, ..., 0]

loop forever
p1:   non-critical section
p2:   choosing[i] = true
p3:   number[i] = 1 + max(number)
p4:   choosing[i] = false
p5:   for all other processes j
p6:     await choosing[j] == false
p7:     await (number[j] == 0) or ((number[i], i) < (number[j], j))
p8:   critical section
p9:   number[i] = 0
```

### Fast Algorithm

```
integer gate1 = 0
integer gate2 = 0
boolean wantp = false
boolean wantq = false
p | q
----------------------------------------------------------------
p1: gate1 = p                  q1: gate1 = q
    wantp = true                   wantq = true
p2: if gate2 != 0              q2: if gate2 != 0
      wantp = false                  wantq = false
      goto p1                        goto q1
p3: gate2 = p                  q3: gate2 = q
p4: if gate1 != p              q4: if gate1 != q
      wantp = false                  wantq = false
      await wantq == false           await wantp == false
p5: if gate2 != p              q5: if gate2 != q
      goto p1                        goto q1
    else                           else
      wantp = true                   wantq = true
    critical section               critical section
p6: gate2 = 0                  q6: gate2 = 0
    wantp = false                  wantq = false
```

### Szymanski’s Algorithm

None of the mutual exclusion algorithms presented so far scores full marks. Selected problems:

- Has an $\mathcal{O}(n^2)$ pre-protocol (Peterson)
- Relies on special instructions (`xc`, `ts`, etc.)
- Uses unbounded ticket numbers (Bakery)
- Sacrifices eventual entry (Fast)

Szymanski’s Algorithm doesn’t have any of these problems: it enforces linear waiting, requires at most $4p - \lceil \frac{p}{n} \rceil$ writes for $p$ CS entries by $n$ competing processes, and can be made immune to process failures and restarts, as well as to read errors occurring during writes.

#### Idea

> The prologue is modeled after a waiting room with two doors. […] All processes requesting entry to the CS at roughly the same time gather first in the waiting room. Then, when there are no more processes requesting entry, waiting processes move to the end of the prologue. From there, one by one, they enter their CS. Any other process requesting entry to its CS at that time has to wait in the initial part of the prologue (before the waiting room).
>
> — Szymanski, 1988, in ICCS

#### Promela Pseudocode

```
byte flag[N] = 0;

active [N] proctype p() {
  do
  ::
    non_critical_section();
    flag[_pid] = 1;
    max(flag[0..NPROC-1]) < 3;
    flag[_pid] = 3;
    if
    :: exists(flag[0..NPROC-1], '==1') ->
         flag[_pid] = 2;
         exists(flag[0..NPROC-1], '==4');
    :: else -> skip;
    fi;
    flag[_pid] = 4;
    max(flag[0.._pid - 1]) < 2;
    critical_section();
    forall (flag[_pid + 1..NPROC-1], '< 2 or > 3');
    flag[_pid] = 0;
  od
}
```

## Hardware Assisted Critical Sections

In a multiprocessor execution environment, reads and writes to variables initially only read from/write to the cache.

Writes to shared variables must eventually trigger a write-back to main memory over the bus.

Such a write-back causes other processors’ cached copies of the shared variable to be invalidated. Each processor must then consult main memory when reading, in order to get an up-to-date value.

The problem is that busy-waiting with an atomic test-and-set generates this invalidation traffic on every attempt, and bus bandwidth is limited by hardware.

The solution is the *test-and-test-and-set* technique: spin on ordinary (cached) reads, and only attempt the atomic test-and-set once the lock appears free.
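A sketch of the idea with Python threads. Python has no hardware test-and-set, so the hypothetical `TTASLock` below models the atomic instruction with a private lock; the point is the structure of `acquire`:

```python
import threading

class TTASLock:
    """Test-and-test-and-set spinlock sketch."""

    def __init__(self):
        self._flag = False
        self._atomic = threading.Lock()   # stands in for the atomic instruction

    def _test_and_set(self):
        with self._atomic:                # atomically: old = flag; flag = True
            old = self._flag
            self._flag = True
            return old

    def acquire(self):
        while True:
            while self._flag:             # "test": cheap reads, cache-local
                pass
            if not self._test_and_set():  # "test-and-set": try to grab it
                return                    # it was free and is now ours

    def release(self):
        self._flag = False

# hypothetical usage: protect a shared counter
lock, counter = TTASLock(), 0

def worker():
    global counter
    for _ in range(200):
        lock.acquire()
        counter += 1
        lock.release()

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
```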

## Semaphores

A semaphore can be abstractly viewed as a pair $(v, L)$ of a natural number $v$ and a set of processes $L$. The semaphore must always be initialised to some $(v, \emptyset)$.

There are two basic operations a process $p$ could perform on a semaphore $S$:

- **wait($S$)** or $P(S)$: decrements $v$ if positive, otherwise adds $p$ to $L$ and blocks $p$
- **signal($S$)** or $V(S)$: if $L \ne \emptyset$, unblocks a member of $L$, otherwise increments $v$

This type of semaphore is known as a *weak* semaphore.

A *busy-wait* semaphore is when the set $L$ is implicitly the set of (busy-)waiting processes on the integer.

### Strong Semaphores

Replace the set $L$ with a queue, such that processes are woken up in FIFO order. This guarantees linear waiting, but is of course harder to implement and potentially more expensive than our previous weak or busy-wait semaphores.

### Reasoning About Semaphores

For a semaphore $S = (v, L)$ initialised to $(k, \emptyset)$, the following invariants always hold:

- $v = k + \#signal(S) - \#wait(S)$
- $v \ge 0$

where

- $\#signal(S)$ is the number of times **signal($S$)** has successfully executed
- $\#wait(S)$ is the number of times **wait($S$)** has successfully executed

Here we define successful execution to mean that the process has proceeded to the following statement. So if a process is blocked on a **wait($S$)**, then $\#wait(S)$ will not increase until the process is unblocked.

#### Safety Properties

We observe the invariant that the number of processes in their critical section is $\#CS = \#wait(S) - \#signal(S)$.

##### Mutual Exclusion

We know:

- $v = 1 + \#signal(S) - \#wait(S)$
- $v \ge 0$
- $\#CS = \#wait(S) - \#signal(S)$

From these invariants it is possible to show that $\#CS \le 1$, i.e. mutual exclusion.
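Concretely, chaining the invariants:

\[\#CS = \#wait(S) - \#signal(S) = 1 - v \le 1 \qquad \text{since } v \ge 0\]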

##### Absence Of Deadlock

Assume that deadlock occurs by all processes being blocked on **wait**, so no process can enter its critical section ($\#CS = 0$).

All processes being blocked on **wait** means $v = 0$; but the invariants give $v = 1 + \#signal(S) - \#wait(S) = 1 - \#CS = 1$, a contradiction. So there cannot be deadlock.

#### Liveness Properties

##### Eventual Entry For $p$

To simplify things, we will prove for only two processes, $p$ and $q$.

Assume that $p$ is starved, indefinitely blocked on the **wait**. Therefore $S = (0, L)$ and $p \in L$. We know therefore, substituting into our invariants:

- $0 = 1 + \#signal(S) - \#wait(S)$
- $\#CS = \#wait(S) - \#signal(S)$

From which we can conclude that $\#CS = 1$. Therefore $q$ must be in its critical section and $L = \{ p \}$.

By a progress assumption (that processes that are able to move will do so when asked by the scheduler), we know that eventually $q$ will finish its CS and **signal($S$)**. Thus, $p$ will be unblocked, causing it to gain entry – a contradiction.

### Producer Consumer Problem

A producer process and a consumer process share access to a shared buffer of data. This buffer acts as a queue. The producer adds messages to the queue, and the consumer reads messages from the queue. If there are no messages in the queue, the consumer blocks until there are messages.

```
bounded[N] queue[T] buffer = empty queue
semaphore full = (0, {})
semaphore empty = (N, {})

      producer           |          consumer
----------------------------------------------------------
T d                          T d
loop forever                 loop forever
p1: d = produce              q1: wait(full)
p2: wait(empty)              q2: d = take(buffer)
p3: append(d, buffer)        q3: signal(empty)
p4: signal(full)             q4: consume(d)
```
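The table above can be sketched with Python threads, where `threading.Semaphore` plays the role of `full` and `empty` (a single-producer, single-consumer sketch; `results` is a hypothetical stand-in for `consume`):

```python
import threading
from collections import deque

N = 4                               # buffer capacity
buffer = deque()
full = threading.Semaphore(0)       # full  = (0, {})
empty = threading.Semaphore(N)      # empty = (N, {})
results = []

def producer(items):
    for d in items:                 # d = produce
        empty.acquire()             # wait(empty)
        buffer.append(d)            # append(d, buffer)
        full.release()              # signal(full)

def consumer(n):
    for _ in range(n):
        full.acquire()              # wait(full)
        d = buffer.popleft()        # d = take(buffer)
        empty.release()             # signal(empty)
        results.append(d)           # consume(d)

items = list(range(20))
threads = [threading.Thread(target=producer, args=(items,)),
           threading.Thread(target=consumer, args=(len(items),))]
for t in threads: t.start()
for t in threads: t.join()
```

With one producer and one consumer the FIFO buffer preserves message order.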

### Disadvantages

- **Lack of structure**: when building a large system, responsibility is diffused among implementers
- **Global visibility**: when something goes wrong, the whole program must be inspected; deadlocks are hard to isolate

## Monitors

Monitors solve the problems that semaphores have. They concentrate responsibility in a single module and encapsulate critical resources. They offer more structure than semaphores and more control than `await`.

Monitors are a generalisation of **objects** (OOP). They

- May encapsulate some private data – all fields are private
- Expose one or more *operations* – akin to methods
- Have implicit mutual exclusion – each operation invocation is implicitly atomic
- Have explicit signaling and waiting through condition variables

### Condition Variables

Condition variables are named FIFO queues of blocked processes. Processes executing a procedure of a monitor with condition variable $cv$ can:

- Voluntarily suspend themselves using **waitC($cv$)**
- Unblock the first suspended process by calling **signalC($cv$)**, or
- Test for emptiness of the queue with **empty($cv$)**

### Signaling Disciplines

Let

- $S$ be the signaling process
- $W$ mean waiting on a condition variable
- $E$ mean waiting on entry

In Hoare’s paper, $E < S < W$. This is also called the *immediate resumption requirement* or IRR. That is, a signalling process must wait for the signalled process to exit the monitor (or wait on a condition variable) before resuming execution.

In Java and pthreads, and many other implementations, $E = W < S$. This means that signalling processes continue executing, and signalled processes await entry to the monitor at the same priority as everyone else.

### Monitors In Java

An object in Java can be made to approximate a monitor with one waitset (i.e. unfair) condition variable and no immediate resumption:

- A method is made mutually exclusive using the `synchronized` keyword
- Synchronized methods of an object may call `wait()` to suspend until `notify()` is called, analogous to condition variables
- No immediate resumption requirement means that waiting processes need to re-check their conditions
- No strong fairness guarantee about wait sets, meaning an arbitrary waiting process is awoken by `notify()`
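A minimal sketch of the same discipline in Python, using `threading.Condition` as the condition variable (the one-slot `Monitor` class is hypothetical; note the `while` re-check, needed exactly because there is no immediate resumption):

```python
import threading

class Monitor:
    """One-slot buffer as a monitor: one lock gives implicit mutual
    exclusion, one condition variable supports waitC/signalC."""

    def __init__(self):
        self._lock = threading.Lock()
        self._nonempty = threading.Condition(self._lock)
        self._slot = None

    def put(self, v):
        with self._lock:                  # implicit mutual exclusion
            self._slot = v
            self._nonempty.notify()       # signalC(nonempty)

    def take(self):
        with self._lock:
            while self._slot is None:     # re-check the condition after waking
                self._nonempty.wait()     # waitC(nonempty)
            v, self._slot = self._slot, None
            return v
```

`put` here overwrites a full slot; a complete bounded buffer would add a second condition variable for the producer to wait on.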

### Shared Data

**Problem**: We have a large data structure (i.e. a structure that cannot be updated in one atomic step) that is shared between some number of writers who are updating the data structure and some number of readers who are attempting to retrieve a coherent copy of the data structure.

**Desiderata**:

- We want *atomicity*, in that each update happens in one go, and updates-in-progress or partial updates are not observable
- We want *consistency*, in that any reader that starts after an update finishes will see that update
- We want to minimise *waiting*

## Distributed Programming

In distributed programming:

- Processes can be distributed across machines
- Processes usually cannot use shared variables
- Processes share *communication channels*
- Processes access channels by *message passing*, *remote procedure call* (RPC), or *rendezvous*

### Message Passing

A channel is a typed FIFO queue between processes. We distinguish synchronous from asynchronous channels.

We denote:

- $ch \Leftarrow x$ means ‘send a message `x` into channel `ch`’
- $ch \Rightarrow y$ means ‘receive a message from `ch` and store it in `y`’

#### Synchronous Channels

If a channel is synchronous, the queue has capacity 0. Both the send and the receive operation block until both are ready to execute. When they are, they proceed at the same time and the value of `x` is assigned to `y`.

#### Asynchronous Channels

If a channel is asynchronous, the send operation doesn’t block: it appends the value of `x` to the FIFO queue associated with the channel `ch`. Only the receive operation blocks, until the channel `ch` contains a message. When it does, the oldest message is removed and its content is stored in `y`.

##### Taxonomy Of Asynchronous Message Passing

- Rel: ‘reliable’
- Dup: ‘with duplication’
- FIFO: ‘order-preserving’
- Fair: ‘lossy but fair’

We then have:

- RelFIFO
- FairFIFO
- RelDup
- FairDup
- RelDupFIFO
- FairDupFIFO

#### Producer Consumer Example

```
channel of integer ch

      producer       |        consumer
----------------------------------------------
integer x                integer y
loop forever             loop forever
p1: x <- produce         q1: ch => y
p2: ch <= x              q2: consume(y)
```
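In Python, `queue.Queue` behaves like a reliable FIFO asynchronous channel, so the table above can be sketched as:

```python
import queue
import threading

ch = queue.Queue()          # reliable FIFO asynchronous channel
consumed = []               # hypothetical stand-in for consume(y)

def producer():
    for x in range(5):      # x <- produce
        ch.put(x)           # ch <= x  (does not block on an unbounded queue)

def consumer():
    for _ in range(5):
        y = ch.get()        # ch => y  (blocks until a message is available)
        consumed.append(y)  # consume(y)

threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads: t.start()
for t in threads: t.join()
```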

## Synchronous Message Passing

### Synchronous Transition Diagrams

A synchronous transition diagram is a parallel composition $P_1\ \|\ \dots\ \|\ P_n$ of some (sequential) transition diagrams $P_1, \dots, P_n$ called processes.

The processes $P_i$:

- Do not share variables
- Communicate along unidirectional channels $C, D, \dots$ connecting at most 2 different processes by way of
  - output statements $C \Leftarrow e$ for sending the value of expression $e$ along channel $C$
  - input statements $C \Rightarrow x$ for receiving a value along channel $C$ into variable $x$

For shared variable concurrency, labels $b; f$ where $b$ is a Boolean condition and $f$ a state transformation sufficed. Now, we call such transitions *internal*.

We extend this notation to message passing by allowing the guard to be combined with an input or an output statement.

#### Closed Product

Given $P_i = (L_i, T_i, s_i, t_i)$ for $1 \le i \le n$ with disjoint local variable sets, define their *closed product* as $P = (L, T, s, t)$ such that $L = L_1 \times \dots \times L_n$, $s = \langle s_1, \dots, s_n \rangle$, $t = \langle t_1, \dots, t_n \rangle$ and $\ell \xrightarrow{a} \ell^\prime \in T$ iff either

- $\ell = \langle \ell_1, \dots, \ell_i, \dots, \ell_n \rangle$, $\ell^\prime = \langle \ell_1, \dots, \ell^\prime_i, \dots, \ell_n \rangle$, with $\ell_i \xrightarrow{a} \ell^\prime_i \in T_i$ an internal transition, or
- $\ell = \langle \ell_1, \dots, \ell_i, \dots, \ell_j, \dots, \ell_n \rangle$, $\ell^\prime = \langle \ell_1, \dots, \ell^\prime_i, \dots, \ell^\prime_j, \dots, \ell_n \rangle$, $i \ne j$ with $\ell_i \xrightarrow{b; C \Leftarrow e; f} \ell^\prime_i \in T_i$ and $\ell_j \xrightarrow{b^\prime; C \Rightarrow x; g} \ell^\prime_j \in T_j$ and $a$ is $b \land b^\prime; f \circ g \circ \llbracket x \gets e \rrbracket$

### Verification

To show that $\{ \phi \} P_1\ \|\ \dots\ \|\ P_n \{ \psi \}$ is valid, we could simply prove $\{ \phi \} \ P \ \{ \psi \}$ for $P$ being the closed product of the $P_i$. This can be done using the same method as for ordinary transition diagrams because there are no I/O transitions left in $P$.

As with the standard product construction for shared-variable concurrency, the closed product construction leads to a number of verification conditions exponential in the number of processes. Therefore, we are looking for an equivalent of the Owicki/Gries method for synchronous message passing.

#### A Simplistic Method

For each location $\ell$ in some $L_i$, find a local predicate $Q_\ell$, only depending on $P_i$’s local variables.

- Prove that, for all $i$, the local verification conditions hold, i.e. $\models Q_\ell \land b \to Q_{\ell^\prime} \circ f$ for each $\ell \xrightarrow{b; f} \ell^\prime \in T_i$
- For all $i \ne j$ and matching pairs of I/O transitions $\ell_i \xrightarrow{b; C \Leftarrow e; f} \ell_i^\prime \in T_i$ and $\ell_j \xrightarrow{b^\prime; C \Rightarrow x; g} \ell_j^\prime \in T_j$ show that $\models Q_{\ell_i} \land Q_{\ell_j} \land b \land b^\prime \implies (Q_{\ell_i^\prime} \land Q_{\ell_j^\prime}) \circ f \circ g \circ \llbracket x \gets e \rrbracket$
- Prove $\models \phi \implies Q_{s_1} \land \dots \land Q_{s_n}$ and $\models Q_{t_1} \land \dots \land Q_{t_n} \implies \psi$

The simplistic method is sound but not complete. It generates proof obligations for all syntactically-matching I/O transition pairs, regardless of whether these pairs can actually be matched semantically (in an execution).

##### Remedy 1: Adding Shared Auxiliary Variables

Use shared auxiliary variables to relate locations in processes by expressing that certain combinations will not occur during execution. Only output transitions need to be augmented with assignments to these shared auxiliary variables.

**Pro**: easy.

**Con**: re-introduces interference freedom tests for matching pairs $\ell_i \xrightarrow{b_i; C \Leftarrow e; f_i} \ell_i^\prime \in T_i$ and $\ell_j \xrightarrow{b_j; C \Rightarrow x; f_j} \ell_j^\prime \in T_j$ and locations $\ell_m$ of processes $P_m$, $m \ne i, j$.

This method is due to Levin & Gries.

##### Remedy 2: Local Auxiliary Variables + Invariant

Use only local auxiliary variables and a global communication invariant $I$ to relate values of local auxiliary variables in the various processes.

**Pro**: no interference freedom tests.

**Con**: a more complicated proof obligation for communication steps.

This method is the AFR-method.

## Termination

For programs that do terminate, termination is often the most important liveness property. In addition to the typical cause of non-termination for sequential programs, namely divergence, concurrent programs can also deadlock. Termination = convergence + deadlock freedom.

A program is $\phi$-convergent if it cannot diverge (run forever) when started in an initial state satisfying $\phi$. Instead, it must terminate, or become deadlocked.

To prove convergence, we prove that there is a bound on the remaining computation steps from any state that the program reaches.

### Ordered & Wellfounded Sets

In maths, this bound condition is formalised by the concept of a wellfounded set. Recall that, on a set $W$, the binary relation $\prec \subseteq W^2$ is a (strict) partial order, if it is:

- irreflexive ($a \not\prec a$)
- asymmetric ($a \prec b \implies b \not \prec a$)
- transitive ($a \prec b \land b \prec c \implies a \prec c$)

A partially ordered set $(W, \prec)$ is wellfounded if every descending sequence $\langle w_0 \succ w_1 \succ \dots \rangle$ in $(W, \prec)$ is finite.

#### Floyd’s Wellfoundedness Method

Given a transition diagram $P = (L, T, s, t)$ and a precondition $\phi$, we can prove $\phi$-convergence of $P$ by:

- finding an inductive assertion network $Q : L \to (\Sigma \to \mathbb{B})$ and showing that $\models \phi \implies Q_s$
- choosing a wellfounded set $(W, \prec)$ and a network $(\rho_\ell)_{\ell \in L}$ of partially defined ranking functions from $\Sigma$ to $W$ such that:
  - $Q_\ell$ implies that $\rho_\ell$ is defined
  - every transition $\ell \xrightarrow{b; f} \ell^\prime \in T$ decreases the ranking function, that is: $\models Q_\ell \land b \implies \rho_\ell \succ (\rho_{\ell^\prime} \circ f)$

This method is sound, that is, it indeed establishes $\phi$-convergence.

This method is also semantically complete, that is, if $P$ is $\phi$-convergent, then there exist assertion and ranking function networks satisfying the verification conditions for proving convergence.
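A toy instance, checked by brute force (the helper `ranking_decreases` is hypothetical): view `while n > 0: n := n - 1` as a one-location diagram, take $(W, \prec) = (\mathbb{N}, <)$ and $\rho(n) = n$, and test the decrease condition over a finite range of states:

```python
# Brute-force check of the verification condition that every transition
# l --(guard; update)--> l strictly decreases the ranking function rho,
# staying inside the wellfounded set (N, <).
def ranking_decreases(guard, update, rho, states):
    return all(rho(update(s)) < rho(s) and rho(update(s)) >= 0
               for s in states if guard(s))

ok = ranking_decreases(
    guard=lambda n: n > 0,      # transition is enabled only while n > 0
    update=lambda n: n - 1,     # effect of the loop body
    rho=lambda n: n,            # ranking function rho(n) = n
    states=range(-5, 100),      # finite sample of the state space
)
```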

##### Simplifying The Method

We can base convergence proofs on ranking functions only. Although this results in a superficially simpler method, applying it is by no means simpler than Floyd’s.

Given a transition diagram $P = (L, T, s, t)$ and a precondition $\phi$, we can prove $\phi$-convergence of $P$ by choosing a wellfounded set $(W, \prec)$ and a network $(\rho_\ell)_{\ell \in L}$ of partially defined ranking functions from $\Sigma$ to $W$ such that:

- for all $\sigma \in \Sigma$, if $\sigma \models \phi$ then $\rho_s$ is defined
- every transition $\ell \xrightarrow{b; f} \ell^\prime \in T$ decreases the ranking function, that is, if $\sigma \models b$ and $\rho_\ell$ is defined, then $\rho_{\ell^\prime}(f(\sigma))$ is defined and $\rho_\ell(\sigma) \succ \rho_{\ell^\prime}(f(\sigma))$

### Shared Variables

How can we extend Floyd’s method for proving $\phi$-convergence to shared-variable concurrent programs $P = P_1\ \|\ \dots\ \|\ P_n$?

**Answer** (simplistic): Construct product transition system, use Floyd’s method on that. This leads to the usual problem with exponentially growing numbers of locations, ranking functions, and thus verification conditions.

**Answer** (better): Find a proof principle relating to Floyd’s method as the Owicki/Gries method relates to the inductive assertion method applied to the product transition system.

#### Local Method For Proving $\phi$-convergence

Suppose that for each $P_i = (L_i, T_i, s_i, t_i)$ we’ve found a local assertion network $(Q_\ell)_{\ell \in L_i}$, a wellfounded set $(W_i, \prec_i)$, and a network $(\rho_\ell)_{\ell \in L_i}$ of partial ranking functions.

- Prove that the assertions and ranking functions are locally consistent, i.e. that $\rho_\ell$ is defined whenever $Q_\ell$ is true
- Prove local correctness of every $P_i$, i.e. for $\ell \xrightarrow{b; f} \ell^\prime \in T_i$:
  - $\models Q_\ell \land b \implies Q_{\ell^\prime} \circ f$
  - $\models Q_\ell \land b \implies \rho_\ell \succ_i (\rho_{\ell^\prime} \circ f)$
- Prove interference freedom for both local networks, i.e. for $\ell \xrightarrow{b; f} \ell^\prime \in T_i$ and $\ell^{\prime \prime} \in L_k$, for $k \ne i$:
  - $\models Q_\ell \land Q_{\ell^{\prime \prime}} \land b \implies Q_{\ell^{\prime \prime}} \circ f$
  - $\models Q_\ell \land Q_{\ell^{\prime \prime}} \land b \implies \rho_{\ell^{\prime \prime}} \succeq_k (\rho_{\ell^{\prime \prime}} \circ f)$
- Prove $\models \phi \implies \bigwedge_i Q_{s_i}$

The local method is again sound and semantically complete (with auxiliary variables).

#### Convergence With AFR

To prove that a synchronous transition diagram $P = P_1\ \|\ \dots\ \|\ P_n$ (where the $P_i = (L_i, T_i, s_i, t_i)$ with the usual restrictions) is $\phi$-convergent, omit the last point from the AFR method and then choose wellfounded sets $(W_i, \prec_i)$ and networks $(\rho_\ell)_{\ell \in L_i}$ of local ranking functions only involving $P_i$’s variables and prove that:

- Both networks are locally consistent: for all states $\sigma$, $\sigma \models Q_\ell \implies \rho_\ell(\sigma) \in W_i$
- For all internal $\ell \xrightarrow{b; f} \ell^\prime \in T_i$: $\models Q_\ell \land b \implies \rho_\ell \succ_i (\rho_{\ell^\prime} \circ f)$
- Local ranking functions cooperate, namely, for every matching pair $\ell_1 \xrightarrow{b; C \Leftarrow e; f} \ell_2 \in T_i$ and $\ell_1^\prime \xrightarrow{b^\prime; C \Rightarrow x; f^\prime} \ell_2^\prime \in T_k$, with $i \ne k$, show $\models I \land Q_{\ell_1} \land Q_{\ell_1^\prime} \land b \land b^\prime \implies ((\rho_{\ell_1}, \rho_{\ell_1^\prime}) >_{cw} (\rho_{\ell_2} \circ g, \rho_{\ell_2^\prime} \circ g))$ where $g = f \circ f^\prime \circ \llbracket x \gets e \rrbracket$

## Deadlock

A non-terminated process is deadlocked if it cannot move anymore. In our setting of transition diagrams, there are two distinct causes for deadlock:

**Message deadlock**: The process blocks on a receive (or synchronous send) statement but no communication partner will ever execute the corresponding send (receive) statement

**Resource deadlock**: The process blocks in a state from which only guarded transitions depart but none of the guards will ever become true again

### Deadlock Avoidance

#### By Order

A simple resource acquisition policy can be formulated that precludes resource deadlocks by avoiding cycles in wait-for-graphs. This involves assigning a precedence to each resource and forcing processes to request resources in order of increasing precedence.
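The policy can be sketched with Python's `threading` module (the precedence map and worker are illustrative): each thread sorts the resources it needs by precedence and acquires them in that order, so no wait-for cycle can form.

```python
import threading

# Resources with assigned precedences; lower number = acquired first.
locks = {1: threading.Lock(), 2: threading.Lock()}

def acquire_in_order(needed):
    """Acquire the needed locks in increasing precedence order;
    return them in reverse order for release."""
    for p in sorted(needed):
        locks[p].acquire()
    return [locks[p] for p in sorted(needed, reverse=True)]

def worker(needed, log):
    held = acquire_in_order(needed)
    log.append(tuple(sorted(needed)))  # critical section
    for lock in held:
        lock.release()

log = []
# The second thread *requests* [2, 1]; without the ordering policy,
# taking 2 first and then 1 could deadlock against the first thread.
t1 = threading.Thread(target=worker, args=([1, 2], log))
t2 = threading.Thread(target=worker, args=([2, 1], log))
t1.start(); t2.start()
t1.join(); t2.join()
```

Because both threads are forced into the same acquisition order, both always run to completion.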

#### By Resource Scheduling

This involves granting resources only if all resources a process needs can be granted at that time to avoid entering unsafe states in which more than one process holds partial sets of resources. It requires both the number of processes and their resource needs to be static.
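The safety test behind this scheduling discipline is the one used in Dijkstra's banker's algorithm: a request is granted only if, afterwards, the processes can still all finish in *some* order. A minimal sketch for a single resource type (function and parameter names are illustrative):

```python
def is_safe(available, allocation, need):
    """available: free units of the resource;
    allocation[i]: units currently held by process i;
    need[i]: further units process i may still request.
    Returns True iff some completion order exists."""
    finished = [False] * len(allocation)
    free = available
    progress = True
    while progress:
        progress = False
        for i, done in enumerate(finished):
            if not done and need[i] <= free:
                free += allocation[i]   # process i finishes, releases its units
                finished[i] = True
                progress = True
    return all(finished)
```

A grant that would make `is_safe` return `False` is deferred, keeping the system out of unsafe states. Note the static assumptions the section mentions: the number of processes and their maximum needs must be known in advance.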

### Deadlock For Transition Diagrams

A transition $\ell \xrightarrow{b; f} \ell^\prime$ is **enabled** in a state $\sigma$ if its boolean condition $b$ is satisfied in $\sigma$.

A process is **blocked** at a location $\ell$ if it has not terminated ($\ell \ne t$) and none of its transitions are enabled there.

A concurrent program is **deadlocked** if some of its processes are blocked and the remaining ones have terminated.
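The three definitions above translate directly into checks on a global configuration. A minimal sketch, assuming the shared state is a dict and transitions are `(source, guard, target)` triples (all names illustrative):

```python
def enabled(transitions, loc, sigma):
    """Transitions from loc whose boolean condition holds in state sigma."""
    return [t for t in transitions if t[0] == loc and t[1](sigma)]

def blocked(transitions, loc, terminal, sigma):
    """Blocked: not terminated and no transition enabled at loc."""
    return loc != terminal and not enabled(transitions, loc, sigma)

def deadlocked(processes, locs, sigma):
    """processes: list of (transitions, terminal_location);
    locs: current local location of each process.
    Deadlock: some process blocked, every other one terminated or blocked."""
    statuses = [
        (loc == term, blocked(trans, loc, term, sigma))
        for (trans, term), loc in zip(processes, locs)
    ]
    return (any(b for (_, done) in [] or (_, b) in statuses)
            if False else
            any(b for (_, b) in statuses)
            and all(done or b for (done, b) in statuses))

# Example: P1 waits at "l" for x > 0, which no one will ever make true;
# P2 has already terminated.
p1 = ([("l", lambda s: s["x"] > 0, "t")], "t")
p2 = ([], "t")
```

With `sigma = {"x": 0}` the configuration `("l", "t")` is deadlocked; with `x = 1` it is not, since P1 can move.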

### Characterisation Of Blocking

Let $P = P_1\ \|\ \dots \|\ P_n$ with precondition $\phi$, and assume that for each process $P_i = (L_i, T_i, s_i, t_i)$ of $P$ there is a local assertion network $(Q_\ell)_{\ell \in L_i}$ satisfying all but the last condition ($\models \bigwedge_i Q_{t_i} \implies \psi$) of the Owicki/Gries method for proving $\{ \phi \} \ P \ \{ \psi \}$.

Process $P_i$ can only be blocked in state $\sigma$ at non-final location $\ell \in L_i \setminus \{ t_i \}$ from which there are $m$ transitions with boolean conditions $b_1, \dots, b_m$ respectively, if $\sigma \models \text{CanBlock}_\ell$ where $\text{CanBlock}_\ell = Q_\ell \land \neg \bigvee^m_{k = 1} b_k$.

Consequently, using predicates

\[\text{Blocked}_i = \bigvee_{\ell \in L_i \setminus \{ t_i \}} \text{CanBlock}_\ell\]deadlock can only occur in a state $\sigma$ if

\[\sigma \models \bigwedge_{i = 1}^n (Q_{t_i} \lor \text{Blocked}_i) \land \bigvee_{i = 1}^n \text{Blocked}_i\]holds (every process has terminated or blocked and at least one is blocked).

#### Owicki/Gries Deadlock Freedom Condition

The condition

\[\models \neg (\bigwedge_{i = 1}^n (Q_{t_i} \lor \text{Blocked}_i) \land \bigvee_{i = 1}^n \text{Blocked}_i) \qquad \qquad DFC\]ensures that $P$ will not deadlock when started in a state satisfying $\phi$.

The Owicki/Gries method with the last condition replaced by the deadlock freedom condition is sound and semantically complete for proving deadlock freedom relative to some precondition $\phi$.

### Deadlock Freedom For Synchronous Message Passing

An I/O transition can occur iff the guards of both (matching) transitions involved hold. For a global configuration (a pair consisting of a state giving values to all variables and a tuple of local locations, one for each diagram) $\langle \ell; \sigma \rangle$, define

\[\sigma \models \mathbf{live} \, \ell \quad \text{iff} \quad \begin{cases} \mathbf{True} \text{, if all local locations are terminal} \\ \text{a transition is enabled in} \ \langle \ell; \sigma \rangle \text{, otherwise} \end{cases}\]If we can show that every configuration $\langle \ell; \sigma \rangle$ reachable from an initial global state (satisfying $\phi$ if we happen to have such a precondition) satisfies $\sigma \models \mathbf{live} \, \ell$, then we have verified deadlock freedom.

### Deadlock Freedom With AFR

For $i \in \{ 1, \dots, n \}$ let $P_i = (L_i, T_i, s_i, t_i)$ such that the $L_i$ are pairwise disjoint and the processes’ variable sets are pairwise disjoint. To prove that a synchronous transition diagram $P = P_1\ \|\ \dots\ \|\ P_n$ is deadlock free relative to precondition $\phi$:

- Omit the last point from the AFR method
- Verify the deadlock freedom condition for every global label $\langle \ell_1, \dots, \ell_n \rangle \in L_1 \times \dots \times L_n$

This method generates a verification condition for each global location, i.e. $| L_1 \times \dots \times L_n | = \prod_{i = 1}^n | L_i |$ many.

This method is once again sound and semantically complete (with auxiliary variables).

## Compositionality

A compositional proof method is a method by which the specification of a system can be inferred from the specifications of its constituents, without additional information about their internal structure.

de Roever et al.

### Analysis Of AFR & L&G

- Both are only applicable to closed systems
- That means we always have to reason about the system as a whole, even including users modeled as processes
- Using these methods, one cannot reason *compositionally*; typically, non-compositional proof methods don’t scale and preclude re-use

### Compositionally-Inductive Assertion Network

Handles communication with a special logical variable $h$, containing the *history* of all communication, i.e. a *sequence of pairs* of channels and messages $\langle C, x \rangle$.

A local assertion network $Q$ is *compositionally-inductive* for a sequential synchronous transition diagram $P = (L, T, s, t)$, written $P \vdash Q$, if

- $\models Q_\ell \land b \implies Q_{\ell^\prime} \circ f$ for each $\ell \xrightarrow{b; f} \ell^\prime \in T$
- $\models Q_\ell \land b \implies Q_{\ell^\prime} \circ (f \circ \llbracket h \gets h \cdot \langle C, e \rangle \rrbracket)$ for each $\ell \xrightarrow{b; C \Leftarrow e; f} \ell^\prime \in T$
- $\models Q_\ell \land b \implies \forall x : (Q_{\ell^\prime} \circ (f \circ \llbracket h \gets h \cdot \langle C, x \rangle \rrbracket))$ for each $\ell \xrightarrow{b; C \Rightarrow x; f} \ell^\prime \in T$

#### Partial Correctness

Let $Q$ be an assertion network for a process $P$, and $Q_s$ and $Q_t$ be the assertions at the start and end states.

Then by inductivity we have the **Basic diagram rule**:

We assume the history is empty initially with the **Initialization rule**:

#### Parallel Composition Rule

Provided $\psi_i$ only makes assertions about local variables in $P_i$ and those parts of the history that involve channels read from/written to by $P_i$ we get this **compositional parallel composition rule**:

Observe that we don’t need to prove anything like interference freedom or generate a proof obligation about each possible communication.

We define $h|_H$ as the history $h$ filtered to only contain those pairs $\langle C, x \rangle$ where $C \in H$.
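The history and its filtering operator are straightforward to model. A minimal sketch, representing $h$ as a list of `(channel, message)` pairs (channel names here are taken from the merge example below; the function name is illustrative):

```python
def restrict(h, H):
    """h|_H: keep only the pairs (C, x) whose channel C is in the set H."""
    return [(c, x) for (c, x) in h if c in H]

# A possible prefix of the merge process's history.
h = [("inX", 3), ("inY", 7), ("out", 3), ("inX", "EOF")]
```

For instance, `restrict(h, {"inX"})` projects out exactly the traffic on channel `inX`, which is how the assertions below constrain each channel independently.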

#### Assertions

Let $\hat{\theta}$ denote the history sequence $\theta$ without channel data – i.e. just messages. Then our desired postcondition is:

\[\begin{array}{lcl} \varphi & \equiv & \exists \alpha \beta : h|_{\{ inX \}} = \alpha \cdot \langle inX, EOF \rangle \land h|_{\{ inY \}} = \beta \cdot \langle inY, EOF \rangle \\ & & \land \ (\mathbf{sorted}(\hat{\alpha}) \land \mathbf{sorted}(\hat{\beta}) \Rightarrow h|_{\{ out \}} = \mathbf{sort}(\hat{\alpha} \cdot \hat{\beta})) \\ \mathcal{Q}(q_0) & \equiv & h|_{\{ inX, inY, out \}} = \epsilon \\ \mathcal{Q}(q_1) & \equiv & h|_{\{ inX \}} = \langle inX, x \rangle \land h|_{\{ inY, out \}} = \epsilon \\ \mathcal{Q}(q_2) & \equiv & \exists \alpha \beta : h|_{\{ inX \}} = \alpha \cdot \langle inX, x \rangle \land h|_{\{ inY \}} = \beta \cdot \langle inY, y \rangle \\ & & \land \ (\mathbf{sorted}(\hat{\alpha}) \land \mathbf{sorted}(\hat{\beta}) \Rightarrow h|_{\{ out \}} = \mathbf{sort}(\hat{\alpha} \cdot \hat{\beta})) \\ \mathcal{Q}(q_3) & \equiv & \exists \alpha \beta : h|_{\{ inX \}} = \alpha \land h|_{\{ inY \}} = \beta \cdot \langle inY, y \rangle \\ & & \land \ (\mathbf{sorted}(\hat{\alpha}) \land \mathbf{sorted}(\hat{\beta}) \Rightarrow h|_{\{ out \}} = \mathbf{sort}(\hat{\alpha} \cdot \hat{\beta})) \\ \mathcal{Q}(q_4) & \equiv & \exists \alpha \beta : h|_{\{ inX \}} = \alpha \cdot \langle inX, x \rangle \land h|_{\{ inY \}} = \beta \\ & & \land \ (\mathbf{sorted}(\hat{\alpha}) \land \mathbf{sorted}(\hat{\beta}) \Rightarrow h|_{\{ out \}} = \mathbf{sort}(\hat{\alpha} \cdot \hat{\beta})) \\ \mathcal{Q}(q_t) & \equiv & \varphi \end{array}\]

## The Calculus Of Communicating Systems

The Calculus of Communicating Systems (CCS):

- Is a process algebra, a simple formal language to describe concurrent systems
- Is given semantics in terms of labelled transition systems
- Was developed by Turing-award winner Robin Milner in the 1980s
- Has an abstract view of synchronization that applies well to message passing

It provides a symbolic way to describe transition diagrams, and reason about them symbolically rather than diagrammatically.

### Processes

Processes in CCS are defined by equations.

For example, the equation

\[\textbf{CLOCK} = \text{tick}\]defines a process $\textbf{CLOCK}$ that simply executes the *action* “tick” and then terminates. This process corresponds to the first location in this *labelled transition system* (LTS):

An LTS is like a transition diagram, save that our transitions are just abstract actions and we have no initial or final location.

#### Action Prefixing

If $a$ is an action and $P$ is a process then $a.P$ is a process that executes $a$ before behaving as $P$. This brackets to the right, so $a.b.c.P = a.(b.(c.P))$.

For example, the equation

\[\textbf{CLOCK}_2 = \text{tick.tock}\]defines a process called $\textbf{CLOCK}_2$ that executes the action “tick” then the action “tock” and then terminates.

\[\bullet \xrightarrow{\text{tick}} \bullet \xrightarrow{\text{tock}} \bullet\]

##### Stopping

The process with no transitions is $\textbf{STOP}$.

### Loops

Up to now, all processes make a finite number of transitions and then terminate. Processes that can make an infinite number of transitions can be pictured by allowing loops. We accomplish loops in CCS using *recursion*.

For example, the equation

\[\textbf{CLOCK}_3 = \text{tick.}\textbf{CLOCK}_3\]executes the action “tick” forever.

### Choice

If $P$ and $Q$ are processes then $P + Q$ is a process which can either behave as the process $P$ or the process $Q$.

We have the following identities about choice:

\[\begin{array}{lclr} P + (Q + R) & = & (P + Q) + R & \text{associativity} \\ P + Q & = & Q + P & \text{commutativity} \\ P + \textbf{STOP} & = & P & \text{neutral element} \\ P + P & = & P & \text{idempotence} \end{array}\]

### Parallel Composition

If $P$ and $Q$ are processes then $P \mid Q$ is the parallel composition of their processes, i.e. the non-deterministic interleaving of their actions.

### Synchronization

In CCS, every action $a$ has an opposing coaction $\overline{a}$ (and $\overline{\overline{a}} = a$).

It is a convention to think of an action as an *output* event and a coaction as an *input* event. If a system can execute both an action and its coaction, it *may* execute them both simultaneously by taking an *internal transition* marked by the special action $\tau$.
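As a small worked example (process names are illustrative), the composition $a.\textbf{STOP} \mid \overline{a}.\textbf{STOP}$ has three possible first steps: the left component performs $a$ alone, the right performs $\overline{a}$ alone, or the two synchronise on the internal action:

\[a.\textbf{STOP} \mid \overline{a}.\textbf{STOP} \xrightarrow{\tau} \textbf{STOP} \mid \textbf{STOP}\]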

### Expansion Theorem

Let $P$ and $Q$ be processes. By expanding recursive definitions and using our existing equations for choice we can express $P$ and $Q$ as $n$-ary choices of action prefixes:

\[P = \sum_{i \in I} \alpha_i. \ P_i \quad \text{and} \quad Q = \sum_{j \in J} \beta_j. \ Q_j\]Then, the parallel composition can be expressed as follows:

\[P \mid Q = \sum_{i \in I} \alpha_i. \ (P_i \mid Q) + \sum_{j \in J} \beta_j. \ (P \mid Q_j) + \sum_{i \in I, j \in J, \alpha_i = \overline{\beta_j}} \tau. \ (P_i \mid Q_j)\]From this, many useful equations are derivable:

\[\begin{array}{lcl} P \mid Q & = & Q \mid P \\ P \mid (Q \mid R) & = & (P \mid Q) \mid R \\ P \mid \textbf{STOP} & = & P \end{array}\]

### Restriction

If $P$ is a process and $a$ is an action (not $\tau$), then $P \setminus a$ is the same as the process $P$ *except* that the actions $a$ and $\overline{a}$ may not be executed.

### Value Passing

We introduce synchronous channels into CCS by allowing actions and coactions to take parameters.

An **action** looks like: $a(3)$, $c(15)$, $x(True)$. A **coaction** looks like: $\overline{a}(x)$, $\overline{c}(y)$, $\overline{c}(z)$.

The parameter of an action is the value to be sent, and the parameter of a coaction is the variable in which the received value is stored.

### Merge & Guards

If $P$ is a value-passing CCS process and $\varphi$ is a formula about the variables in scope, then $[\varphi]P$ is a process that executes just like $P$ if $\varphi$ holds for the current state, and like $\textbf{STOP}$ otherwise.

We can then define an $\textbf{if}$ statement like so:

\[\textbf{if} \ \varphi \ \textbf{then} \ P \ \textbf{else} \ Q \quad \equiv \quad ([\varphi]P) + ([\neg \varphi]Q)\]

### Assignment

If $P$ is a process and $x$ is a variable in the state, and $e$ is an expression, then $\llbracket x := e \rrbracket P$ is the same as $P$ except that it first updates the variable $x$ to have the value $e$ before making a transition.

### Semantics

#### Operational Semantics

\[\begin{array}{ccc} \dfrac{}{a.P \xrightarrow{a} P} \ \text{Act} & \dfrac{P \xrightarrow{a} P^\prime}{P + Q \xrightarrow{a} P^\prime} \ \text{Choice}_1 & \dfrac{Q \xrightarrow{a} Q^\prime}{P + Q \xrightarrow{a} Q^\prime} \ \text{Choice}_2 \\ \rule[-4ex]{0pt}{10ex} \dfrac{P \xrightarrow{a} P^\prime}{P \mid Q \xrightarrow{a} P^\prime \mid Q} \ \text{Par}_1 & \dfrac{Q \xrightarrow{a} Q^\prime}{P \mid Q \xrightarrow{a} P \mid Q^\prime} \ \text{Par}_2 & \dfrac{P \xrightarrow{a} P^\prime \quad Q \xrightarrow{\overline{a}} Q^\prime}{P \mid Q \xrightarrow{\tau} P^\prime \mid Q^\prime} \ \text{Sync} \\ \rule[-4ex]{0pt}{10ex} & \dfrac{P \xrightarrow{a} P^\prime \quad a \not\in \{ b, \overline{b} \}}{P \setminus b \xrightarrow{a} P^\prime \setminus b} \ \text{Restrict} & \end{array}\]

##### Bisimulation Equivalence

Two processes (or locations) $P$ and $Q$ are *bisimilar* iff they can do the same actions and those actions themselves lead to bisimilar processes. All of our previous equalities can be proven by induction on the semantics here.
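The operational semantics can be prototyped as a small interpreter. A minimal sketch, assuming an illustrative encoding of processes as nested tuples (`("stop",)`, `("act", a, P)`, `("choice", P, Q)`, `("par", P, Q)`, `("restrict", P, b)`), with the coaction of `"a"` written `"~a"`:

```python
TAU = "tau"  # the internal action

def co(a):
    """The opposing coaction: co(a) = ~a and co(~a) = a."""
    return a[1:] if a.startswith("~") else "~" + a

def step(p):
    """All transitions (action, successor) of process p, one clause per rule."""
    kind = p[0]
    if kind == "stop":
        return []
    if kind == "act":                                          # Act
        return [(p[1], p[2])]
    if kind == "choice":                                       # Choice_1/2
        return step(p[1]) + step(p[2])
    if kind == "par":
        l, r = p[1], p[2]
        moves = [(a, ("par", l2, r)) for (a, l2) in step(l)]   # Par_1
        moves += [(a, ("par", l, r2)) for (a, r2) in step(r)]  # Par_2
        moves += [(TAU, ("par", l2, r2))                       # Sync
                  for (a, l2) in step(l)
                  for (b, r2) in step(r) if a != TAU and b == co(a)]
        return moves
    if kind == "restrict":                                     # Restrict
        b = p[2]
        return [(a, ("restrict", q, b))
                for (a, q) in step(p[1]) if a not in (b, co(b))]
    raise ValueError(f"unknown process form: {p!r}")
```

For example, `a.STOP | ~a.STOP` can step by `a`, by `~a`, or by the internal `tau`, and restricting on `a` leaves only the `tau` step; this matches the Expansion Theorem's three summands for that composition.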