Written by Luka Kerr on August 3, 2019

Haskell

-- f = function name
-- :: = "Of type"
-- Int = Domain
-- Bool = Codomain
-- x = Input
-- (x > 0) = Output
f :: Int -> Bool
f x = (x > 0)

Currying & Partial Application

The evaluation of a sequence of functions, each with a single argument.

-- Function that takes an `a` and returns another 
-- function taking an `a` and returning an `a`
f :: a -> (a -> a)

-- Practical example, usage: logBase 2 2.754
logBase :: Double -> (Double -> Double)

Function application associates to the left in Haskell, so:

\[\text{logBase} \ 2 \ 64 \equiv \text{(logBase 2)} \ 64\]

Tuples

A way to take multiple inputs or produce multiple outputs.

toCartesian :: (Double, Double) -> (Double, Double)
toCartesian (r, theta) = (x, y)
  where x = r * cos theta
        y = r * sin theta

Higher Order Functions

Functions that take other functions as arguments.

-- Function that takes a function and returns that function applied twice
twice :: (a -> a) -> (a -> a)
twice f a = f (f a)

Lists

Lists are singly-linked lists in Haskell. The empty list is written as [] and a list node is written as x : xs. The value x is called the head and the rest of the list xs is called the tail.

[True, False, True] :: [Bool]
[3, 2, 5+1] :: [Int]
[sin, cos] :: [Double -> Double]
[(3, 'a'), (4, 'b')] :: [(Int, Char)]

Map

A useful function is map, which, given a function, applies it to each element of a list.

map :: (a -> b) -> [a] -> [b]

Strings

The type String in Haskell is just a list of characters:

type String = [Char]

Function Composition

We used function composition to combine our functions together. The mathematical $(f \circ g)(x)$ is written (f . g) x in Haskell.

Data Types

Type Synonyms

New names for existing types.

type Point = (Float, Float)
type Vector = (Float, Float)
type Color = (Int, Int, Int, Int)

Type synonyms can be prone to errors, since for example Point and Vector are interchangeable.

Product Types

Compound types can be defined using the data keyword.

data Point = Point Float Float
           deriving (Show, Eq)

data Vector = Vector Float Float
            deriving (Show, Eq)

In the Point example there are three areas of interest:

The Haskell type system can differentiate between these types, and a type error will be thrown if Point is used instead of Vector.

Records

Identical to data types, but gives projection functions and record syntax.

data Color = Color { r :: Int
                   , g :: Int
                   , b :: Int
                   , a :: Int
                   } deriving (Show, Eq)

To access a field in a record, use the field name as a function. g (Color 255 18 120 0) gives 18.

Enumeration Types

Enumeration types are known as sum types since they have more than one constructor.

data LineStyle = Solid | Dashed | Dotted

Algebraic Data Types

Constructors for sum types can take parameters.

data Picture 
    = Path [Point] Color LineStyle
    | Circle Point Float Color LineStyle
    | Polygon [Point] Color LineStyle
    -- ...and so on

Recursive & Parametric Types

Data types can also be defined with parameters and recursively.

-- Maybe type
data Maybe a = Just a | Nothing

-- Lists
data List a = Nil | Cons a (List a)

-- Natural numbers
data Nat = Zero | Succ Nat

Type Classes

Type classes are constraints that work on multiple types. They can be thought of as a set of types for which certain operations are implemented.

Some examples of type classes are Ord, Eq and Show.

Show

-- Defining a type class
class Show a where
  show :: a -> String

-- Defining an instance of Show for the type Bool
instance Show Bool where
  show True = "True"
  show False = "False"

Read

class Read a where
  read :: String -> a

Ord

class Ord a where
  (<=) :: a -> a -> Bool

Ord instances should satisfy the properties of a partial order:

Eq

class Eq a where
  (==) :: a -> a -> Bool

Eq instances should satisfy the properties of an equivalence relation:

Semigroup

A semigroup is a pair of a set $S$ and an operation $\bullet : S \to S \to S$ where the operation $\bullet$ is associative.

There exists a type class in Haskell for Semigroup, where the associativity property must be enforced.

class Semigroup a where
  (<>) :: a -> a -> a

Some examples of Semigroup instances are $(\mathbb{N}, +)$ and $(\mathbb{B}, \land)$.

Monoid

A monoid is a semigroup $(S, \bullet)$ equiped with a special identity element $z \in S$ such that $x \bullet z = x$ and $z \bullet y = y$ for all $x, y \in S$.

In Haskell, the identity element is a function named mempty.

class (Semigroup a) => Monoid a where
  mempty :: a

Some examples of Monoid instances are $(\mathbb{B}, \land), z = \top$ and $(\mathbb{N}, +), z = 0$.

Newtypes

A newtype declaration is similar to a data declaration except that there can be only one constructor and it must take exactly one argument.

newtype Score = S Integer

instance Semigroup Score where
  S x <> S y = S (x + y)

instance Monoid Score where
  mempty = S 0

Newtypes are represented identically to their type argument, but are treated differently in the type checker, allowing for a zero cost abstraction at runtime.

Property Based Testing

Key idea: generate random input values, and test properties by running them.

QuickCheck

Haskell’s QuickCheck is the first library ever invented for property-based testing.

PBT vs Unit Testing

Test Data Generation

Data which can be generated randomly is represented by the following type class:

class Arbitrary a where
  arbitrary :: Gen a
  shrink :: a -> [a]

The shrink function is for when test cases fail. If a given input x fails, QuickCheck will try all inputs in shrink x, repeating the process until the smallest possible input is found.

Coverage

Types of code coverage include:

Lazy Evaluation

Haskell is lazily evaluated, also called call-by-need. This means that expressions are only evaluated when they are needed to compute a result for the user.

We can force the previous program to evaluate its accumulator by using a bang pattern, or the primitive operation seq:

sumTo :: Integer -> Integer -> Integer
sumTo !a 0 = a
sumTo !a n = sumTo (a + n) (n - 1)

Infinite Data Structures

Laziness lets us define data structures that extend infinitely.

-- infinite list of 1's
ones = 1 : ones

-- infinite list of natural numbers
naturals = 0 : map (1+) naturals

Data Invariants

Data invariants are properties that pertain to a particular data type.

Whenever we use operations on that data type, we want to know that our data invariants are maintained.

For a given data type X we define a wellformedness predicate wf :: X -> Bool. For a given value x :: X, wf x returns true iff our data invariants hold for the value x.

Modules

A Haskell program will usually be made up of many modules, each of which exports one or more data types.

Modules usually provide a set of functions for a data type X:

Abstract Data Types

An ADT is a data type where the implementation details of the type and its associated operations are hidden.

-- Only the Dict type and dict related functions are exported
module Dictionary 
  ( Dict
  , empty
  , insert
  , get
  ) where

type Word = String
type Definition = String

newtype Dict = [(Word, Definition)]

empty  :: Dict
get    :: Word -> Dict -> Maybe Definition
insert :: Word -> Definition -> Dict -> Dict

Data Refinement

Refinement Relations

To connect a model with a type we can define a refinement relation, that tells us whether the model and concrete type represent the same structure conceptually.

-- Model is some hashmap from word to definition
-- Concrete type is Dict
rel :: Dict -> Map -> Bool

Abstraction Functions

An abstraction function takes a concrete type and produces an abstract model.

toAbstract :: Dict -> Map

Refinement & Specifications

In general, all functional correctness specifications can be expressed as:

  1. All data invariants are maintained
  2. The implementation is a refinement of an abstract correctness model

Higher Kinds

Haskell is comprised of two languages:

This type-level language itself has a type system, in which types are given their own types called kinds.

The most basic kind is written as *.

Using ghci we can find the kind of a type by writing :k SomeType.

Functors

Functor is over types of kind * -> *.

class Functor f where
  fmap :: (a -> b) -> f a -> f b

Functor Laws

-- Identity
fmap id == id

-- Composition
fmap f . fmap g == fmap (f . g)

Applicatives

class Functor f => Applicative f where
  pure :: a -> f a
  (<*>) :: f (a -> b) -> f a -> f b

Applicative Laws

-- Identity
pure id <*> v = v

-- Homomorphism
pure f <*> pure x = pure (f x)

-- Interchange
u <*> pure y = pure ($ y) <*> u

-- Composition
pure (.) <*> u <*> v <*> w = u <*> (v <*> w)

Monads

Monads are types m where we can sequentially compose functions of the form a -> m b.

class Applicative m => Monad m where
  return :: a -> m a
  (>>=) :: m a -> (a -> m b) -> m b

Monad Laws

We can define a composition operator with (>>=):

(<=<) :: (b -> m c) -> (a -> m b) -> (a -> m c)
(f <=< g) x = g x >>= f

Then,

-- Associativity
f <=< (g <=< x) == (f <=< g) <=< x 

-- Left identity
pure <=< f == f

-- Right identity
f <=< pure == f

All Monad instances give rise to an Applicative instance, because we can define <*> in terms of >>=:

mf <*> mx = mf >>= \f -> mx >>= \x -> pure (f x)

Do Notation

do x <- y
   z
   
-- Equivalent to

y >>= \x -> do z

-- And

do x
   z
   
-- Equivalent to

x >>= \_ -> do y

Effects

Effects are observable phenomena from the execution of a program.

For example:

// Memory effects
int *p = ...
*p = *p + 1;

// IO
c = getChar();
printf("%d",32);

// Non-termination
while (1) {}

// Control flow
throw new Exception();

Internal & External Effects

An external effect is an effect that is observable outside the function. Internal effects are not observable from outside.

Purity

A function with no external effects is called a pure function. A pure function is the mathematical notion of a function. That is, a function of type a -> b is fully specified by a mapping from all elements of the domain type a to the codomain type b.

Haskell functions are technically not pure. They can loop infinitely, throw exceptions and force evaluation of unevaluated expressions. Despite the impurity of Haskell functions, we can often reason as though they are pure. Hence we call Haskell a purely functional language.

State Monad

newtype State s a = State (s -> (s, a))

get :: State s s
put :: s -> State s ()
modify :: (s -> s) -> State s ()  

IO Monad

A procedure that performs some side effects, returning a result of type a is written as IO a.

type IO a  =  RealWorld -> (a, RealWorld)

A few functions that return an IO are

Static Assurance

Methods Of Assurance

Static Dynamic Hybrid
Static Analysers Monitors Gradual Types
Proofs Watchdogs Contracts
Types Testing  
Model Checkers assert()  

Static means of assurance analyse a program without running it.

Static vs Dynamic

Types

Types are the most widely used kind of formal verification in programming today.

Phantom Types

A type parameter is phantom if it does not appear in the right hand side of the type definition:

newtype Size x = S Int

We can define a smart constructor that specialises the type parameter. This allows for type validation when using phantom types.

GADTs

Generalised Algebraic Datatypes (GADTs) is an extension to Haskell that, among other things, allows data types to be specified by writing the types of their constructors.

{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

-- Unary natural numbers, e.g. 3 is S (S (S Z)) 
data Nat = Z | S Nat

-- is the same as
data Nat :: * where
  Z :: Nat
  S :: Nat -> Nat

Lists

We could define a list type using GADT syntax:

{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

data List (a :: *) :: * where
  Nil  :: List a
  Cons :: a -> List a -> List a

head' :: List a -> a
head' (Cons x xs) = x

tail' :: List a -> List a
tail' (Cons x xs) = xs

One problem with this is functions such as head' and tail' are partial, in that they are not defined for all elements in their domain (specifically Nil).

Vectors

The problem of partial functions can be solved by tracking properties about our type and verifying these properties at compile time. An example is with vectors:

{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

data Nat = Z | S Nat

data Vec (a :: *) :: Nat -> * where
  Nil :: Vec a Z
  Cons :: a -> Vec a n -> Vec a (S n)

head' :: Vec a (S n) -> a
head' (Cons x xs) = x

tail' :: Vec a (S n) -> Vec a n
tail' (Cons x xs) = xs

In this implementation, head' and tail' are now total, and the Vec type’s properties are verified statically by the compiler.

Tradeoffs

Type Families

Type level functions, also called type families, can be defined in Haskell using the TypeFamilies language extension.

{-# LANGUAGE TypeFamilies #-}

data Nat = Z | S Nat

type family Plus (x :: Nat) (y :: Nat) :: Nat where
  Plus Z     y = y
  Plus (S x) y = S (Plus x y)

append' :: Vec a m -> Vec a n -> Vec a (Plus m n)
append' Nil ys = ys
append' (Cons x xs) ys = Cons x (append' xs ys)

In this example, a type level function Plus has been defined that operates on two type level Nats.

The type level function Plus has been used above in the type of append' to ensure that when appending one Vec with another, the resulting Vec has the same length.

Logic

Natural Deduction

We can specify a logical system as a deductive system by providing a set of rules and axioms that describe how to prove various connectives. Each connective typically has introduction and elimination rules.

Rules

Literals $\top, \bot$
\[\dfrac{}{\Gamma \vdash \top} \top\text{-I} \qquad \dfrac{\Gamma \vdash \bot}{\Gamma \vdash P} \ \text{X}\]
Conjunction $\land$
\[\dfrac{\Gamma \vdash A \qquad \Gamma \vdash B}{\Gamma \vdash A \land B} \land\text{-I} \qquad \dfrac{\Gamma \vdash A \land B}{\Gamma \vdash A} \land\text{-E}_1 \qquad \dfrac{\Gamma \vdash A \land B}{\Gamma \vdash B} \land\text{-E}_2\]
Disjunction $\lor$
\[\dfrac{\Gamma \vdash A}{\Gamma \vdash A \lor B} \lor\text{-I}_1 \qquad \dfrac{\Gamma \vdash B}{\Gamma \vdash A \lor B} \lor\text{-I}_2 \qquad \dfrac{\Gamma \vdash A \lor B \qquad A, \Gamma \vdash P \qquad B, \Gamma \vdash P}{\Gamma \vdash P} \lor\text{-E}\]
Implication $\to$
\[\dfrac{A, \Gamma \vdash B}{\Gamma \vdash A \to B} \to\text{-I} \qquad \dfrac{\Gamma \vdash A \to B \qquad \Gamma \vdash A}{\Gamma \vdash B} \to\text{-E} \qquad\]

Typed Lambda Calculus

The simply typed lambda calculus is a programming language consisting only of lambdas ($\lambda x . \ e$) to define functions or bind variables and a small set of build in types.

Typing Rules

\[\dfrac{x :: A, \Gamma \vdash e :: B}{\Gamma \vdash \lambda x . \ e :: A \to B} \qquad \dfrac{\Gamma \vdash e_1 :: A \to B \qquad \Gamma \vdash e_2 :: A}{\Gamma \vdash e_1 \ e_2 :: B}\]

Types

Unit Types $()$

\[\dfrac{}{\Gamma \vdash () :: ()}\]

Product Types $(A, B)$

\[\dfrac{\Gamma \vdash e_1 :: A \qquad \Gamma \vdash e_2 :: B}{\Gamma \vdash (e_1, e_2) :: (A, B)} \qquad \dfrac{\Gamma \vdash e :: (A, B)}{\Gamma \vdash \mathtt{fst} \ e :: A} \qquad \dfrac{\Gamma \vdash e :: (A, B)}{\Gamma \vdash \mathtt{snd} \ e :: B}\]

Sum Types $\mathtt{Either} \ A \ B$

\[\dfrac{\Gamma \vdash e :: A}{\Gamma \vdash \mathtt{Left} \ e :: \mathtt{Either} \ A \ B} \qquad \dfrac{\Gamma \vdash e :: B}{\Gamma \vdash \mathtt{Right} \ e :: \mathtt{Either} \ A \ B}\] \[\dfrac{\Gamma \vdash e :: \mathtt{Either} \ A \ B \qquad x :: A, \Gamma \vdash e_1 :: P \qquad y :: B, \Gamma \vdash e_2 :: P}{\Gamma \vdash (\mathbf{case} \ e \ \mathbf{of} \ \mathtt{Left} \ x \to e_1; \mathtt{Right} \ y \to e_2) :: P}\]

Empty Type $\mathtt{Void}$

The empty type $\mathtt{Void}$ has no inhabitants, and so there is no way to construct it.

\[\dfrac{\Gamma \vdash e :: \mathtt{Void}}{\Gamma \vdash \mathbf{absurd} \ e :: P}\]

Curry-Howard Correspondence

Programming Logic
Types Propositions
Programs Proofs
Evaluation Proof Simplification
Constructive Logic Typed $\lambda$-Calculus
Classical Logic Continuations
Modal Logic Monads
Linear Logic Linear Types, Session Types
Separation Logic Region Types

Algebraic Type Isomorphism

Semiring Structure

These types we have defined form an algebraic structure called a commutative semiring.

Laws

Associativity
\[\begin{array}{rcl} \mathtt{Either} \ (\mathtt{Either} \ A \ B) \ C & \simeq & \mathtt{Either} \ A \ (\mathtt{Either} \ B \ C) \\ ((A, B), C) & \simeq & (A, (B, C)) \\ \end{array}\]
Identity
\[\begin{array}{rcl} \mathtt{Either \ Void} \ A & \simeq & A \\ ((), A) & \simeq & A \end{array}\]
Commutativity
\[\begin{array}{rcl} \mathtt{Either} \ A \ B & \simeq & \mathtt{Either} \ B \ A \\ (A, B) & \simeq & (B, A) \end{array}\]
Distributivity
\[\begin{array}{rcl} (A, \mathtt{Either} \ B \ C) & \simeq & \mathtt{Either} \ (A, B) \ (A, C) \end{array}\]
Absorption
\[\begin{array}{rcl} (\mathtt{Void}, A) & \simeq & \mathtt{Void} \end{array}\]

Isomorphism

Two types $A$ and $B$ are isomorphic, written $A \simeq B$, if there exists a bijection between them. This means that for each value in $A$ we can find a unique value in $B$ and vice versa.

Polymorphism & Parametricity

Type Quantifiers

The quantification over type variables is called parametric polymorphism.

The type quantifier $\forall$ corresponds to a universal quantifier $\forall$ but it is not the same as the $\forall$ from first-order logic. First-order logic quantifiers range over a set of individuals, whereas these quantifiers range over propositions (types) themselves. It is analogous to second-order logic.

Generality

A type $A$ is more general than a type $B$, often written $A \sqsubseteq B$, if type variables in $A$ can be instantiated to give the type $B$.

For example, with functions:

\[\forall a . \ a \quad \sqsubseteq \quad \forall x \ y . \ x \to y \quad \sqsubseteq \quad \forall z . \ z \to z \quad \sqsubseteq \quad \mathtt{Int} \to \mathtt{Int}\]

Parametricity

The principle of parametricity states that the result of polymorphic functions cannot depend on values of an abstracted type.

More formally, suppose there exists a polymorphic function $g$ that is polymorphic on type $a$. If run any arbitrary function $f :: a \to a$ on all the $a$ values in the input of $g$, that will give the same results as running $g$ first, then $f$ on all the a values of the output.