Written by Luka Kerr on August 3, 2019
Haskell
-- f = function name
-- :: = "Of type"
-- Int = Domain
-- Bool = Codomain
-- x = Input
-- (x > 0) = Output
f :: Int -> Bool
f x = (x > 0)
Currying & Partial Application
The evaluation of a sequence of functions, each with a single argument.
-- Function that takes an `a` and returns another
-- function taking an `a` and returning an `a`
f :: a -> (a -> a)
-- Practical example, usage: logBase 2 2.754
logBase :: Double -> (Double -> Double)
Function application associates to the left in Haskell, so:
\[\text{logBase} \ 2 \ 64 \equiv \text{(logBase 2)} \ 64\]Tuples
A way to take multiple inputs or produce multiple outputs.
toCartesian :: (Double, Double) -> (Double, Double)
toCartesian (r, theta) = (x, y)
where x = r * cos theta
y = r * sin theta
Higher Order Functions
Functions that take other functions as arguments.
-- Function that takes a function and returns that function applied twice
twice :: (a -> a) -> (a -> a)
twice f a = f (f a)
Lists
Lists are singly-linked lists in Haskell. The empty list is written as []
and a list node is
written as x : xs
. The value x
is called the head and the rest of the list xs is
called the tail.
[True, False, True] :: [Bool]
[3, 2, 5+1] :: [Int]
[sin, cos] :: [Double -> Double]
[(3, 'a'), (4, 'b')] :: [(Int, Char)]
Map
A useful function is map
, which, given a function, applies it to each element of a list.
map :: (a -> b) -> [a] -> [b]
Strings
The type String in Haskell is just a list of characters:
type String = [Char]
Function Composition
We used function composition to combine our functions together.
The mathematical $(f \circ g)(x)$ is written (f . g) x
in Haskell.
Data Types
Type Synonyms
New names for existing types.
type Point = (Float, Float)
type Vector = (Float, Float)
type Color = (Int, Int, Int, Int)
Type synonyms can be prone to errors, since for example Point
and Vector
are interchangeable.
Product Types
Compound types can be defined using the data
keyword.
data Point = Point Float Float
deriving (Show, Eq)
data Vector = Vector Float Float
deriving (Show, Eq)
In the Point
example there are three areas of interest:
data Point
, wherePoint
is the type name= Point
, wherePoint
is the constructor nameFloat Float
, where both types are the constructor argument types
The Haskell type system can differentiate between these types, and a type error will be thrown
if Point
is used instead of Vector
.
Records
Identical to data
types, but gives projection functions and record syntax.
data Color = Color { r :: Int
, g :: Int
, b :: Int
, a :: Int
} deriving (Show, Eq)
To access a field in a record, use the field name as a function.
g (Color 255 18 120 0)
gives 18
.
Enumeration Types
Enumeration types are known as sum types since they have more than one constructor.
data LineStyle = Solid | Dashed | Dotted
Algebraic Data Types
Constructors for sum types can take parameters.
data Picture
= Path [Point] Color LineStyle
| Circle Point Float Color LineStyle
| Polygon [Point] Color LineStyle
-- ...and so on
Recursive & Parametric Types
Data types can also be defined with parameters and recursively.
-- Maybe type
data Maybe a = Just a | Nothing
-- Lists
data List a = Nil | Cons a (List a)
-- Natural numbers
data Nat = Zero | Succ Nat
Type Classes
Type classes are constraints that work on multiple types. They can be thought of as a set of types for which certain operations are implemented.
Some examples of type classes are Ord
, Eq
and Show
.
Show
-- Defining a type class
class Show a where
show :: a -> String
-- Defining an instance of Show for the type Bool
instance Show Bool where
show True = "True"
show False = "False"
Read
class Read a where
read :: String -> a
Ord
class Ord a where
(<=) :: a -> a -> Bool
Ord
instances should satisfy the properties of a partial order:
- Reflexivity:
x <= x
- Transitivity: If
x <= y
andy <= z
thenx <= z
- Antisymmetry: If
x <= y
andy <= x
thenx == y
- Totality: Either
x <= y
ory <= z
Eq
class Eq a where
(==) :: a -> a -> Bool
Eq
instances should satisfy the properties of an equivalence relation:
- Reflexivity:
x == x
- Transitivity: If
x == y
andy == z
thenx == z
- Symmetry: If
x == y
theny == x
Semigroup
A semigroup is a pair of a set $S$ and an operation $\bullet : S \to S \to S$ where the operation $\bullet$ is associative.
There exists a type class in Haskell for Semigroup
, where the associativity property must
be enforced.
class Semigroup a where
(<>) :: a -> a -> a
Some examples of Semigroup
instances are $(\mathbb{N}, +)$ and $(\mathbb{B}, \land)$.
Monoid
A monoid is a semigroup $(S, \bullet)$ equiped with a special identity element $z \in S$ such that $x \bullet z = x$ and $z \bullet y = y$ for all $x, y \in S$.
In Haskell, the identity element is a function named mempty
.
class (Semigroup a) => Monoid a where
mempty :: a
Some examples of Monoid
instances are $(\mathbb{B}, \land), z = \top$ and
$(\mathbb{N}, +), z = 0$.
Newtypes
A newtype
declaration is similar to a data
declaration except that there can be only one
constructor and it must take exactly one argument.
newtype Score = S Integer
instance Semigroup Score where
S x <> S y = S (x + y)
instance Monoid Score where
mempty = S 0
Newtypes are represented identically to their type argument, but are treated differently in the type checker, allowing for a zero cost abstraction at runtime.
Property Based Testing
Key idea: generate random input values, and test properties by running them.
QuickCheck
Haskell’s QuickCheck is the first library ever invented for property-based testing.
PBT vs Unit Testing
- Properties are more compact and describe more cases than unit tests
- PBT relies on randomly generated test data which may not cover edge cases
- By increasing the number of random inputs, code coverage is improved in PBT
Test Data Generation
Data which can be generated randomly is represented by the following type class:
class Arbitrary a where
arbitrary :: Gen a
shrink :: a -> [a]
The shrink
function is for when test cases fail. If a given input x
fails, QuickCheck
will try all inputs in shrink x
, repeating the process until the smallest possible input is
found.
Coverage
Types of code coverage include:
- Branch/Decision Coverage: are all conditional branches executed?
- Function Coverage: are all functions executed?
- Entry/Exit Coverage: are all function calls executed?
- Path Coverage: are all behaviours executed?
- Statement/Expression Coverage: are all expressions executed?
Lazy Evaluation
Haskell is lazily evaluated, also called call-by-need. This means that expressions are only evaluated when they are needed to compute a result for the user.
We can force the previous program to evaluate its accumulator by using a bang pattern, or the
primitive operation seq
:
sumTo :: Integer -> Integer -> Integer
sumTo !a 0 = a
sumTo !a n = sumTo (a + n) (n - 1)
Infinite Data Structures
Laziness lets us define data structures that extend infinitely.
-- infinite list of 1's
ones = 1 : ones
-- infinite list of natural numbers
naturals = 0 : map (1+) naturals
Data Invariants
Data invariants are properties that pertain to a particular data type.
Whenever we use operations on that data type, we want to know that our data invariants are maintained.
For a given data type X
we define a wellformedness predicate wf :: X -> Bool
. For a given value x :: X
, wf x
returns true iff our data invariants hold for the value x
.
Modules
A Haskell program will usually be made up of many modules, each of which exports one or more data types.
Modules usually provide a set of functions for a data type X
:
- Construction:
c :: ... -> X
- Querying:
q :: X -> ...
- Updating:
u :: ... X -> X
Abstract Data Types
An ADT is a data type where the implementation details of the type and its associated operations are hidden.
-- Only the Dict type and dict related functions are exported
module Dictionary
( Dict
, empty
, insert
, get
) where
type Word = String
type Definition = String
newtype Dict = [(Word, Definition)]
empty :: Dict
get :: Word -> Dict -> Maybe Definition
insert :: Word -> Definition -> Dict -> Dict
Data Refinement
Refinement Relations
To connect a model with a type we can define a refinement relation, that tells us whether the model and concrete type represent the same structure conceptually.
-- Model is some hashmap from word to definition
-- Concrete type is Dict
rel :: Dict -> Map -> Bool
Abstraction Functions
An abstraction function takes a concrete type and produces an abstract model.
toAbstract :: Dict -> Map
Refinement & Specifications
In general, all functional correctness specifications can be expressed as:
- All data invariants are maintained
- The implementation is a refinement of an abstract correctness model
Higher Kinds
Haskell is comprised of two languages:
- The value-level language, consisting of expressions such as
if
,let
,where
etc - The type-level language, consisting of types
Int
,Bool
,(->)
,Maybe
etc
This type-level language itself has a type system, in which types are given their own types called kinds.
The most basic kind is written as *
.
- Types such as
Int
andBool
have kind*
- Types such as
Maybe
have kind* -> *
- Types such as
Either
have kind* -> * -> *
Using ghci
we can find the kind of a type by writing :k SomeType
.
Functors
Functor
is over types of kind * -> *
.
class Functor f where
fmap :: (a -> b) -> f a -> f b
Functor Laws
-- Identity
fmap id == id
-- Composition
fmap f . fmap g == fmap (f . g)
Applicatives
class Functor f => Applicative f where
pure :: a -> f a
(<*>) :: f (a -> b) -> f a -> f b
Applicative Laws
-- Identity
pure id <*> v = v
-- Homomorphism
pure f <*> pure x = pure (f x)
-- Interchange
u <*> pure y = pure ($ y) <*> u
-- Composition
pure (.) <*> u <*> v <*> w = u <*> (v <*> w)
Monads
Monads are types m
where we can sequentially compose functions of the form a -> m b
.
class Applicative m => Monad m where
return :: a -> m a
(>>=) :: m a -> (a -> m b) -> m b
Monad Laws
We can define a composition operator with (>>=)
:
(<=<) :: (b -> m c) -> (a -> m b) -> (a -> m c)
(f <=< g) x = g x >>= f
Then,
-- Associativity
f <=< (g <=< x) == (f <=< g) <=< x
-- Left identity
pure <=< f == f
-- Right identity
f <=< pure == f
All Monad instances give rise to an Applicative instance, because we can define <*>
in terms of >>=
:
mf <*> mx = mf >>= \f -> mx >>= \x -> pure (f x)
Do Notation
do x <- y
z
-- Equivalent to
y >>= \x -> do z
-- And
do x
z
-- Equivalent to
x >>= \_ -> do y
Effects
Effects are observable phenomena from the execution of a program.
For example:
// Memory effects
int *p = ...
*p = *p + 1;
// IO
c = getChar();
printf("%d",32);
// Non-termination
while (1) {}
// Control flow
throw new Exception();
Internal & External Effects
An external effect is an effect that is observable outside the function. Internal effects are not observable from outside.
Purity
A function with no external effects is called a pure function. A pure function is the mathematical notion of a function. That is, a function of type a -> b
is fully specified by a mapping from all elements of the domain type a
to the codomain type b
.
Haskell functions are technically not pure. They can loop infinitely, throw exceptions and force evaluation of unevaluated expressions. Despite the impurity of Haskell functions, we can often reason as though they are pure. Hence we call Haskell a purely functional language.
State Monad
newtype State s a = State (s -> (s, a))
get :: State s s
put :: s -> State s ()
modify :: (s -> s) -> State s ()
IO Monad
A procedure that performs some side effects, returning a result of type a
is written as IO a
.
type IO a = RealWorld -> (a, RealWorld)
A few functions that return an IO
are
getChar :: IO Char
readLine :: IO String
putStrLn :: String -> IO ()
Static Assurance
Methods Of Assurance
Static | Dynamic | Hybrid |
---|---|---|
Static Analysers | Monitors | Gradual Types |
Proofs | Watchdogs | Contracts |
Types | Testing | |
Model Checkers | assert() |
Static means of assurance analyse a program without running it.
Static vs Dynamic
- Static checks can be exhaustive, i.e. they are able to analyse all possible executions of a program
- However, some properties cannot be checked statically in general (halting problem), or are intractable to feasibly check statically (state space explosion)
- Dynamic checks cannot be exhaustive, but can be used to check some properties where static methods are unsuitable
Types
Types are the most widely used kind of formal verification in programming today.
- They are checked automatically by the compiler
- They can be extended to encompass properties and proof systems with very high expressivity
- They are an exhaustive analysis
Phantom Types
A type parameter is phantom if it does not appear in the right hand side of the type definition:
newtype Size x = S Int
We can define a smart constructor that specialises the type parameter. This allows for type validation when using phantom types.
GADTs
Generalised Algebraic Datatypes (GADTs) is an extension to Haskell that, among other things, allows data types to be specified by writing the types of their constructors.
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
-- Unary natural numbers, e.g. 3 is S (S (S Z))
data Nat = Z | S Nat
-- is the same as
data Nat :: * where
Z :: Nat
S :: Nat -> Nat
Lists
We could define a list type using GADT syntax:
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
data List (a :: *) :: * where
Nil :: List a
Cons :: a -> List a -> List a
head' :: List a -> a
head' (Cons x xs) = x
tail' :: List a -> List a
tail' (Cons x xs) = xs
One problem with this is functions such as head'
and tail'
are partial, in that they are not defined for all elements in their domain (specifically Nil
).
Vectors
The problem of partial functions can be solved by tracking properties about our type and verifying these properties at compile time. An example is with vectors:
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
data Nat = Z | S Nat
data Vec (a :: *) :: Nat -> * where
Nil :: Vec a Z
Cons :: a -> Vec a n -> Vec a (S n)
head' :: Vec a (S n) -> a
head' (Cons x xs) = x
tail' :: Vec a (S n) -> Vec a n
tail' (Cons x xs) = xs
In this implementation, head'
and tail'
are now total, and the Vec
type’s properties are verified statically by the compiler.
Tradeoffs
- It can be difficult to convince the Haskell type checker that your code is correct
- Type-level encodings can make types more verbose and programs harder to understand
- Sometimes excessively detailed types can make type-checking very slow
Type Families
Type level functions, also called type families, can be defined in Haskell using the TypeFamilies
language extension.
{-# LANGUAGE TypeFamilies #-}
data Nat = Z | S Nat
type family Plus (x :: Nat) (y :: Nat) :: Nat where
Plus Z y = y
Plus (S x) y = S (Plus x y)
append' :: Vec a m -> Vec a n -> Vec a (Plus m n)
append' Nil ys = ys
append' (Cons x xs) ys = Cons x (append' xs ys)
In this example, a type level function Plus
has been defined that operates on two type level Nat
s.
The type level function Plus
has been used above in the type of append'
to ensure that when appending one Vec
with another, the resulting Vec
has the same length.
Logic
Natural Deduction
We can specify a logical system as a deductive system by providing a set of rules and axioms that describe how to prove various connectives. Each connective typically has introduction and elimination rules.
Rules
Literals $\top, \bot$
\[\dfrac{}{\Gamma \vdash \top} \top\text{-I} \qquad \dfrac{\Gamma \vdash \bot}{\Gamma \vdash P} \ \text{X}\]Conjunction $\land$
\[\dfrac{\Gamma \vdash A \qquad \Gamma \vdash B}{\Gamma \vdash A \land B} \land\text{-I} \qquad \dfrac{\Gamma \vdash A \land B}{\Gamma \vdash A} \land\text{-E}_1 \qquad \dfrac{\Gamma \vdash A \land B}{\Gamma \vdash B} \land\text{-E}_2\]Disjunction $\lor$
\[\dfrac{\Gamma \vdash A}{\Gamma \vdash A \lor B} \lor\text{-I}_1 \qquad \dfrac{\Gamma \vdash B}{\Gamma \vdash A \lor B} \lor\text{-I}_2 \qquad \dfrac{\Gamma \vdash A \lor B \qquad A, \Gamma \vdash P \qquad B, \Gamma \vdash P}{\Gamma \vdash P} \lor\text{-E}\]Implication $\to$
\[\dfrac{A, \Gamma \vdash B}{\Gamma \vdash A \to B} \to\text{-I} \qquad \dfrac{\Gamma \vdash A \to B \qquad \Gamma \vdash A}{\Gamma \vdash B} \to\text{-E} \qquad\]Typed Lambda Calculus
The simply typed lambda calculus is a programming language consisting only of lambdas ($\lambda x . \ e$) to define functions or bind variables and a small set of build in types.
Typing Rules
\[\dfrac{x :: A, \Gamma \vdash e :: B}{\Gamma \vdash \lambda x . \ e :: A \to B} \qquad \dfrac{\Gamma \vdash e_1 :: A \to B \qquad \Gamma \vdash e_2 :: A}{\Gamma \vdash e_1 \ e_2 :: B}\]Types
Unit Types $()$
\[\dfrac{}{\Gamma \vdash () :: ()}\]Product Types $(A, B)$
\[\dfrac{\Gamma \vdash e_1 :: A \qquad \Gamma \vdash e_2 :: B}{\Gamma \vdash (e_1, e_2) :: (A, B)} \qquad \dfrac{\Gamma \vdash e :: (A, B)}{\Gamma \vdash \mathtt{fst} \ e :: A} \qquad \dfrac{\Gamma \vdash e :: (A, B)}{\Gamma \vdash \mathtt{snd} \ e :: B}\]Sum Types $\mathtt{Either} \ A \ B$
\[\dfrac{\Gamma \vdash e :: A}{\Gamma \vdash \mathtt{Left} \ e :: \mathtt{Either} \ A \ B} \qquad \dfrac{\Gamma \vdash e :: B}{\Gamma \vdash \mathtt{Right} \ e :: \mathtt{Either} \ A \ B}\] \[\dfrac{\Gamma \vdash e :: \mathtt{Either} \ A \ B \qquad x :: A, \Gamma \vdash e_1 :: P \qquad y :: B, \Gamma \vdash e_2 :: P}{\Gamma \vdash (\mathbf{case} \ e \ \mathbf{of} \ \mathtt{Left} \ x \to e_1; \mathtt{Right} \ y \to e_2) :: P}\]Empty Type $\mathtt{Void}$
The empty type $\mathtt{Void}$ has no inhabitants, and so there is no way to construct it.
\[\dfrac{\Gamma \vdash e :: \mathtt{Void}}{\Gamma \vdash \mathbf{absurd} \ e :: P}\]Curry-Howard Correspondence
Programming | Logic |
---|---|
Types | Propositions |
Programs | Proofs |
Evaluation | Proof Simplification |
Constructive Logic | Typed $\lambda$-Calculus |
Classical Logic | Continuations |
Modal Logic | Monads |
Linear Logic | Linear Types, Session Types |
Separation Logic | Region Types |
Algebraic Type Isomorphism
Semiring Structure
These types we have defined form an algebraic structure called a commutative semiring.
Laws
Associativity
\[\begin{array}{rcl} \mathtt{Either} \ (\mathtt{Either} \ A \ B) \ C & \simeq & \mathtt{Either} \ A \ (\mathtt{Either} \ B \ C) \\ ((A, B), C) & \simeq & (A, (B, C)) \\ \end{array}\]Identity
\[\begin{array}{rcl} \mathtt{Either \ Void} \ A & \simeq & A \\ ((), A) & \simeq & A \end{array}\]Commutativity
\[\begin{array}{rcl} \mathtt{Either} \ A \ B & \simeq & \mathtt{Either} \ B \ A \\ (A, B) & \simeq & (B, A) \end{array}\]Distributivity
\[\begin{array}{rcl} (A, \mathtt{Either} \ B \ C) & \simeq & \mathtt{Either} \ (A, B) \ (A, C) \end{array}\]Absorption
\[\begin{array}{rcl} (\mathtt{Void}, A) & \simeq & \mathtt{Void} \end{array}\]Isomorphism
Two types $A$ and $B$ are isomorphic, written $A \simeq B$, if there exists a bijection between them. This means that for each value in $A$ we can find a unique value in $B$ and vice versa.
Polymorphism & Parametricity
Type Quantifiers
The quantification over type variables is called parametric polymorphism.
The type quantifier $\forall$ corresponds to a universal quantifier $\forall$ but it is not the same as the $\forall$ from first-order logic. First-order logic quantifiers range over a set of individuals, whereas these quantifiers range over propositions (types) themselves. It is analogous to second-order logic.
Generality
A type $A$ is more general than a type $B$, often written $A \sqsubseteq B$, if type variables in $A$ can be instantiated to give the type $B$.
For example, with functions:
\[\forall a . \ a \quad \sqsubseteq \quad \forall x \ y . \ x \to y \quad \sqsubseteq \quad \forall z . \ z \to z \quad \sqsubseteq \quad \mathtt{Int} \to \mathtt{Int}\]Parametricity
The principle of parametricity states that the result of polymorphic functions cannot depend on values of an abstracted type.
More formally, suppose there exists a polymorphic function $g$ that is polymorphic on type $a$. If run any arbitrary function $f :: a \to a$ on all the $a$ values in the input of $g$, that will give the same results as running $g$ first, then $f$ on all the a values of the output.