Certified Compilation in Agda
Using dependent types, it is possible to specify contracts for Agda programs at a very fundamental level. Today we’ll see how to transform an extremely simple tree-based calculator language to a stack-based one; and use Agda to assert the correctness of our transformation. The complete source code for this post is at 1.
The tree-based language is defined by the following data-type, with the obvious semantics:
Note how dependent types are helpful already – without them we would
not be able to elegantly assert 2, for instance, that lt
takes
two Exp ℕ
s and maps them to an Exp Bool
. The semantics are defined
by a function with the stereotypical name eval
of type Exp A → A
.
Nothing unusual here, expect perhaps the implementation of eval (a eq
b)
and eval (a lt b)
. Boolean values aren’t especially useful in
dependently-typed programming languages since they don’t carry a proof
of why a proposition is true. Operations like ≟
and ≤
therefore
evaluate to a Decidable
instance; which, apart from the result of the
comparision (the yes
and no
we’re pattern matching on here), also
contain a proof term of why the proposition is true or false. Since we
don’t need the proofs for this rather simple use-case, we merrily
ignore the same by underscores.
The stack-machine we’ll compile to is a bit more interesting – here too we use dependent types to guarantee consistency as much as possible. First of all, we use a stack that tracks the types of all its elements separately:
This signature ensures that, for instance, the first three elements of
a stack of type TypedStack (ℕ ∷ ℕ ∷Bool ∷ S)
(for some S
) will
have types ℕ
, ℕ
and Bool
.
The stack machine has six bytecodes, bcAdd
, bcSub
, bcEq
, bcLt
,
bcCond
and push
(one of the constructors for TypedStack
).
The bytecode types specify how the bytecodes will transform the stack. A program in this context is essentially a composition of some of the above bytecodes
and a term of type StackProgram X
is a program that leaves behind a
value of type X
when it’s done executing. So, for instance
is well typed but
isn’t.
Now we come to the compiler that transforms programs in the tree form to equivalent programs in the stack-machine form. The compiler itself is simple (some cases have been left out):
As a proof that the compiler actually preserves program semantics, we
look for a term, verifyCompiler
of the type {A : Set} → (exp : Exp
A) → IsCorrectFor exp
where IsCorrectFor exp = (S : List Set) →
(compile exp {S} ≡ push{_}{S} (eval exp))
. In plain English, an
expression exp is correctly compiled, denoted as IsCorrectFor exp
,
only if compiling and executing the stack-machine version is the same
as evaluating the tree version and pushing the evaluated value onto
the stack. The compiler is correct only if it correctly compiles all
expressions. If you’ve been following along, give writing
verifyCompiler yourself a shot!
Agda finds the simplest case for verifyCompiler obvious
To prove that we correctly compile a minus b
, we need to show that
compile (a minus b)
is the same function (compile
returns a stack
program, which is essentially a function from TypedStack
to
TypedStack
) as push (eval (a minus b))
. To do this, we first show
that since compile b
is the same as push (eval b)
, compile (a
minus b)
is bcSub ∘ compile a ∘ compile b
is bcSub ∘ compile a ∘
(push (eval b))
. This is shown in sub-prf₀
. Similarly, in
sub-prf₁
we show that since compile a
is the same as push (eval
a)
, bcSub ∘ compile a ∘ (push (eval b))
is the same as bcSub ∘
(push (eval a)) ∘ (push (eval b))
. These two combined, by
transitivity, gives us our proof – Agda is clever enough to see that
bcSub ∘ (push (eval a)) ∘ (push (eval b))
reduces to push ((eval a)
∸ (eval b))
.
We slightly deviate from this approach when proving correctness when
compiling for a lt b
and a eq b
– I had to case on the two
possibilities when comparing a
and b
.
Certified compilation is a rather 3 interesting 4 topic – something I’d like to explore further. A compiler for something a little more substantial than basic arithmetic should be a nice sequel.