In my last post, Type Safety in Five Easy Lemmas, I claimed that the formulation of a language's operational semantics makes a big difference in how many lemmas the proof of type safety requires and how tedious they are. While I showed that the particular semantics I used led to a simple proof, with just five easy lemmas, I didn't compare it to the alternatives. Perhaps there's an even better alternative! In this post I discuss the alternatives that I know about and why they lead to more lemmas and more tedium. The post is organized by design decision.

### Why A-normal form?

The syntax of the little language is in A-normal form, that is, it's flattened
out so that expressions don't have sub-expressions, but instead, more complex
computations have to be built up from several variable assignments.
It's much more common to see type-safety proofs on languages that are not
in A-normal form, that is, languages with arbitrary nesting of expressions.
This nesting of expressions can be handled in one of two ways in the
operational semantics: either with extra so-called congruence reduction rules
or by using evaluation contexts. The congruence rule approach adds many more
cases to the main safety lemma (or equivalently, to the progress and
preservation lemmas). The
evaluation contexts approach requires an extra lemma often called
the *unique decomposition* lemma.
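To make the contrast concrete, here is a minimal sketch of flattening a nested expression into A-normal form (Python, with a hypothetical tuple-based AST rather than the post's actual language): every sub-expression is pulled out into a temporary variable assignment, so each operation acts only on atoms.

```python
# Hypothetical ANF flattening sketch. Expressions are nested tuples:
# ('add', e1, e2), ('call', f, e), or atoms (variable names and literals).
import itertools

_fresh = itertools.count()  # supply of temporary variable names

def to_anf(expr, bindings):
    """Flatten `expr` into an atom, appending assignments to `bindings`."""
    if isinstance(expr, (str, int)):     # variables and literals are atoms
        return expr
    op, *args = expr
    atoms = [to_anf(a, bindings) for a in args]  # flatten sub-expressions
    tmp = f"t{next(_fresh)}"
    bindings.append((tmp, (op, *atoms)))         # one assignment per operation
    return tmp

bindings = []
result = to_anf(('call', 'f', ('add', ('call', 'g', 'x'), 1)), bindings)
# `bindings` now holds the flattened sequence of assignments, roughly:
#   t0 = call g x
#   t1 = add t0 1
#   t2 = call f t1
# and `result` is the final temporary.
```

After this pass, the operational semantics never has to reduce inside a sub-expression, which is exactly why the congruence rules and evaluation contexts become unnecessary.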

You might be thinking, real languages aren't in A-normal form, so one would really need to compile to A-normal form and prove that the compilation is type preserving (as we did in the Work Horse post). This is true, but I'd argue that it's nice to have a separation of concerns and take care of intra-expression control flow in one function and lemma instead of dealing with it throughout the operational semantics and proof of type safety.

### Why environments instead of substitution?

The operational semantics passed around environments that associated
values with variables, using a lookup function to find the value
bound to a variable.
The more common alternative is to perform substitution, that is,
during a function call, to go through the body of the function
replacing occurrences of the parameters with their arguments.
I'll probably get flamed for saying this, but substitution has
a lot of disadvantages compared to environments. First, the
substitution function is difficult to define properly.
Second, it requires that you prove that substitution is
type preserving, which is rather tedious because it touches
every kind of expression (or statement) in the language.
In comparison, the lemma that we proved about the lookup function
didn't mention any expressions.
Third, substitution is really slow. Technically, that doesn't
affect the complexity of the proof, but it does affect your
ability to *test* your theorems. As I mentioned in the
Work Horse post, it's a good idea to implement
an interpreter and test whether your theorems hold on lots of example
programs before trying to prove your theorems. If your interpreter
uses substitution, it will be very slow. You might instead use
environments in the interpreter while keeping substitution in your
formal semantics, but then your interpreter and your formal semantics
differ, so properties of one may not imply properties of the other.

### Why functions with names (for recursion) instead of fix?

The only way that I know of to implement fix, the fixpoint combinator, is with substitution, and we don't want substitution.

Also, I should highlight how we handle recursion in the operational
semantics. First, creating a closure doesn't do much: it just
captures the current environment. An alternative
used by Milner and Tofte is to extend the environment with
a binding for the function itself. This alternative makes the proof
of type safety more challenging because, instead of using induction,
you have to use coinduction. (See their paper *Co-induction
in relational semantics*.) While coinduction is a beautiful
concept, it's nice to be able to stick to good old induction.
Now, because closure creation doesn't do much, we need to make up
for it in the transition relation.
Notice that the transition for function call extends the environment
with the function itself. I forget where I learned this trick.
If you know of any references for this, please let me know!

### Conclusion

I think that's it for the important design decisions, but I may be forgetting something. If you have any questions regarding the rationale for anything else, please post a comment! So to sum up, if you want your type safety proof to be less tedious, I recommend using A-normal form and formulating your semantics as an environment-passing abstract machine. Oh, and if you want recursive functions, then extend your lambdas with a name and be lazy about extending the environment: wait until the function call to add the function itself to the environment.

+1. I just wrote a big-step semantics using environments, for much the same reasons as you. I am not sure why that's unusual: I'm also using CBV and de Bruijn indices, and I ended up needing _NO_ lemmas about binding. Are there known objections to using environments in this fashion?

To be sure I got it right, I related it to substitution-based semantics, but only on paper. I also had another reason for this check: I was using step-indexed logical relations, and to make sure everything works it helps to ensure that step-counts match with small-step operational semantics. I learned that trick from "Imperative self-adjusting computation", POPL 2008 (for this insight, you can skip everything about stores).

I'm aware of the objections to big-step semantics, but luckily I was dealing with the simply-typed lambda calculus and had already learned logical relations, so I could "just" prove the semantics is complete via logical relations :-).