Friday, March 24, 2017

Consolidation of the Denotational Semantics and an Application to Compiler Correctness

This is a two part post. The second part depends on the first.

Part 1. Consolidation of the Denotational Semantics

As a matter of expediency, I've been working with two different versions of the intersection type system upon which the denotational semantics is based, one version with subsumption and one without. I had used the one with subsumption to prove completeness with respect to the reduction semantics whereas I had used the one without subsumption to prove soundness (for both whole programs and parts of programs, that is, contextual equivalence). The two versions of the intersection type system are equivalent. However, it would be nice to simplify the story and just have one version. Also, while the correspondence to intersection types has been enormously helpful in working out the theory, it would be nice to have a presentation of the semantics that doesn't talk about them and instead talks about functions as tables.

Towards these goals, I went back to the proof of completeness with respect to the reduction semantics and swapped in the "take 3" semantics. While working on that I realized that the subsumption rule was almost admissible in the "take 3" semantics, just the variable and application equations needed more uses of \(\sqsubseteq\). With those changes in place, the proof of completeness went through without a hitch. So here's the updated definition of the denotational semantics of the untyped lambda calculus.

The definition of values remains the same as last time: \[ \begin{array}{lrcl} \text{function tables} & T & ::= & \{ v_1\mapsto v'_1,\ldots,v_n\mapsto v'_n \} \\ \text{values} & v & ::= & n \mid T \end{array} \] as does the \(\sqsubseteq\) operator. \begin{gather*} \frac{}{n \sqsubseteq n} \qquad \frac{T_1 \subseteq T_2}{T_1 \sqsubseteq T_2} \end{gather*} For the denotation function \(E\), we add uses of \(\sqsubseteq\) to the equations for variables (\(v \sqsubseteq \rho(x)\)) and function application (\(v_3 \sqsubseteq v_3'\)). (I've also added the conditional expression \(\mathbf{if}\,e_1\,e_2\,e_3\) and primitive operations on numbers \(f(e_1,e_2)\), where \(f\) ranges over binary functions on numbers.) \begin{align*} E[\!| n |\!](\rho) &= \{ n \} \\ E[\!| x |\!](\rho) &= \{ v \mid v \sqsubseteq \rho(x) \} \\ E[\!| \lambda x.\, e |\!](\rho) &= \left\{ T \middle| \begin{array}{l} \forall v_1 v_2'. \, v_1\mapsto v_2' \in T \Rightarrow\\ \exists v_2.\, v_2 \in E[\!| e |\!](\rho(x{:=}v_1)) \land v_2' \sqsubseteq v_2 \end{array} \right\} \\ E[\!| e_1\;e_2 |\!](\rho) &= \left\{ v_3 \middle| \begin{array}{l} \exists T v_2 v_2' v_3'.\, T {\in} E[\!| e_1 |\!](\rho) \land v_2 {\in} E[\!| e_2 |\!](\rho) \\ \land\, v'_2\mapsto v_3' \in T \land v'_2 \sqsubseteq v_2 \land v_3 \sqsubseteq v_3' \end{array} \right\} \\ E[\!| f(e_1, e_2) |\!](\rho) &= \{ f(n_1,n_2) \mid \exists n_1 n_2.\, n_1 \in E[\!| e_1 |\!](\rho) \land n_2 \in E[\!| e_2 |\!](\rho) \} \\ E[\!| \mathbf{if}\,e_1\,e_2\,e_3 |\!](\rho) &= \left\{ v \, \middle| \begin{array}{l} v \in E[\!| e_2 |\!](\rho) \quad \text{if } n \neq 0 \\ v \in E[\!| e_3 |\!](\rho) \quad \text{if } n = 0 \end{array} \right\} \end{align*}

Here are the highlights of the results for this definition.

Proposition (Admissibility of Subsumption)
If \(v \in E[\!| e |\!] \) and \(v' \sqsubseteq v\), then \(v' \in E[\!| e |\!] \).

Theorem (Reduction implies Denotational Equality)

  1. If \(e \longrightarrow e'\), then \(E[\!| e |\!] = E[\!| e' |\!]\).
  2. If \(e \longrightarrow^{*} e'\), then \(E[\!| e |\!] = E[\!| e' |\!]\).

Theorem (Whole-program Soundness and Completeness)

  1. If \(v' \in E[\!| e |\!](\emptyset)\), then \(e \longrightarrow^{*} v\) and \(v' \in E[\!| v |\!](\emptyset)\).
  2. If \(e \longrightarrow^{*} v\), then \(v' \in E[\!| e |\!](\emptyset) \) and \(v' \in E[\!| v |\!](\emptyset) \) for some \(v'\).

Proposition (Denotational Equality is a Congruence)
For any context \(C\), if \(E[\!| e |\!] = E[\!| e' |\!]\), then \(E[\!| C[e] |\!] = E[\!| C[e'] |\!]\).

Theorem (Soundness wrt. Contextual Equivalence)
If \(E[\!| e |\!] = E[\!| e' |\!]\), then \(e \simeq e'\).

Part 2. An Application to Compiler Correctness

Towards finding out how useful this denotational semantics is, I've begun looking at using it to prove compiler correctness. I'm not sure exactly which compiler I want to target yet, but as a first step, I wrote a simple source-to-source optimizer \(\mathcal{O}\) for the lambda calculus. It performs inlining and constant folding and simplifies conditionals. The optimizer is parameterized over the inlining depth to ensure termination. We perform optimization on the body of a function after inlining, so this is a polyvariant optimizer. Here's the definition. \begin{align*} \mathcal{O}[\!| x |\!](k) &= x \\ \mathcal{O}[\!| n |\!](k) &= n \\ \mathcal{O}[\!| \lambda x.\, e |\!](k) &= \lambda x.\, \mathcal{O}[\!| e |\!](k) \\ \mathcal{O}[\!| e_1\,e_2 |\!](k) &= \begin{array}{l} \begin{cases} \mathcal{O}[\!| [x{:=}e_2'] e |\!] (k{-}1) & \text{if } k \geq 1 \text{ and } e_1' = \lambda x.\, e \\ & \text{and } e_2' \text{ is a value} \\ e_1' \, e_2' & \text{otherwise} \end{cases}\\ \text{where } e_1' = \mathcal{O}[\!|e_1 |\!](k) \text{ and } e_2' = \mathcal{O}[\!|e_2 |\!](k) \end{array} \\ \mathcal{O}[\!| f(e_1,e_2) |\!](k) &= \begin{array}{l} \begin{cases} f(n_1,n_2) & \text{if } e_1' = n_1 \text{ and } e_2' = n_2 \\ f(e_1',e_2') & \text{otherwise} \end{cases}\\ \text{where } e_1' = \mathcal{O}[\!|e_1 |\!](k) \text{ and } e_2' = \mathcal{O}[\!|e_2 |\!](k) \end{array} \\ \mathcal{O}[\!| \mathbf{if}\,e_1\,e_2\,e_3 |\!](k) &= \begin{array}{l} \begin{cases} e_2' & \text{if } e_1' = n \text{ and } n \neq 0 \\ e_3' & \text{if } e_1' = n \text{ and } n = 0 \\ \mathbf{if}\,e_1'\, e_2'\,e_3'|\!](k) & \text{otherwise} \end{cases}\\ \text{where } e_1' = \mathcal{O}[\!|e_1 |\!](k) \text{ and } e_2' = \mathcal{O}[\!|e_2 |\!](k)\\ \text{ and } e_3' = \mathcal{O}[\!|e_3 |\!](k) \end{array} \end{align*}

I've proved that this optimizer is correct. The first step was proving that it preserves denotational equality.

Lemma (Optimizer Preserves Denotations)
\(E(\mathcal{O}[\!| e|\!](k)) = E[\!|e|\!] \)
Proof
The proof is by induction on the termination metric for \(\mathcal{O}\), which is the lexicographic ordering of \(k\) then the size of \(e\). All the cases are straightforward to prove because Reduction implies Denotational Equality and because Denotational Equality is a Congruence. QED

Theorem (Correctness of the Optimizer)
\(\mathcal{O}[\!| e|\!](k) \simeq e\)
Proof
The proof is a direct result of the above Lemma and Soundness wrt. Contextual Equivalence. QED

Of course, all of this is proved in Isabelle. Here is the tar ball. I was surprised that this proof of correctness for the optimizer was about the same length as the definition of the optimizer!

No comments:

Post a Comment