Jeremy Siek: My new favorite abstract machine: ECD on ANF

A few years ago I blogged about my favorite abstraction machine, the ECD machine. It is a variant of Landin's classic SECD machine that uses evaluation contexts to handle control flow within an expression. Having received my Ph.D. from Indiana, evaluation contexts are very close to my heart, however, I'm having second thoughts.

The question that started these doubts was the following:

Should control flow within an expression be part of the static or dynamic semantics of a language?

The knee-jerk answer is: of course, control flow belongs in the dynamic semantics! But then I put on my compiler-writer hat and retort: but the control-flow within an expression can be compiled away. The standard conversion to A-normal form (ANF), which flattens expressions into statements, determines intra-expression control flow statically. So taking this answer to heart, I tried out several Isabelle formalizations using this approach. Low and behold, it worked quite well, removing the need for the machinery and tedious lemmas regarding evaluation contexts.

I'll show the main points of this idea in the setting of the Simply-Typed Lambda Calculus (STLC).

Conversion to A-Normal Form

First, we'll define the ANF variant of STLC, then give the conversion to ANF. The main changes are that function application becomes a statement and there is a statement for returning from a function. Also, note that the body of a lambda is now a statement instead of an expression. $\begin{array}{lrcl} \text{expressions} & e & ::= & c \mid x \mid \lambda x{:}T. s \\ \text{definitions} & d & ::= & x = e(e) \mid x = \mathit{op}(\overline{e}) \\ \text{statements} & s & ::= & d; s \mid \mathbf{return}\, e \end{array}$

Now for the conversion to ANF. $\mathit{ToANF(e)} = \mathrm{let}\,(e',\overline{d}) = \mathit{ANF}(e) \,\mathrm{in}\, (\overline{d}; \mathbf{return}\, e')$ $\begin{align*} \mathit{ANF}(c) &= (c,\epsilon) \\ \mathit{ANF}(\mathit{op}(\overline{e})) &= \mathrm{let}\, (\overline{e}',\overline{d}) = \overline{\mathit{ANF}}(\overline{e}),\\ &\qquad\,\, x = \mathit{fresh{-}variable}() \\ &\quad \mathrm{in}\, (x, (\overline{d}; x = \mathit{op}(\overline{e}'))) \\ \mathit{ANF}(x) &= (x,\epsilon) \\ \mathit{ANF}(\lambda x{:}T.e) &= \mathrm{let}\, (e',\overline{d}) = \mathit{ANF}(e)\\ &\quad \mathrm{in}\, (\lambda x{:}T. (\overline{d}; \mathbf{return}\, e'), \epsilon)\\ \mathit{ANF}(e_1\,e_2) &= \mathrm{let}\, (e'_1,\overline{d}_1) = \mathit{ANF}(e_1), \\ & \qquad\,\, (e'_2,\overline{d}_2) = \mathit{ANF}(e_2) \\ & \qquad\,\, x = \mathit{fresh{-}variable}() \\ & \quad \mathrm{in}\, (x, (\overline{d}_1;\overline{d}_2; x = e'_1(e'_2))) \end{align*}$ This conversion to ANF specifies that, within an expression, evaluation proceeds from left to right.

The ECD on ANF

The machine definition consists of two parts, a function for evaluating expressions to values and the transition rules for the machine itself, which handle the statements. We start with evaluating expressions. The evaluation function V takes two arguments, an expression and an environment mapping variables to values. $\begin{align*} V(c,\rho) &= c\\ V(x, \rho) &= \rho(x) \\ V(\lambda x{:}T.s, \rho) &= \langle \lambda x{:}T.s, \rho \rangle \end{align*}$

The machine state has the form $(s,\rho,\kappa)$ , where the stack $\kappa$ is a list of frames, which have the form $(x,s,\rho)$ . Now for the transition rules. There are three, one for each kind of statement. $\begin{align*} (x = e_1(e_2); s, \rho, \kappa) & \longmapsto (s', \rho'[y{:=}v_2], (x,s,\rho)::\kappa) \\ \text{where} & V(e_1,\rho) = \langle \lambda y{:}T. s', \rho' \rangle, V(e_2,\rho) = v_2 \\ (x = \mathit{op}(\overline{e}); s, \rho, \kappa) & \longmapsto (s, \rho[x{:=}v], \kappa) \\ \text{where}& v = \delta(\mathit{op},\overline{V}(\overline{e},\rho)) \\ (\mathbf{return}\,e, \rho, (x,s,\rho')::\kappa) & \longmapsto (s, [x{:=}v]\rho', \kappa) \\ \text{where}& V(e,\rho) = v \end{align*}$

To finish up, we define the semantics of the Simply-Typed Lambda Calculus using ToANF and the ECD machine. $\mathit{eval}(e) = \begin{cases} c & (\mathit{ToANF}(e), \emptyset, \epsilon) \longmapsto^{*} (\mathbf{return}\,e', \rho, \epsilon) \text{ and } V(e',\rho) = c \\ \bot & \text{otherwise} \end{cases}$

Discussion

While the ECD on ANF is my new favorite, I don't claim it's the best for all purposes. I suspect the original ECD machine, with evaluation contexts, may be better for explaining the semantics of the STLC to a programmer. Though it's unclear which is more difficult to explain, ANF or evaluation contexts. Also, programmers coming from C or Java may be more use to a language with statements.

I do claim the ECD on ANF is great for the purposes of doing PL meta-theory. For example, proving type safety is extremely straightforward.

Proof of type safety in Isabelle

Here's the proof of type safety for the STLC in A-normal form. The whole development is only 369 lines!

ECDOnANF.thy

Jeremy Siek

Thursday, July 12, 2012

My new favorite abstract machine: ECD on ANF

Conversion to A-Normal Form

The ECD on ANF

Discussion

Proof of type safety in Isabelle

No comments:

Post a Comment