Back in Part 1 of this series, I mentioned that there is a design choice between eager and lazy cast checking. Recall the following example. \begin{align*} & \mathsf{let}\, f = (\lambda x:\mathsf{Int}. \,\mathsf{inc}\,x) : \mathsf{Int}\to\mathsf{Int} \Rightarrow^{\ell_0} \star \Rightarrow^{\ell_1} \mathsf{Bool}\to \mathsf{Bool}\\ & \mathsf{in} \, f\, \mathsf{true} \end{align*} With eager cast checking, the cast labeled \(\ell_1\) fails at the moment when it is applied to a value. Whereas with lazy cast checking, the \(\ell_1\) cast initially succeeds, but then later, when the function is applied at \(f\,\mathsf{true}\), the cast fails. I like eager cast checking because it tells the programmer as soon as possible that something is amiss. Further, it turns out that when using the space-efficient implementations, eager and lazy checking are about the same regarding run-time overhead. (Lazy can be faster if you don't care about space efficiency.)
We saw the specification for lazy cast checking in Part 1, but the specification for eager checking was postponed. The reason for the postponement was that specifying the semantics of eager cast checking requires more machinery than for lazy cast checking. (I'll expand on this claim in the next paragraph.) Thankfully, in the meantime we've acquired the necessary machinery: the Coercion Calculus. The Eager Coercion Calculus was first discussed in the paper Space-Efficient Gradual Typing and was extended to include blame labels in Exploring the Design Space of Higher-Order Casts. Here we'll discuss the version with blame labels and flesh out more of the theory, such as characterizing the coercion normal forms and defining an efficient method of composing coercions in normal form. This is based on an ongoing collaboration with Ronald Garcia.
Before getting into the eager coercion calculus, let me take some time to explain why the eager coercion calculus is needed for the semantics, not just for an efficient implementation. After all, there was no mention of coercions in the semantics of the lazy variants of the Gradually-Typed Lambda Calculus. Instead, those semantics just talked about casts, which consisted of a pair of types (source and target) and a blame label. The heart of those semantics was a \(\mathsf{cast}\) function that applies a cast to a value.
It's instructive to see where naive definitions of a \(\mathsf{cast}\) function for eager checking break down. The most obvious thing to try is to modify the \(\mathsf{cast}\) function to check for (deep) type consistency instead of only looking at the head of the type. So we change the first line of the \(\mathsf{cast}\) function from \[ \mathsf{cast}(v,T_1,\ell,T_2) = \mathbf{blame}\,\ell \qquad \text{if } \mathit{hd}(T_1) \not\sim \mathit{hd}(T_2) \] to \[ \mathsf{cast}(v,T_1,\ell,T_2) = \mathbf{blame}\,\ell \qquad \text{if } T_1 \not\sim T_2 \] Let's see what happens on an example just a tad different from the previous example. In this example we go through \(\star \to \star\) instead of \(\star\). \begin{align*} & \mathsf{let}\, f = (\lambda x:\mathsf{Int}. \,\mathsf{inc}\,x) : \mathsf{Int}\to\mathsf{Int} \Rightarrow^{\ell_0} \star \to \star \Rightarrow^{\ell_1} \mathsf{Bool}\to \mathsf{Bool}\\ & \mathsf{in} \, f\, \mathsf{true} \end{align*} With the naive cast function, both casts initially succeed, producing the value \[ (\lambda x:\mathsf{Int}. \,\mathsf{inc}\,x) : \mathsf{Int}\to\mathsf{Int} \Rightarrow^{\ell_0} \star \to \star \Rightarrow^{\ell_1} \mathsf{Bool}\to \mathsf{Bool} \] However, that's not what an eager cast checking should do. The above should not be a value, it should have already failed and blamed \(\ell_0\).
So the \(\mathsf{cast}\) function not only needs to check whether the target type is consistent with the source type, it also needs to check whether the target type is consistent with all of the casts that are wrapping the value. One way we could try to do this is to compute the greatest lower bound (with respect to naive subtyping) of all the types in the casts on the value, and then compare the target type to the greatest lower bound. The meet operator on types is defined as follows: \begin{align*} B \sqcap B &= B \\ \star \sqcap T &= T \\ T \sqcap \star &= T \\ (T_1 \to T_2) \sqcap (T_3 \to T_4) &= (T_1 \sqcap T_3) \to (T_2 \sqcap T_4) \\ & \text{if } (T_1 \to T_2) \sim (T_3 \to T_4)\\ T_1 \sqcap T_2 &= \bot & \text{if } T_1 \not\sim T_2 \end{align*} We introduce the bottom type \(\bot\) so that the meet operator can be a total function. Next we define a function that computes the meet of all the casts wrapping a value. \begin{align*} \sqcap (s : T_1 \Rightarrow^{\ell} T_2) &= T_1 \sqcap T_2 \\ \sqcap (v : T_1 \Rightarrow^{\ell_1} T_2 \Rightarrow^{\ell_1} T_3) & = (\sqcap (v : T_1 \Rightarrow^{\ell_1} T_2)) \sqcap T_3 \end{align*} Now we can replace the first line of the \(\mathsf{cast}\) function to use this meet operator. \begin{align*} \mathsf{cast}(v,T_1,\ell,T_2) &= \mathbf{blame}\,\ell \qquad \text{if } \left(\sqcap v\right) \not\sim T_2 \end{align*} How does this version fare on our example? An error is now triggered when the value flows into the cast labeled \(\ell_1\), so that's good, but the blame goes to \(\ell_1\). Unfortunately, the prior work on eager checking based on coercions says that \(\ell_0\) should be blamed instead! The problem with this version of \(\mathsf{cast}\) is that the \(\sqcap\) operator forgets about all the blame labels that are in the casts wrapping the value. In this example, it's dropping the label \(\ell_0\) which really ought to be blamed.
The Eager Coercion Calculus
In the context of the Coercion Calculus, one needs to add the following two reduction rules to obtain eager cast checking. What these rules do is make sure that failure coercions immediately bubble up to the top of the coercion where they can trigger a cast failure. \begin{align*} (\mathsf{Fail}^\ell \to c) &\longrightarrow \mathsf{Fail}^\ell \\ (\hat{c} \to \mathsf{Fail}^\ell) & \longrightarrow \mathsf{Fail}^\ell \end{align*} In the second rule, we require that the domain coercion be in normal form, thereby imposing a left-to-right ordering for coercion failures.
To ensure confluence, we also need to make two changes to existing reduction rules. In the rule for composing function coercions, we need to require that the two coercions be in normal form. (The notation \(\tilde{c}\) is new and will be explained shortly.) \begin{align*} (\tilde{c}_{11} \to \tilde{c}_{12}); (\tilde{c}_{21} \to \tilde{c}_{22}) & \longrightarrow (\tilde{c}_{21};\tilde{c}_{11}) \to (\tilde{c}_{12}; \tilde{c}_{22}) \end{align*} Here's the counter-example to confluence, thanks to Ron, if the above restriction is not made. \begin{align*} (\mathsf{Fail}^{\ell_1} \to c_1); (\mathsf{Fail}^{\ell_2} \to c_2) & \longrightarrow \mathsf{Fail}^{\ell_1}; (\mathsf{Fail}^{\ell_2}\to c_2) \longrightarrow \mathsf{Fail}^{\ell_1} \\ (\mathsf{Fail}^{\ell_1} \to c_1); (\mathsf{Fail}^{\ell_2} \to c_2) & \longrightarrow (\mathsf{Fail}^{\ell_2};\mathsf{Fail}^{\ell_1}) \to (c_1; c_2) \\ & \longrightarrow \mathsf{Fail}^{\ell_2} \to (c_1; c_2)\\ & \longrightarrow \mathsf{Fail}^{\ell_2} \end{align*} There is also a confluence problem regarding the following rule. \begin{align*} \overline{c} ; \mathsf{Fail}^\ell & \longrightarrow \mathsf{Fail}^\ell \end{align*} The counter-example, again thanks to Ron, is \begin{align*} (\iota \to \mathsf{Bool}!); (\iota \to \mathsf{Int}?^{\ell_2}); \mathsf{Fail}^{\ell_1} & \longrightarrow (\iota;\iota) \to (\mathsf{Bool}!; \mathsf{Int}?^{\ell_2}); \mathsf{Fail}^{\ell_1} \\ & \longrightarrow^{*} \iota \to (\mathsf{Fail}^{\ell_2}); \mathsf{Fail}^{\ell_1} \\ & \longrightarrow^{*} \mathsf{Fail}^{\ell_2} \\ (\iota \to \mathsf{Bool}!); (\iota \to \mathsf{Int}?^{\ell_2}); \mathsf{Fail}^{\ell_1} & \longrightarrow (\iota \to \mathsf{Bool}!); \mathsf{Fail}^{\ell_1} \\ & \longrightarrow \mathsf{Fail}^{\ell_1} \end{align*} We fix this problem by making the reduction rule more specific, by only allowing injections to be consumed on the left of a failure. \begin{align*} I! ; \mathsf{Fail}^\ell & \longrightarrow \mathsf{Fail}^\ell \end{align*}
Here's the complete set of reduction rules for the Eager Coercion Calculus. \begin{align*} I_1!; I_2?^\ell & \longrightarrow \mathcal{C}(I_1 \Rightarrow^\ell I_2) \\ (\tilde{c}_{11} \to \tilde{c}_{12}); (\tilde{c}_{21} \to \tilde{c}_{22}) & \longrightarrow (\tilde{c}_{21};\tilde{c}_{11}) \to (\tilde{c}_{12}; \tilde{c}_{22}) \\ \mathsf{Fail}^\ell; c & \longrightarrow \mathsf{Fail}^\ell \\ I! ; \mathsf{Fail}^\ell & \longrightarrow \mathsf{Fail}^\ell \\ (\mathsf{Fail}^\ell \to c) &\longrightarrow \mathsf{Fail}^\ell \\ (\tilde{c} \to \mathsf{Fail}^\ell) & \longrightarrow \mathsf{Fail}^\ell \end{align*}
These additions and changes to the reduction rules cause changes in
the normal forms for coercions. First, \(\mathsf{Fail}^\ell\) cannot
appear under a function coercion We therefore introduce another
category, called ``normal parts'' and written \(\tilde{c}\), that
excludes \(\mathsf{Fail}^\ell\) (but still includes \(I?^{\ell_1};
\mathsf{Fail}^{\ell_2}\) because the \(\ell_1\) projection could still
fail and take precedence over \(\ell_2\)). Also,
\( (\tilde{c}_1 \to \tilde{c}_2); \mathsf{Fail}^\ell\) is now a normal
form. Further, to regularize the form that coercions can take,
we always write them as having three parts.
The following grammar defines the normal coercions for eager
cast checking.
\[
\begin{array}{llcl}
\text{optional injections} & i & ::= & \iota \mid I! \\
& i_\bot & ::= & i \mid \mathsf{Fail}^\ell \\
\text{optional functions} & f & ::= &
\iota \mid \tilde{c} \to \tilde{c} \\
& f_\bot & ::= & f \mid \mathsf{Fail}^\ell \\
\text{optional projections} & j & ::= & \iota \mid I?^\ell \\
\text{wrapper coercions} & \overline{c} & ::= & \iota; f; i \qquad \dagger\\
\text{normal parts} & \tilde{c} & ::= & j ; f; i_\bot \qquad \ddagger \\
\text{normal coercions} & \hat{c} & ::= & \tilde{c} \mid \iota; \iota; \mathsf{Fail}^\ell
\end{array}
\]
\(\dagger\) The coercion \((\iota ;\iota; \iota)\) is not a wrapper coercion.
\(\ddagger\) The coercion \((\iota; \iota; \mathsf{Fail}^\ell)\) is not a normal part.
The Eager Gradually-Typed Lambda Calculus
Taking a step back, recall that we gave the semantics of the Lazy Gradually-Typed Lambda Calculus in terms of a denotational semantics, based on an evaluation function \(\mathcal{E}\). We can do the same for the Eager variant but using coercions to give the meaning of casts. The following is the definition of values and results for the Eager variant. \[ \begin{array}{lrcl} & F & \in & V \to_c R \\ \text{values} & v \in V & ::= & k \mid F \mid v : \overline{c} \\ \text{results}& r \in R & ::= &v \mid \mathbf{blame}\,\ell \end{array} \]
Most of the action in the \(\mathcal{E}\) function is in the \(\mathsf{cast}\) auxiliary function. We will give an alternative version of \(\mathsf{cast}\) for eager checking. To make \(\mathsf{cast}\) more succinct we make use of the following helper function regarding cast failure. \[ \mathsf{isfail}(c,\ell) \equiv (c = \mathsf{Fail}^\ell \text{ or } c = \mathsf{Fail}^\ell \circ (\tilde{c}_1 \to \tilde{c}_2) \text{ for some } \tilde{c}_1 \text{ and } \tilde{c}_2) \] Here's the updated definition of \(\mathsf{cast}\) for eager checking. \begin{align*} \mathsf{cast}(\tilde{v}, \hat{c}) &= \begin{cases} \tilde{v} & \text{if } \hat{c} = \iota \\ \mathbf{blame}\,\ell & \text{if } \mathsf{isfail}(\hat{c},\ell) \\ \tilde{v} : \hat{c} & \text{otherwise} \end{cases} \\ \mathsf{cast}(\tilde{v} : \overline{c_1}, \hat{c}_2) &= \begin{cases} \tilde{v} & \text{if } (\overline{c_1}; \hat{c}_2)= \iota \\ \mathbf{blame}\,\ell & \text{if } (\overline{c_1}; \hat{c}_2) \longrightarrow^{*} \hat{c}_3 \text{ and } \mathsf{isfail}(\hat{c}_3,\ell) \\ \tilde{v} : \overline{c_3} & \text{if } (\overline{c_1}; \hat{c}_2) \longrightarrow^{*} \overline{c}_3 \end{cases} \end{align*}
We can now give the definition of \(\mathcal{E}\), making use of the above \(\mathsf{cast}\) function as well as a function \(\mathcal{C}\) for compiling casts to coercions. (Use \(\mathcal{C}_{\mathit{D}}\) or \(\mathcal{C}_{\mathit{UD}}\) for \(\mathcal{C}\) to obtain the D or UD blame tracking strategy.) \begin{align*} \mathcal{E}(k,\rho) &= \mathbf{return}\, k \\ \mathcal{E}(x,\rho) &= \mathbf{return}\, \rho(x) \\ \mathcal{E}(\lambda x{:}T.\,e, \rho) &= \mathbf{return}\, (\lambda v.\, \mathcal{E}(e, \rho[x\mapsto v])) \\ \mathcal{E}(\mathit{op}(e)) &= \mathbf{letB}\, X = \mathcal{E}(e,\rho) \,\mathbf{in}\, \delta(\mathit{op},X) \\ \mathcal{E}(e : T_1 \Rightarrow^\ell T_2) &= \mathbf{letB}\, X = \mathcal{E}(e,\rho) \,\mathbf{in}\, \mathsf{cast}(X, \mathcal{C}(T_1 \Rightarrow^\ell T_2)) \\ \mathcal{E}(e_1\,e_2) &= \mathbf{letB}\,X_1 = \mathcal{E}(e_1,\rho)\,\mathbf{in}\\ & \quad\; \mathbf{letB}\,X_2 = \mathcal{E}(e_2,\rho)\,\mathbf{in}\\ & \quad\; \mathsf{apply}(X_1,X_2) \end{align*}
The semantics for the Eager Gradually-Typed Lambda Calculus is defined by the following \(\mathit{eval}\) partial function. \[ \mathit{eval}(e) = \begin{cases} \mathit{observe(r)} & \text{if }\emptyset \vdash e \leadsto e' : T \text{ and } \mathcal{E}(e',\emptyset) = r \\ \bot & \text{otherwise} \end{cases} \] where \begin{align*} \mathit{observe}(k) &= k \\ \mathit{observe}(F) &= \mathit{function} \\ \mathit{observe}(v : \iota \circ (\hat{c}_1 \to \hat{c}_2) \circ \iota) &= \mathit{function} \\ \mathit{observe}(v : I! \circ \iota \circ \iota) &= \mathit{dynamic} \\ \mathit{observe}(\mathbf{blame}\,\ell) &= \mathbf{blame}\,\ell \end{align*}
An Eager Space-Efficient Machine
To obtain a space efficient machine for the Eager variant, we just plug the eager version of \(\mathsf{cast}\) into the lazy space-efficient machine.
An Eager Time-Efficient Machine
Recall that the lazy time-efficient machine used threesomes instead of coercions because we could define an efficient function for composing threesomes, whereas reducing coercions is a complex process. The natural thing to do here is to try and come up with an eager variant of threesomes and the composition function. The lazy threesomes were isomorphic to lazy coercions in normal form, and we already have the normal forms for eager coercions, so it should be straightforward to come up with eager threesomes. It is straightforward, but in this case nothing is gained; we just end up with a slightly different notation. The reason is that the normal forms for eager coercions are more complex. So we might as well stick with using the eager coercions.
However, the essential lesson from the threesomes is that we don't need to implement reduction on coercions, instead we just need to define a composition function that takes coercions in normal form. After thinking about this for a long time, trying lots of variants, we've come up with the definition shown below. (Here we use \(\rhd\) for composition. I'd prefer to use the fatsemi latex symbol, but it seems that is not available in MathJax.)
Composition of Normal Coercions: \( \hat{c} \rhd \hat{c}\)
\begin{align*}
(j; f; i_\bot) \rhd (j'; f'; i'_\bot) &=
\mathbf{case}\;i_\bot \rhd j'\;\mathbf{of}\\
& \qquad I! \Rightarrow j; f; (I! \rhd i'_\bot) \\
& \quad \mid I?^\ell \Rightarrow I?^\ell; f'; i'_\bot \\
& \quad \mid \mathsf{Fail}^\ell \Rightarrow j; f; \mathsf{Fail}^\ell \\
& \quad \mid c \Rightarrow
\mathbf{case}\;(f \rhd c) \rhd f' \;\mathbf{of}\\
& \qquad\qquad\quad \mathsf{Fail}^\ell \Rightarrow j; \iota; \mathsf{Fail}^\ell\\
& \qquad\quad\quad \mid c' \Rightarrow j; c'; i'_\bot
\end{align*}
\begin{align*}
\iota \rhd c &= c \\
c \rhd \iota &= c \\
I_1! \rhd I_2?^\ell &= \mathcal{C}(I_1 \Rightarrow^\ell I_2) \\
(\tilde{c}_1 \to \tilde{c}_2) \rhd (\tilde{c}_3 \to \tilde{c}_4) &=
(\tilde{c_3}\rhd \tilde{c_1}) \overset{\bullet}{\to} (\tilde{c}_2 \rhd \tilde{c}_4) \\
\mathsf{Fail}^\ell \rhd c &= \mathsf{Fail}^\ell \\
I! \rhd \mathsf{Fail}^\ell &= \mathsf{Fail}^\ell \\
\\
\tilde{c}_1 \overset{\bullet}{\to} \tilde{c}_2 &= \tilde{c}_1 \to \tilde{c}_2 \\
\mathsf{Fail}^\ell \overset{\bullet}{\to}\hat{c}_2 &= \mathsf{Fail}^\ell \\
\tilde{c}_1 \overset{\bullet}{\to}\mathsf{Fail}^\ell &= \mathsf{Fail}^\ell
\end{align*}
To obtain an eager, time-efficient machine, we just replace coercion reduction with coercion composition. \begin{align*} \mathsf{cast}(\tilde{v}, \hat{c}) &= \begin{cases} \tilde{v} & \text{if } \hat{c} = \iota \\ \mathbf{blame}\,\ell & \text{if } \mathsf{isfail}(\hat{c},\ell) \\ \tilde{v} : \hat{c} & \text{otherwise} \end{cases} \\ \mathsf{cast}(\tilde{v} : \overline{c_1}, \hat{c}_2) &= \begin{cases} \tilde{v} & \text{if } (\overline{c_1}; \hat{c}_2)= \iota \\ \mathbf{blame}\,\ell & \text{if } (\overline{c_1} \rhd \hat{c}_2) = \hat{c}_3 \text{ and } \mathsf{isfail}(\hat{c}_3,\ell) \\ \tilde{v} : \overline{c_3} & \text{if } (\overline{c_1} \rhd \hat{c}_2) = \overline{c}_3 \end{cases} \end{align*}