The intuition for an efficient coercion composition function came from thinking about types, not coercions. We'll start with the UD blame tracking strategy and then later consider D. Also, for now we'll stick with lazy cast checking.
Consider the following sequence of casts:
\[
e : T_1 \Rightarrow^{\ell_1} T_2 \Rightarrow^{\ell_2} \cdots \Rightarrow^{\ell_{n-1}} T_n
\]
We'd like some way to summarize the sequence of types without loosing
any important information. That is, we'd like to come up with
something that can catch the same cast errors as the entire sequence,
blaming the appropriate label, but using less space. Imagine the \(n\)
types as a line of differently colored trees on the side of a road.
If you're next to the road, staring down the line of trees, you see
what looks like one tree with branches of many colors. Some of the
branches from further-away trees are hidden from view by closer trees,
but some are visible. Now, suppose we wanted to maintain the same view
from your standpoint, but save on water. We could replace the line of
trees with a single multi-colored tree that includes all the branches
visible to you. The figure below depicts
three differently-colored trees getting merged into a single
multi-color tree. The nodes without color should be considered
transparent.
Mapping this idea back to types, the colors are blame labels and transparent nodes are the type \(\star\). Because we need to color individual branches, we need blame labels on every internal node of a type. In particular, we need the notion of a labeled type: \[ \begin{array}{lrcl} \text{optional labels} & p,q & ::= & \epsilon \mid \ell \\ \text{labeled types} & P,Q & ::= & B^p \mid P \to^p Q \mid \star \mid (I^p; \bot^\ell) \end{array} \] (These labeled types are for the UD strategy. We'll discuss the labeled types for D later. Also, the labeled type \(I^p; \bot^\ell\) deserves some explanation, which we'll get to soon.) The \(\mathit{label}\) function returns the top-most label of a labeled type: \begin{align*} \mathit{label}(B^p) &= p \\ \mathit{label}(P \to^p Q) &= p \\ \mathit{label}(\star) &= \epsilon \\ \mathit{label}(I^p; \bot^\ell) &= p \end{align*}
We'll define a function for composing two labeled types \(P\) and \(Q\) to produce a new labeled type \(P'\), using semicolon as the syntax for this composition function: \[ P ; Q = P' \] We replace each cast with a threesome, that is, a cast annotated with a labeled type. The labeled type is computed by a simple function \(\mathcal{L}\) that we define below. \begin{align*} & e : T_1 \Rightarrow^\ell T_2 \text{ becomes } e : T_1 \overset{P}{\Longrightarrow} T_2 \\ & \text{ where } P = \mathcal{L}(T_1 \Rightarrow^\ell T_2) \end{align*} \begin{align*} \mathcal{L}(B \Rightarrow^\ell B) &= B \\ \mathcal{L}(\star \Rightarrow^\ell \star) &= \star \\ \mathcal{L}(B \Rightarrow^\ell \star) &= B \\ \mathcal{L}(\star \Rightarrow^\ell B) &= B^\ell \\ \mathcal{L}(T_1 \Rightarrow^\ell T_2) &= I ; \bot^\ell \qquad \text{where } I \sim T_1 \\ \mathcal{L}(T_1 \to T_2 \Rightarrow^\ell T_3 \to T_4) &= \mathcal{L}(T_3 \Rightarrow^\ell T_1) \to \mathcal{L}(T_2 \Rightarrow^\ell T_4) \\ \mathcal{L}(T_1 \to T_2 \Rightarrow^\ell \star) &= \mathcal{L}(\star \Rightarrow^\ell T_1) \to \mathcal{L}(T_2 \Rightarrow^\ell \star) \\ \mathcal{L}(\star \Rightarrow^\ell T_3 \to T_4) &= \mathcal{L}(T_3 \Rightarrow^\ell \star) \to^\ell \mathcal{L}(\star \Rightarrow^\ell T_4) \end{align*} A sequence of threesomes is compressed to a single threesome using the composition function: \begin{gather*} e : T_1 \overset{P_1}{\Longrightarrow} T_2 \overset{P_2}{\Longrightarrow} \cdots \overset{P_{n-1}}{\Longrightarrow} T_n \\ \text{becomes} \\ e : T_1 \overset{P}{\Longrightarrow} T_n \\ \text{where } P = P_1; P_2; \cdots; P_{n-1} \end{gather*}
Before we go into the details of the composition function, it helps to see how (well-formed) threesomes correspond to coercions in normal form. With this correspondence in place, we can use the coercion reduction rules to help guide the definition of threesome composition. The function \(\mathit{TC}\) defined below maps threesomes to coercions in normal form. This function is an isomorphism, so it's inverse maps normal coercions back to threesomes. \begin{align*} \mathit{TC}(B \overset{B}{\Longrightarrow} B) &= \iota_B \\ \mathit{TC}(\star \overset{\star}{\Longrightarrow}\star) &=\iota_\star \\ \mathit{TC}(\star \overset{B^\ell}{\Longrightarrow} B) &= B?^\ell \\ \mathit{TC}(B\overset{B}{\Longrightarrow} \star) &= B! \\ \mathit{TC}(\star \overset{B^\ell}{\Longrightarrow} \star) &= B?^\ell; B! \\ \mathit{TC}(T_1 \overset{I; \bot^\ell}{\Longrightarrow} T_2) &= \mathsf{Fail}^\ell \\ \mathit{TC}(T_1 \overset{I^{\ell_1}; \bot^{\ell_2}}{\Longrightarrow} T_2) &= I?^{\ell_1} ; \mathsf{Fail}^{\ell_2} \\ \mathit{TC} (T_1 \to T_2 \overset{P_1 \to P_2}{\Longrightarrow} T_3 \to T_4)&= \mathit{TC}(T_3 \overset{P_1}{\Longrightarrow} T_1) \to \mathit{TC}(T_2 \overset{P_2}{\Longrightarrow} T_4) \\ \mathit{TC} (\star \overset{P_1 \to^\ell P_2}{\Longrightarrow} T_3 \to T_4)&= (\star \to \star)?^\ell ; \mathit{TC}(T_3 \overset{P_1}{\Longrightarrow} \star) \to \mathit{TC}(\star \overset{P_2}{\Longrightarrow} T_4) \\ \mathit{TC} (T_1 \to T_2 \overset{P_1 \to P_2}{\Longrightarrow} \star)&= \mathit{TC}(\star \overset{P_1}{\Longrightarrow} T_1) \to \mathit{TC}(T_2 \overset{P_2}{\Longrightarrow} \star); (\star \to \star)! \\ \mathit{TC} (\star \overset{P_1 \to^\ell P_2}{\Longrightarrow} \star)&= (\star \to \star)?^\ell ; \mathit{TC}(\star \overset{P_1}{\Longrightarrow} \star) \to \mathit{TC}(\star \overset{P_2}{\Longrightarrow} \star); (\star \to \star)! \end{align*}
We're ready to make precise how two labeled types can be composed to form a single labeled type. The two cases in which one of the labeled types is \(\star\) are easy: just return the other type: \begin{align*} \star; Q &= Q \\ P; \star &= P \end{align*} Next, suppose we have \(\mathsf{Int}^{\ell_1}\) followed by \(\mathsf{Int}^{\ell_2}\). These should compose to \(\mathsf{Int}^{\ell_1}\) because if the first cast succeeds, so will the second, making the blame label \(\ell_2\) redundant. In general, for labeled basic types we have the following rule. \begin{equation} \label{eq:1} B^p; B^q = B^p \end{equation} Suppose instead that the basic types don't match. The right-hand side of the following rule is a bit tricky, so let's think about this in terms of coercions. \begin{equation} \label{eq:2} B_1^p ; B_2^q = B_1^p ; \bot^q \qquad \text{if } B_1 \neq B_2 \end{equation} Suppose \(p=\ell_1, q = \ell_2\) and these two labeled types come from the threesomes \[ \star \overset{B_1^{\ell_1}}{\Longrightarrow} \star \overset{B_2^{\ell_2}}{\Longrightarrow} B_2 \] The corresponding coercion sequence is \[ B_1?^{\ell_1} ; B_1! ; B_2?^{\ell_2} \] which reduces to \[ B_1?^{\ell_1} ; \mathsf{Fail}^{\ell_2} \] and that corresponds to the labeled type for errors, \(B_1^{\ell_1}; \bot^{\ell_2}\). We also need to consider a mismatch between basic types and function types: \begin{align} B^p; (P \to^q Q) &= B^q; \bot^q \\ (P \to^p Q); B^q &= (\star \to \star)^p; \bot^q \end{align} The rule for labeled function types takes the label \(p\) for the label in the result and it recursively composes the domain and codomain types. The contra-variance in the parameter type is important for getting the right blame and coincides with the contra-variance in the reduction rule for composing function coercions. \begin{equation} \label{eq:4} (P_1 \to^p P_2) ; (Q_1 \to^q Q_2) = (Q_1; P_1) \to^p (P_2; Q_2) \end{equation} The following figure shows an example similar to the previous figure, but with function types instead of pair types. The analogy with real trees and line-of-sight breaks down because you have to flip to looking at the trees from back-to-front instead of front-to-back for negative positions within the type.
Lastly we need several rules to handle when the error type is on the left or right. \begin{align} (I^p; \bot^\ell); Q &= (I^p; \bot^\ell) \\ P; (I^q; \bot^\ell) &= I^p ; \bot^\ell \qquad \text{if } I \sim P \text{ and } \mathit{label}(P) = p \\ P; (I^q; \bot^\ell) &= I^p ; \bot^q \qquad \text{if } I \not\sim P \text{ and } \mathit{label}(P) = p \end{align}
What's extra cool about labeled types and their composition function is that each rule covers many different rules if you were to formulate them in terms of coercions. For example, the single rule \( B^p; B^q = B^p\) covers four situations when viewed as threesomes or coercions: \begin{align*} (B \overset{B}{\Longrightarrow} B; B \overset{B}{\Longrightarrow} B) &= B \overset{B}{\Longrightarrow} B \\ \iota_B; \iota_B &\longrightarrow \iota_B \\ (\star \overset{B^\ell}{\Longrightarrow} B; B \overset{B}{\Longrightarrow} B) &= \star \overset{B^\ell}{\Longrightarrow} B \\ B?^\ell ; \iota_B &\longrightarrow B?^\ell \\ (B \overset{B}{\Longrightarrow} B; B \overset{B}{\Longrightarrow} \star) &= B \overset{B}{\Longrightarrow} \star \\ \iota_B; B! &\longrightarrow B! \\ (\star \overset{B^\ell}{\Longrightarrow} B; B \overset{B}{\Longrightarrow} \star) &= \star \overset{B^\ell}{\Longrightarrow} \star \\ B?^\ell ; B! & \text{ is already in normal form} \end{align*}
We define a threesomes as a source type, middle labeled typed, and a target type. \[ \begin{array}{llcl} \text{threesomes} & \tau & ::= & T \overset{P}{\Longrightarrow} T \\ \end{array} \] We define the sequencing of threesomes as follows \[ (T_1 \overset{P}{\Longrightarrow} T_2); (T_2 \overset{Q}{\Longrightarrow}T_3) = T_1 \overset{P;Q}{\Longrightarrow} T_3 \] Similarly, we define the notation \(\tau_1 \to \tau_2\) as \[ (T_3 \overset{P}{\Longrightarrow} T_1) \to (T_2 \overset{Q}{\Longrightarrow} T_4) = T_1\to T_2 \overset{P\to Q}{\Longrightarrow} T_3 \to T_4 \]
We can now go back to the ECD machine and replace the coercions with threesomes. Here's the syntax in A-normal form. \[ \begin{array}{llcl} \text{expressions} & e & ::= & k \mid x \mid \lambda x{:}T.\, s \\ \text{definitions} & d & ::= & x=\mathit{op}(e) \mid x : T = e(e) \mid x = e : \tau \\ \text{statements} & s & ::= & d; s \mid \mathbf{return}\,e \mid \mathbf{return}\,e(e) : \tau \\ \text{simple values} & \tilde{v} & ::= & k \mid \langle \lambda x{:}T.\, s, \rho \rangle \\ \text{values}& v & ::= & \tilde{v} \mid \tilde{v} : \tau \end{array} \] The cast function, of course, needs to change. \begin{align*} \mathsf{cast}(k, \tau) &= \begin{cases} k & \text{if } \tau = (B \overset{B}{\Longrightarrow} B) \\ \mathbf{blame}\,\ell & \text{if } \tau = B \overset{B^p;\bot^\ell}{\Longrightarrow} T\\ k : \tau & \text{otherwise} \end{cases} \\ \mathsf{cast}(\langle \lambda x{:}T.s,\rho \rangle, \tau) &= \langle \lambda x{:}T.s,\rho \rangle : \tau \\ \mathsf{cast}(k : \tau_1, \tau_2) &= \begin{cases} k & \text{if } (\tau_1;\tau_2) = B \overset{B}{\Longrightarrow} B \\ \mathbf{blame}\,\ell & \text{if } (\tau_1;\tau_2) = B \overset{B^p;\bot^\ell}{\Longrightarrow} T\\ k : (\tau_1;\tau_2) & \text{otherwise} \end{cases} \\ \mathsf{cast}(\langle \lambda x{:}T.s,\rho \rangle : \tau_1, \tau_2) &= \begin{cases} \langle \lambda x{:}T.s,\rho \rangle : (\tau_1;\tau_2)& \text{if } \mathit{middle}(\tau_1;\tau_2) \neq (I^p;\bot^\ell) \\ \mathbf{blame}\,\ell & \text{if } \mathit{middle}(\tau_1;\tau_2) = (I^p;\bot^\ell) \end{cases} \end{align*} And last but not least, here's the transitions for the ECD machine, but with threesomes instead of coercions. \begin{align*} (x : T_1 = e_1(e_2); s, \rho, \kappa) & \longmapsto (s', \rho'[y{:=}v_2], (T_1 \overset{T_1}{\Longrightarrow}T_1,(x,s,\rho)::\kappa)) \\ \text{where } & V(e_1,\rho) = \langle \lambda y{:}T_2. s', \rho' \rangle, V(e_2,\rho) = v_2 \\ (x : T_1 = e_1(e_2); s, \rho, \kappa) & \longmapsto (s', \rho'[y{:=}v'_2], (\tau_2,(x,s,\rho)::\kappa)) \\ \text{where } & V(e_1,\rho) = \langle \lambda y{:}T_2. s', \rho' \rangle : \tau_1 \to \tau_2, \\ & V(e_2,\rho) = v_2, \text{ and } \mathsf{cast}(v_2, \tau_1) = v'_2\\ (x = \mathit{op}(\overline{e}); s, \rho, \kappa) & \longmapsto (s, \rho[x{:=}v], \kappa) \\ \text{where }& v = \delta(\mathit{op},V(e,\rho)) \\ (x = e : \tau; s, \rho, \kappa) & \longmapsto (s, \rho[x{:=}v'], \kappa) \\ \text{where } & V(e,\rho) = v, \mathsf{cast}(v,\tau) = v' \\ (\mathbf{return}\,e, \rho, (\tau,(x,s,\rho')::\kappa)) & \longmapsto (s, [x{:=}v']\rho', \kappa) \\ \text{where }& V(e,\rho) = v, \mathsf{cast}(v,\tau) = v' \\ (\mathbf{return}\,e_1(e_2) : \tau_1, \rho, (\tau_2,\sigma)) & \longmapsto (s, \rho'[y{:=}v_2], ((\tau_1; \tau_2),\sigma)) \\ \text{where } & V(e_1,\rho) = \langle \lambda y{:}T. s, \rho' \rangle, V(e_2,\rho) = v_2 \\ (\mathbf{return}\,e_1(e_2) : \tau_1, \rho, (\tau_2,\sigma)) & \longmapsto (s, \rho'[y{:=}v'_2], (\tau_5,\sigma)) \\ \text{where } & V(e_1,\rho) = \langle \lambda y{:}T_1. s, \rho' \rangle : \tau_3 \to \tau_4,\\ & V(e_2,\rho) = v_2, \mathsf{cast}(v_2, \tau_3) = v'_2, \text{ and} \\ & (\tau_4; \tau_1; \tau_2) = \tau_5 \\[2ex] (x = e : \tau; s, \rho, \kappa) & \longmapsto \mathbf{blame}\,\ell\\ \text{where } & V(e,\rho) = v, \mathsf{cast}(v,\tau) = \mathbf{blame}\,\ell \\ (\mathbf{return}\,e, \rho, (\tau,(x,s,\rho')::\kappa)) & \longmapsto \mathbf{blame}\,\ell \\ \text{where }& V(e,\rho) = v, \mathsf{cast}(v,\tau) = \mathbf{blame}\,\ell \end{align*}
We now have an implementation of Lazy UD that is space efficient and relatively efficient in time as well. However, there is one nagging issue regarding the speed of statically-typed code. Notice how there are two transition rules for each kind of function call. The source of the problem is that there are two kinds of values that have function type, closures and closures wrapped in a threesome. In the next post I'll define a unified representation for closures and wrapped closures so that we don't need to dispatch at runtime.
No comments:
Post a Comment