Happy holidays! There’s nothing quite like curling up in a comfy chair on a rainy day and proving a theorem in your favorite proof assistant.
Lately I’ve been interested in graph models of the \(\lambda\)-calculus, that is, models that represent a \(\lambda\) with relations from inputs to outputs. The use of relations instead of functions is not a problem when reasoning about expressions that produce numbers, but it does introduce problems when reasoning about expressions that produce higher-order functions. Some of these expressions are contextually equivalent but not denotationally equivalent. For example, consider the following two expressions. \[{\lambda f.\,} (f {\;}0) + (f {\;}0) =_{\mathrm{ctx}} {\lambda f.\,} ({\lambda x.\,} x + x) {\;}(f {\;}0) \qquad\qquad (1)\] The expression on the left-hand side has two copies of a common subexpression \((f {\;}0)\). The expression on the right-hand side is optimized to have just a single copy of \((f {\;}0)\). The left and right-hand expressions in equation (1) are contextually equivalent because the \(\lambda\)-calculus is a pure language (no side effects), so whether we call \(f\) once or twice does not matter, and it always returns the same result given the same input. Unfortunately, the two expressions in equation (1) are not denotationally equivalent. \[{\mathcal{E}[\![ {\lambda f.\,} (f {\;}0) + (f {\;}0) ]\!]}\emptyset \neq {\mathcal{E}[\![ {\lambda f.\,} ({\lambda x.\,} x + x) {\;}(f {\;}0) ]\!]}\emptyset \qquad\qquad (2)\] Recall that my semantics \(\mathcal{E}\) maps an expression and environment to a set of values. The “set” is not because an expression produces multiple conceptually-different values. Sets are needed because we represent a (infinite) function as an infinite set of finite relations. So to prove the above inequality (2) we simply need to find a value that is in the set on the left-hand side that is not in the set on the right-hand side. The idea is that we consider the behavior when parameter \(f\) is bound to a relation that is not a function. In particular, the relation \[R = \{ (0,1), (0,2) \}\] Now when we consider the application \((f {\;}0)\), the semantics of function application given by \(\mathcal{E}\) can choose the result to be either \(1\) or \(2\). Furthermore, for the left-hand side of equation (2), it could choose \(1\) for the first \((f {\;}0)\) and \(2\) for the second \((f {\;}0)\) . Thus, the result of the function can be \(3\). \[\{ (R,3) \} \in {\mathcal{E}[\![ {\lambda f.\,} (f {\;}0) + (f {\;}0) ]\!]}\emptyset\] Of course, this function could never actually produce \(3\) because \(R\) does not correspond to any \(\lambda\)’s. In other words, garbage-in garbage-out. Turning to the right-hand side of equation (2), there is only one \((f{\;}0)\), which can either produce \(1\) or \(2\), so the result of the outer function can be \(2\) or \(4\), but not \(3\).
\[\begin{aligned} \{ (R,2) \} &\in {\mathcal{E}[\![ {\lambda f.\,} ({\lambda x.\,} x + x) {\;}(f {\;}0) ]\!]}\emptyset\\ \{ (R,3) \} &\notin {\mathcal{E}[\![ {\lambda f.\,} ({\lambda x.\,} x + x) {\;}(f {\;}0) ]\!]}\emptyset\\ \{ (R,4) \} &\in {\mathcal{E}[\![ {\lambda f.\,} ({\lambda x.\,} x + x) {\;}(f {\;}0) ]\!]}\emptyset\end{aligned}\]
So we need to put the function back in \(\lambda\)! That is, we need to restrict the notion of values so that all the relations are also functions. Recall the definition: a function \(f\) is a relation on two sets \(A\) and \(B\) such that for all \(a \in A\) there exists a unique \(b \in B\) such that \((a,b) \in f\). In other words, if \((a,b) \in f\) and \((a,b') \in f\), then necessarily \(b = b'\). Can we simply add this restriction to our notion of value? Not quite. If we literally applied this definition, we could still get graphs such as the following one, which maps two different approximations of the add-one function to different outputs. This graph does not correspond to any \(\lambda\). \[\{ (\{(0,1)\}, 2), (\{(0,1),(5,6) \}, 3) \}\]
So we need to generalize the notion of function to allow for differing approximations. We shall do this by generalizing from equality to consistency, written \(\sim\). Two integers are consistent when they are equal. Two graphs as consistent when they map consistent inputs to consistent outputs. We are also forced to explicitly define inconsistency, which we explain below.
\[\begin{gathered} \frac{}{n \sim n} \qquad \frac{\begin{array}{l}\forall v_1 v'_1 v_2 v'_2, (v_1,v'_1) \in t_1 \land (v_2,v'_2) \in t_2 \\ \implies (v_1 \sim v_2 \land v'_1 \sim v'_2) \lor v_1 \not\sim v_2 \end{array}} {t_1 \sim t_2} \\[2ex] \frac{n_1 \neq n_2}{n_1 \not\sim n_2} \qquad \frac{(v_1,v'_1) \in t_1 \quad (v_2,v'_2) \in t_2 \quad v_1 \sim v_2 \quad v'_1 \not\sim v'_2} {t_1 \not\sim t_2} \\[2ex] \frac{}{n \not\sim t} \qquad \frac{}{t \not\sim n}\end{gathered}\]
The definition of consistency is made a bit more complicated than I expected because the rules of an inductive definition must be monotonic, so we can’t negate a recursive application or put it on the left of an implication. In the above definition of consistency for graphs \(t_1 \sim t_2\), it would have been more natural to say \(v_1
\sim v_2 \implies v'_1 \sim v'_2\) in the premise, but then \(v_1 \sim
v_2\) is on the left of an implication. The above inductive definition works around this problem by mutually defining consistency and inconsistency. We then prove that inconsistency is the negation of consistency.
Proposition 1 (Inconsistency) \(v_1 \not\sim v_2 = \neg (v_1 \sim v_2)\)
Proof. We first establish by mutual induction that \(v_1 \sim v_2 \implies \neg (v_1 \not\sim v_2)\) and \(v_1 \not\sim v_2
\implies \neg (v_1 \sim v_2)\). We then show that \((v_1 \sim v_2) \lor
(v_1 \not\sim v_2)\) by induction on \(v_1\) and case analysis on \(v_2\). Therefore \(\neg (v_1 \not\sim v_2) \implies v_1 \sim v_2\), so we have proved both directions of the desired equality. \(\Box\)
Armed with this definition of consistency, we can define a generalized notion of function, let’s call it \(\mathsf{is\_fun}\). \[\mathsf{is\_fun}\;t \equiv \forall v_1 v_2 v'_1 v'_2, (v_1,v'_1) \in t \land (v_2,v'_2) \in t \land v_1 \sim v_2 \implies v'_1 \sim v'_2\] Next we restrict the notion of value to require the graphs to satisfy \(\mathsf{is\_fun}\). Recall that we use to define values by the following grammar. \[\begin{array}{lrcl} \text{numbers} & n & \in & \mathbb{Z} \\ \text{graphs} & t & ::= & \{ (v_1,v'_1), \ldots, (v_n,v'_n) \}\\ \text{values} & v & ::= & n \mid t \end{array}\] We keep this definition but add an induction definition of a more refined notion of value, namely \(\mathsf{is\_val}\). Numbers are values and graphs are values so long as they satisfy \(\mathsf{is\_fun}\) and only map values to values.
\[\begin{gathered} \frac{}{\mathsf{is\_val}\,n} \qquad \frac{\mathsf{is\_fun}\;t \quad \forall v v', (v,v') \in t \implies \mathsf{is\_val}\,v \land \mathsf{is\_val}\,v'} {\mathsf{is\_val}\,t}\end{gathered}\]
We are now ready to update our semantic function \(\mathcal{E}\). The one change that we make is to require that each graph \(t\) satisfies \(\mathsf{is\_val}\) in the meaning of a \(\lambda\). \[{\mathcal{E}[\![ {\lambda x.\,} e ]\!]}\rho = \{ t \mid \mathsf{is\_val}\;t \land \forall (v,v')\in t, v' \in {\mathcal{E}[\![ e ]\!]}\rho(x{:=}v) \}\] Hopefully this change to the semantics enables a proof that \(\mathcal{E}\) is deterministic. Indeed, we shall show that if \(v \in {\mathcal{E}[\![ e ]\!]}\rho\) and \(v' \in {\mathcal{E}[\![ e ]\!]}\rho'\) for any suitably related \(\rho\) and \(\rho'\), then \(v \sim v'\).
To relate \(\rho\) and \(\rho'\), we extend the definitions of consistency and \(\mathsf{is\_val}\) to environments.
\[\begin{gathered} \emptyset \sim \emptyset \qquad \frac{v \sim v' \quad \rho \sim \rho'} {\rho(x{:=}v) \sim \rho'(x{:=}v')} \\[2ex] \mathsf{val\_env}\;\emptyset \qquad \frac{\mathsf{is\_val}\; v \quad \mathsf{val\_env}\;\rho} {\mathsf{val\_env}\;\rho(x{:=}v)}\end{gathered}\]
We will need a few small lemmas concerning these definitions and their relationship with the \(\sqsubseteq\) ordering on values.
Proposition 2
If \(\mathsf{val\_env}\;\rho\) and \(\rho(x) = v\), then \(\mathsf{is\_val}\; v\).
If \(\rho \sim \rho'\), \(\rho(x) = v\), \(\rho'(x) = v'\), then \(v \sim v'\).
Proposition 3
If \(\mathsf{is\_val}\;v'\) and \(v \sqsubseteq v'\), then \(\mathsf{is\_val}\; v\).
If \(v_1 \sqsubseteq v'_1\), \(v_2 \sqsubseteq v'_2\), and \(v'_1 \sim v'_2\), then \(v_1 \sim v_2\).
We now come to the main theorem, which is proved by induction on \(e\), using the above three propositions.
Theorem (Determinism of \(\mathcal{E}\)) If \(v \in {\mathcal{E}[\![ e ]\!]}\rho\), \(v' \in {\mathcal{E}[\![ e ]\!]}\rho'\), \(\mathsf{val\_env}\;\rho\), \(\mathsf{val\_env}\;\rho'\), and \(\rho \sim \rho'\), then \(\mathsf{is\_val}\;v\), \(\mathsf{is\_val}\;v'\), and \(v \sim v'\).
No comments:
Post a Comment