Saturday, March 27, 2021

Strongly Connected Components and Kosaraju's Algorithm

Some presentations of Kosaraju’s Algorithm don’t provide a detailed explanation of why the algorithm works. Here’s my attempt to explain it.

The story begins with depth-first search (DFS). To review, DFS goes deeper at each step, following an out-edge from the current vertex to a never-before-seen vertex. If there are no out-edges to never-before-seen vertices, then the search backtracks to the last visited vertex with out-edges to never-before-seen vertices and continues from there.

The following figure shows the result of DFS on a small graph. The edges traversed by the DFS are marked in green and form a depth-first tree.

Example of Depth First Search.

We can categorize the edges of the graph with respect to the depth-first tree in the following way:

  • tree edge: an edge on the tree, e.g., g → c in the graph above.
  • back edge: an edge that connects a descendant to an ancestor with respect to the tree, e.g., f → g.
  • forward edge: an edge that connects an ancestor to a descendant with respect to the tree, e.g., f → e.
  • cross edge: all other edges, e.g., k → l.

Graph with edges categorized by a Depth-First Search. The tree edges are in green, back edges in red, forward edges in blue, and cross edges in black.

Theorem A graph has a cycle if and only if a depth-first search of the graph produces a back edge.
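This theorem suggests a simple cycle detector: run DFS and report a cycle as soon as an edge reaches a vertex that is still on the DFS stack, i.e., a back edge. Here is a minimal sketch (not code from the post; the white/gray/black coloring is the usual convention for "unvisited / on the stack / finished"):

```java
import java.util.*;

class CycleDetect {
    // 0 = white (unvisited), 1 = gray (still on the DFS stack), 2 = black (finished)
    static boolean hasCycle(List<List<Integer>> G, int u, int[] color) {
        color[u] = 1;
        for (int v : G.get(u)) {
            if (color[v] == 1) return true;          // back edge: cycle found
            if (color[v] == 0 && hasCycle(G, v, color)) return true;
        }
        color[u] = 2;
        return false;
    }

    public static void main(String[] args) {
        // path 0 -> 1 -> 2 (acyclic) versus cycle 0 -> 1 -> 2 -> 0
        List<List<Integer>> acyclic = List.of(List.of(1), List.of(2), List.of());
        List<List<Integer>> cyclic  = List.of(List.of(1), List.of(2), List.of(0));
        System.out.println(hasCycle(acyclic, 0, new int[3]));
        System.out.println(hasCycle(cyclic, 0, new int[3]));
    }
}
```

(For graphs where not every vertex is reachable from vertex 0, the search would be restarted from each white vertex, as in the DFS code below.)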

As we shall see, it is useful to record timestamps on a vertex when it is first discovered during a DFS and when it is finished (after visiting all of its descendants).

A graph with discover and finish times from a Depth-First Search.


  • If u → v is a tree, forward, or cross edge, then finish_time[v] < finish_time[u].
  • If u → v is a back edge, then finish_time[u] < finish_time[v].
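For concreteness, here is a hypothetical stand-alone variant of DFS_visit (not the code used below, which records only the finish order) that records both timestamps with a single counter:

```java
import java.util.*;

class Timestamps {
    static int time = 0;

    static void visit(List<List<Integer>> G, int u,
                      int[] discover, int[] finish, boolean[] visited) {
        visited[u] = true;
        discover[u] = time++;            // discover time: u is first reached
        for (int v : G.get(u))
            if (!visited[v])
                visit(G, v, discover, finish, visited);
        finish[u] = time++;              // finish time: all descendants are done
    }

    public static void main(String[] args) {
        // path graph 0 -> 1 -> 2: every edge is a tree edge,
        // so each child's finish time is smaller than its parent's
        List<List<Integer>> G = List.of(List.of(1), List.of(2), List.of());
        int[] d = new int[3], f = new int[3];
        visit(G, 0, d, f, new boolean[3]);
        System.out.println(Arrays.toString(d) + " " + Arrays.toString(f));
    }
}
```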

For reference, here is the code for DFS and its auxiliary function DFS_visit.

static void DFS_visit(List<List<Integer>> G, Integer u,
                      ArrayList<Integer> parent,
                      ArrayList<Boolean> visited, List<Integer> finish) {
    visited.set(u, true);
    for (Integer v : G.get(u)) {
        if (! visited.get(v)) {
            parent.set(v, u);
            DFS_visit(G, v, parent, visited, finish);
        }
    }
    finish.add(u);  // u is finished once all its descendants are visited
}

static List<Integer> DFS(List<List<Integer>> G) {
    ArrayList<Integer> parent = new ArrayList<>();
    ArrayList<Boolean> visited = new ArrayList<>();
    ArrayList<Integer> finish = new ArrayList<>();
    for (int u = 0; u != G.size(); ++u) {
        parent.add(u);        // initially, every vertex is its own parent
        visited.add(false);
    }
    for (int u = 0; u != G.size(); ++u) {
        if (! visited.get(u))
            DFS_visit(G, u, parent, visited, finish);
    }
    return finish;
}

Now we turn to discussing the problem of computing strongly connected components.

Definition A strongly connected component (SCC) is a maximal subset of the vertices of a graph such that every vertex in the subset is reachable from every other vertex in the subset.

For example, the following graph

has these strongly connected components:

Definition The component graph C of a digraph G has 1) a vertex for each SCC in G (for each vertex u in C, we write SCC(u) for its SCC in G) and 2) an edge from u to v if there is an edge from any vertex in SCC(u) to any vertex in SCC(v).

Here is the component graph of the example.

Theorem A component graph is acyclic.

Otherwise, the vertices on a cycle in the component graph would represent connected components that are not maximal: they could be combined into a larger SCC.

Kosaraju’s Algorithm for SCC

Suppose we do a DFS_visit from a random node in the graph G. We’ll visit all of the other nodes in its SCC (that’s good) but we may also visit nodes in other SCCs (that’s bad).

How can we cause DFS_visit to stop before visiting nodes in other SCCs?

If we run DFS_visit on a node in an SCC that has no out-edges to other SCCs, then we’d just visit the nodes in that SCC and no other. We could then remove those nodes from the graph and repeat.

That ordering is like a topological ordering of the component graph C, except that we want to start from a vertex with no out-edges instead of no in-edges. In other words, we need a topological ordering of the transposed component graph C^T.

Definition The transpose of a graph G, written G^T, has the same vertices as G but the edges are reversed.
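The SCC code later in this post calls a transpose helper that is not shown there; a minimal sketch for the List&lt;List&lt;Integer&gt;&gt; adjacency-list representation used throughout:

```java
import java.util.*;

class TransposeDemo {
    static List<List<Integer>> transpose(List<List<Integer>> G) {
        List<List<Integer>> GT = new ArrayList<>();
        for (int u = 0; u != G.size(); ++u)
            GT.add(new ArrayList<>());
        for (int u = 0; u != G.size(); ++u)
            for (Integer v : G.get(u))
                GT.get(v).add(u);   // edge u -> v becomes v -> u
        return GT;
    }

    public static void main(String[] args) {
        // 3-cycle 0 -> 1 -> 2 -> 0; its transpose is the reversed cycle
        List<List<Integer>> G = List.of(List.of(1), List.of(2), List.of(0));
        System.out.println(transpose(G));
    }
}
```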

For example,

D, B, A, E

is a topological ordering of C^T.

But we don’t have C yet… that’s what we’re trying to compute!

Recall that DFS finish times are related to topological ordering. We can apply DFS to G^T to obtain finish times. Here’s the transposed graph of the example with the root and edges of each DFS tree highlighted in green.

and here are the vertices ordered by finish time:

3, 0, 1, 5, 4, 2

The vertex that finished last (vertex 2) must be in an SCC (D) that does not have any in-edges in C^T. Why is that? If there were an in-edge from another SCC, then the source of that in-edge would have finished later, which contradicts vertex 2 being the last to finish. The only way the source of the in-edge could finish earlier would be if the edge were a back edge, but then the two vertices would lie on a cycle and hence be in the same SCC, which contradicts their being in different SCCs.

Since the SCC (D) does not have any in-edges in C^T, it doesn’t have any out-edges in C.

So running DFS_visit on vertex 2 in the original graph will only reach other vertices in its SCC (D). DFS_visit will mark all of those vertices as visited, so later runs of DFS_visit will ignore them.

We continue running DFS_visit on each vertex according to the reverse order of finish time (i.e., 4, 5, 1, 0, 3). Each tree in the resulting DFS forest is an SCC.

So here’s the algorithm

  1. Transpose the graph G to obtain G^T.
  2. Apply DFS to G^T to obtain the order in which the vertices finished.
  3. For each vertex u in the reversed finish list, apply DFS_visit to u in G.
  4. Each of the resulting DFS trees is an SCC. (The trees are encoded in the parent array.)

(The above differs from the standard presentation of Kosaraju’s algorithm, which instead applies DFS to G to get an ordering, and then applies DFS_visit to G^T repeatedly to get the DFS forest in which each tree is an SCC.)

static ArrayList<Integer> SCC(List<List<Integer>> G) {
  List<List<Integer>> GT = transpose(G);
  List<Integer> finished = DFS(GT);

  ArrayList<Integer> parent = new ArrayList<>();
  ArrayList<Boolean> visited = new ArrayList<>();
  for (int u = 0; u != G.size(); ++u) {
    parent.add(u);
    visited.add(false);
  }
  ArrayList<Integer> ignore = new ArrayList<>();
  for (int i = finished.size() - 1; i >= 0; --i) {  // reversed finish order
    Integer u = finished.get(i);
    if (! visited.get(u))
      DFS_visit(G, u, parent, visited, ignore);
  }
  return parent;
}
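As a sanity check, here is a self-contained version of all the pieces, run on a small hypothetical graph (not the figures from the post): two 2-cycles joined by the edge 1 → 2, so the SCCs are {0,1} and {2,3}. Each vertex's component can be recovered by following parent links up to a root.

```java
import java.util.*;

class SCCDemo {
    static void DFS_visit(List<List<Integer>> G, Integer u,
                          ArrayList<Integer> parent,
                          ArrayList<Boolean> visited, List<Integer> finish) {
        visited.set(u, true);
        for (Integer v : G.get(u)) {
            if (!visited.get(v)) {
                parent.set(v, u);
                DFS_visit(G, v, parent, visited, finish);
            }
        }
        finish.add(u);
    }

    static List<Integer> DFS(List<List<Integer>> G) {
        ArrayList<Integer> parent = new ArrayList<>();
        ArrayList<Boolean> visited = new ArrayList<>();
        ArrayList<Integer> finish = new ArrayList<>();
        for (int u = 0; u != G.size(); ++u) { parent.add(u); visited.add(false); }
        for (int u = 0; u != G.size(); ++u)
            if (!visited.get(u))
                DFS_visit(G, u, parent, visited, finish);
        return finish;
    }

    static List<List<Integer>> transpose(List<List<Integer>> G) {
        List<List<Integer>> GT = new ArrayList<>();
        for (int u = 0; u != G.size(); ++u) GT.add(new ArrayList<>());
        for (int u = 0; u != G.size(); ++u)
            for (Integer v : G.get(u)) GT.get(v).add(u);
        return GT;
    }

    static ArrayList<Integer> SCC(List<List<Integer>> G) {
        List<Integer> finished = DFS(transpose(G));
        ArrayList<Integer> parent = new ArrayList<>();
        ArrayList<Boolean> visited = new ArrayList<>();
        for (int u = 0; u != G.size(); ++u) { parent.add(u); visited.add(false); }
        ArrayList<Integer> ignore = new ArrayList<>();
        for (int i = finished.size() - 1; i >= 0; --i) {  // reversed finish order
            Integer u = finished.get(i);
            if (!visited.get(u))
                DFS_visit(G, u, parent, visited, ignore);
        }
        return parent;
    }

    public static void main(String[] args) {
        // 0 <-> 1, edge 1 -> 2, 2 <-> 3: the SCCs are {0,1} and {2,3}
        List<List<Integer>> G =
            List.of(List.of(1), List.of(0, 2), List.of(3), List.of(2));
        System.out.println(SCC(G));  // each tree's root is its own parent
    }
}
```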

Friday, July 10, 2020

Type Safety in Two Easy Lemmas


Wow, it's been seven years already since I blogged about Type Safety in Three Easy Lemmas. Time flies! In that blog post I showed how to prove type safety of a simple language whose semantics was specified by a definitional interpreter. I still like that approach, and it has proved useful to other researchers on much larger projects such as the verified CakeML compiler.

In the meantime, I've learned about the Agda proof assistant thanks to the book Programming Language Foundations in Agda (PLFA) and I've become excited by Agda's abstraction mechanisms that enable proof reuse. I'm working on an Agda library for reusable programming language metatheory, called abstract-binding-trees. As the name suggests, it represents abstract syntax trees using Robert Harper's notion of abstract binding trees (ABT), that is, trees that are enhanced to know about variable bindings and variable occurrences (See the book Practical Foundations for Programming Languages). My library provides a suite of useful functions on abstract binding trees, such as substitution, and theorems about those functions. The neat thing about these theorems is that they automatically apply to any language whose grammar is built using abstract binding trees!

In this blog post I'll prove type safety of the simply-typed lambda calculus (STLC) with respect to a semantics specified in the standard way for PL theory, using a reduction semantics. The proof includes just two easy lemmas: progress and preservation. Normally a proof via progress and preservation also requires quite a few technical lemmas about substitution, but in this case we get those lemmas for free thanks to the abstract-binding-trees library.

This blog post is a literate Agda file, so the text will be interspersed with the Agda code that defines the STLC and proves type safety.

module examples.BlogTypeSafetyTwoEasy where

We'll be making use of the following items from the Agda standard library.

open import Data.List using (List; []; _∷_; length)
open import Data.Nat using (ℕ; zero; suc)
open import Data.Product using (_×_; proj₁; proj₂) renaming (_,_ to ⟨_,_⟩ )
open import Data.Unit.Polymorphic using (⊤; tt)
open import Data.Vec using (Vec) renaming ([] to []̌; _∷_ to _∷̌_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl; sym)

Syntax of the STLC

The abstract-binding-trees library provides a module named Syntax that provides facilities for creating abstract binding trees.

open import Syntax

An abstract binding tree ABT consists of two kinds of nodes:

  • Variables: A variable node is a leaf (no children) and stores the de Bruijn index for the variable.

  • Operators: An operator node is tagged with the kind of operator and it has zero or more children, depending on the kind of operator.

The ABT data type is parameterized by the kinds of operators and their signatures, which specifies things like the number of child nodes for each kind of operator. To specify the operators, you create a data type definition with one constructor for each kind of operator. For the STLC the operators are lambda abstraction and application.

data Op : Set where
  op-lam : Op
  op-app : Op

To specify the operator signatures, write a function that maps each operator to a list of the Sig data type. The length of the list gives the number of child nodes, and each Sig controls changes in variable scoping for the corresponding child. The Sig data type is defined by the abstract-binding-trees library as follows:

data Sig : Set where
  ■ : Sig
  ν : Sig → Sig
  ∁ : Sig → Sig

The ν brings a variable into scope. The ∁ clears the scope of the child, so that the child does not have access to the surrounding lexical scope. The ■ terminates the changes in scope.

For the STLC, the signature function is defined as follows.

sig : Op → List Sig
sig op-lam = (ν ■) ∷ []
sig op-app = ■ ∷ ■ ∷ []

With Op and sig defined, we can import the abstract binding tree data type ABT from the Syntax library. We choose to rename it to Term.

open Syntax.OpSig Op sig renaming (ABT to Term)

The raw abstract binding trees are verbose to deal with, so we use Agda pattern synonyms to obtain syntax that is closer to the pen-and-paper STLC. We write ƛ N for a lambda abstraction with body N, and we write L · M for the application of the function produced by L to the argument produced by M.

pattern ƛ N  = op-lam ⦅ cons (bind (ast N)) nil ⦆

infixl 7  _·_
pattern _·_ L M = op-app ⦅ cons (ast L) (cons (ast M) nil) ⦆

Reduction Semantics

We define the reduction semantics for the STLC in the usual way, with several congruence rules (the ξ's) and the β rule for function application. In the β rule, we use the substitution function defined in the abstract-binding-trees library, writing N [ M ] for replacing all the occurrences of de Bruijn index 0 inside N with the term M.

infix 2 _—→_

data _—→_ : Term → Term → Set where

  ξ-·₁ : ∀ {L L′ M : Term}
    → L —→ L′
    → L · M —→ L′ · M

  ξ-·₂ : ∀ {L M M′ : Term}
    → M —→ M′
    → L · M —→ L · M′

  ξ-ƛ : ∀ {N N′ : Term}
    → N —→ N′
    → (ƛ N) —→ (ƛ N′)

  β-ƛ : ∀ {N M : Term}
    → (ƛ N) · M —→ N [ M ]

Type System

To make use of the theorems in the abstract-binding-trees library, we need to use its approach to defining type systems. Instead of defining the whole type system ourselves using an Agda data type, we instead specify 1) the types and 2) the side-conditions for each typing rule.

For STLC, we have function types, written A ⇒ B, and the bottom type Bot.

data Type : Set where
  Bot   : Type
  _⇒_   : Type → Type → Type

The library asks that we specify a side condition for the variable rule that mediates the variable's type A in the environment with the expected type B, for which we define the following predicate 𝑉. For the STLC we simply require that A ≡ B.

𝑉 : List Type → Var → Type → Type → Set
𝑉 Γ x A B = A ≡ B

Next we define the predicate 𝑃 that specifies the side conditions for all the other syntax nodes. The definition of 𝑃 includes one line for each operator. The Vec parameter contains the types of the child nodes. The BTypes parameter contains the types of the bound variables. The last Type parameter is the type assigned to the current node. So for lambda abstractions (op-lam), the body has type B, the lambda's bound variable has type A, and we require that the type C of the lambda is a function type from A to B, that is, C ≡ A ⇒ B. For application (op-app), the function has type C, the argument has type A, and the result type is B provided that C is a function type from A to B, that is, C ≡ A ⇒ B.

𝑃 : (op : Op) → Vec Type (length (sig op)) → BTypes Type (sig op) → Type → Set
𝑃 op-lam (B ∷̌ []̌) ⟨ ⟨ A , tt ⟩ , tt ⟩ C = C ≡ A ⇒ B
𝑃 op-app (C ∷̌ A ∷̌ []̌) ⟨ tt , ⟨ tt , tt ⟩ ⟩ B = C ≡ A ⇒ B

We import the ABTPredicate module, using our definitions of 𝑉 and 𝑃, to obtain the type system for the STLC.

open import ABTPredicate Op sig 𝑉 𝑃

The raw typing rules are verbose, so we again use Agda's pattern synonyms to create abbreviations to match the rule names in PLFA.

pattern ⊢` ∋x = var-p ∋x refl
pattern ⊢ƛ ⊢N eq = op-p {op = op-lam} (cons-p (bind-p (ast-p ⊢N)) nil-p) eq
pattern ⊢· ⊢L ⊢M eq = op-p {op = op-app}
                           (cons-p (ast-p ⊢L) (cons-p (ast-p ⊢M) nil-p)) eq

Proof of Type Safety

We prove type safety with two lemmas: progress and preservation.

Proof of Progress

The progress lemma states that every closed, well-typed term is either a value (so it's finished computing) or it can reduce.

In the STLC, lambda abstractions are values.

data Value : Term → Set where

  V-ƛ : ∀ {N : Term}
    → Value (ƛ N)

Following PLFA, we define an auxiliary data type to express the conclusion of the progress lemma.

data Progress (M : Term) : Set where

  done :
      Value M
    → Progress M

  step : ∀ {N}
    → M —→ N
    → Progress M

The proof of progress is by induction on the typing derivation. The variable case is vacuous because M is closed (well typed in an empty environment). In the lambda case, we're done. Regarding an application L · M, the induction hypothesis tells us that term L either takes a step or is already a lambda abstraction. In the former case, the whole application reduces using the congruence rule ξ-·₁. In the latter case, the whole application reduces using β reduction.

progress : ∀ {M A}
  → [] ⊢ M ⦂ A
  → Progress M
progress (⊢` ())
progress (⊢ƛ ⊢N _)                          =  done V-ƛ
progress (⊢· ⊢L ⊢M _)
    with progress ⊢L
... | step L—→L′                            =  step (ξ-·₁ L—→L′)
... | done V-ƛ                              =  step β-ƛ

As you can see, to prove progress we didn't need help from the abstract-binding-trees library.

Proof of Preservation

The preservation lemma says that if a well-typed term reduces to another term, then that term is also well typed. The proof is by induction on the derivation of the reduction. The only interesting case is the one for β reduction:

(ƛ N) · M —→ N [ M ]

We know that

(A ∷ Γ) ⊢ N ⦂ B
Γ ⊢ M ⦂ A

and we need to prove that

Γ ⊢ N [ M ] ⦂ B

This requires the lemma that substitution preserves typing, which is provided in the SubstPreserve module of the abstract-binding-trees library. This module places four restrictions on 𝑉, for which we provide the proofs (λ x → refl), etc.

open import SubstPreserve Op sig Type 𝑉 𝑃 (λ x → refl) (λ { refl refl → refl })
    (λ x → x) (λ { refl ⊢M → ⊢M }) using (preserve-substitution)

So here is the proof of preservation.

preserve : ∀ {Γ M N A}
  → Γ ⊢ M ⦂ A
  → M —→ N
  → Γ ⊢ N ⦂ A
preserve (⊢· ⊢L ⊢M refl) (ξ-·₁ L—→L′) = ⊢· (preserve ⊢L L—→L′) ⊢M refl
preserve (⊢· ⊢L ⊢M refl) (ξ-·₂ M—→M′) = ⊢· ⊢L (preserve ⊢M M—→M′) refl
preserve (⊢ƛ ⊢M refl) (ξ-ƛ M—→N) = ⊢ƛ (preserve ⊢M M—→N) refl
preserve {M = (ƛ N) · M} (⊢· (⊢ƛ ⊢N refl) ⊢M refl) β-ƛ =
    preserve-substitution N M ⊢N ⊢M

Thus we conclude the proof of type safety, having only needed to prove two lemmas, progress and preservation. Thanks to the abstract-binding-trees library, we did not need to prove that substitution preserves types nor any of the many technical lemmas that it depends on.

Wednesday, September 12, 2018

Reading list for getting started on Gradual Typing

Which papers would I recommend for getting started on understanding the research on gradual typing? That's a hard question because there are a lot of papers to choose from and, as research papers, their primary goal was not to give a good introduction, but instead to describe some scientific contribution. I really ought to write a proper introduction, but in the meantime, here's my choice of a few papers to get started.
  1. Refined Criteria for Gradual Typing
    This paper does a decent job of surveying research related to gradual typing and situating it with respect to other areas of research in programming languages and type systems. The paper includes a modern and, what I would deem canonical, specification of the Gradually Typed Lambda Calculus (GTLC). Finally, the paper gives formal criteria for what it means for a language to be gradually typed, including the gradual guarantee.
  2. Blame and Coercion: Together Again for the First Time (alternative location)
    The runtime semantics of a gradually typed language is typically given in two parts: 1) a translation to a cast calculus and 2) an operational semantics for the cast calculus. Nowadays,  I recommend using coercions to express casts because they help to constrain the design space in a good way, they are easily extended to handle blame tracking, and they can be compressed to ensure space efficiency (time too!). This paper defines an easy-to-understand coercion calculus \(\lambda C\) and a space-efficient calculus \(\lambda S\), proves that they are equivalent to the standard cast calculus \(\lambda B\), and also reviews the blame safety theorem.
  3. Abstracting Gradual Typing (alternative location)
    This paper presents a general framework based on abstract interpretation for understanding gradual typing and for extending gradual typing to handle languages that make use of other predicates on types, such as subtyping. The framework provides guidance for how to define the consistency relation and for how to derive an operational semantics.
After reading the above papers, there's plenty more to enjoy! See the bibliography maintained by Sam Tobin-Hochstadt.

Thursday, August 09, 2018

Intersection Types, Sub-formula Property, and the Functional Character of the Lambda Calculus


Last December I proved that my graph model of the lambda calculus, once suitable restricted, is deterministic. That is, I defined a notion of consistency between values, written \(v_1 \sim v_2\), and showed that any two outputs of the same program are consistent.
Theorem (Determinism)
If \(v \in {\mathcal{E}{[\![ e ]\!]}}\rho\), \(v' \in {\mathcal{E}{[\![ e ]\!]}}\rho'\), and \(\rho \sim \rho'\), then \(v \sim v'\).
Recall that values are integers or finite relations; consistency for integers is equality and consistency for relations means mapping consistent inputs to consistent outputs. I then restricted values to be well formed, meaning that they must be consistent with themselves (and similarly for their parts).

Having proved the Determinism Theorem, I thought it would be straightforward to prove the following related theorem about the join of two values.
Theorem (Join)
If \(v \in {\mathcal{E}{[\![ e ]\!]}}\rho\), \(v' \in {\mathcal{E}{[\![ e ]\!]}}\rho'\), \(\rho\) is well formed, \(\rho'\) is well formed, and \(\rho \sim \rho'\),
then \(v \sqcup v' \in {\mathcal{E}{[\![ e ]\!]}}(\rho\sqcup\rho')\).
I am particularly interested in this theorem because \(\beta\)-equality can be obtained as a corollary. \[{\mathcal{E}{[\![ ({\lambda x.\,}e){\;}e' ]\!]}}\rho = {\mathcal{E}{[\![ [x{:=}e']e ]\!]}}\rho\] This would enable the modeling of the call-by-name \(\lambda\)-calculus and it would also enable the use of \(\beta\)-equality in a call-by-value setting when \(e'\) is terminating (instead of restricting \(e'\) to be a syntactic value).

Recall that we have defined a partial order \(\sqsubseteq\) on values, and that, in most partial orders, there is a close connection between notions of consistency and least upper bounds (joins). One typically has that \(v \sim v'\) iff \(v \sqcup v'\) exists. So my thinking was that it should be easy to adapt my proof of the Determinism Theorem to prove the Join Theorem, and I set out hoping to finish in a couple weeks. Hah! Here we are 8 months later and the proof is complete; it was a long journey that ended up depending on a result that was published just this summer, concerning intersection types, the sub-formula property, and cut elimination by Olivier Laurent. In this blog post I’ll try to recount the journey and describe the proof, hopefully remembering the challenges and motivations. Here is a tar ball of the mechanization in Isabelle and in pdf form.

Many of the challenges revolved around the definitions of \(\sqsubseteq\), consistency, and \(\sqcup\). Given that I already had definitions for \(\sqsubseteq\) and consistency, the obvious thing to try was to define \(\sqcup\) such that it would be the least upper bound of \(\sqsubseteq\). So I arrived at this partial function: \[\begin{aligned} n \sqcup n &= n \\ f_1 \sqcup f_2 &= f_1 \cup f_2\end{aligned}\] Now suppose we prove the Join Theorem by induction on \(e\) and consider the case for application: \(e = (e_1 {\;}e_2)\). From \(v \in {\mathcal{E}{[\![ e_1 {\;}e_2 ]\!]}}\rho\) and \(v' \in {\mathcal{E}{[\![ e_1 {\;}e_2 ]\!]}}\rho'\) we have

  • \(f \in {\mathcal{E}{[\![ e_1 ]\!]}}\rho\), \(v_2 \in {\mathcal{E}{[\![ e_2 ]\!]}}\rho\), \(v_3 \mapsto v_4 \in f\), \(v_3 \sqsubseteq v_2\), and \(v \sqsubseteq v_4\) for some \(f, v_2, v_3, v_4\).

  • \(f' \in {\mathcal{E}{[\![ e_1 ]\!]}}\rho'\), \(v'_2 \in {\mathcal{E}{[\![ e_2 ]\!]}}\rho'\), \(v'_3 \mapsto v'_4 \in f'\), \(v'_3 \sqsubseteq v'_2\), and \(v' \sqsubseteq v'_4\) for some \(f', v'_2, v'_3, v'_4\).

By the induction hypothesis we have \(f \sqcup f' \in {\mathcal{E}{[\![ e_1 ]\!]}}(\rho\sqcup\rho')\) and \(v_2 \sqcup v'_2 \in {\mathcal{E}{[\![ e_2 ]\!]}}(\rho\sqcup\rho')\). We need to show that \[v''_3 \mapsto v''_4 \in f \sqcup f' \qquad v''_3 \sqsubseteq v_2 \sqcup v'_2 \qquad v \sqcup v' \sqsubseteq v''_4\] for some \(v''_3, v''_4\). But here we have a problem. Given our definition of \(\sqcup\) in terms of set union, there won’t necessarily be a single entry in \(f \sqcup f'\) that combines the information from both \(v_3 \mapsto v_4\) and \(v'_3 \mapsto v'_4\). After all, \(f \sqcup f'\) contains all the entries of \(f\) and all the entries of \(f'\), but the set union operation does not mix together information from entries in \(f\) and \(f'\) to form new entries.

Intersection Types to the Rescue

At this point I started thinking that my definitions of \(\sqsubseteq\), consistency, and \(\sqcup\) were too simple, and that I needed to incorporate ideas from the literature on filter models and intersection types. As I’ve written about previously, my graph model corresponds to a particular intersection type system, and perhaps a different intersection type system would do the job. Recall that the correspondence goes as follows: values correspond to types, \(\sqsubseteq\) corresponds to subtyping \(<:\) (in reverse), and \(\sqcup\) corresponds to intersection \(\sqcap\). The various intersection type systems primarily differ in their definitions of subtyping. Given the above proof attempt, I figured that I would need the usual co/contra-variant rule for function types and also the following rule for distributing intersections over function types. \[(A\to B) \sqcap (A \to C) <: A \to (B \sqcap C)\] This distributivity rule enables the “mixing” of information from two different entries.

So I defined types as follows: \[A,B,C,D ::= n \mid A \to B \mid A \sqcap B\] and defined subtyping according to the BCD intersection type system (Lambda Calculus with Types, Barendregt et al. 2013). \[\begin{gathered} A <: A \qquad \frac{A <: B \quad B <: C}{A <: C} \\[2ex] A \sqcap B <: A \qquad A \sqcap B <: B \qquad \frac{C <: A \quad C <: B}{C <: A \sqcap B} \\[2ex] \frac{C <: A \quad B <: D}{A \to B <: C \to D} \qquad (A\to B) \sqcap (A \to C) <: A \to (B \sqcap C)\end{gathered}\] I then adapted the definition of consistency to work over types. (Because this definition uses negation, it is easier to define consistency as a recursive function in Isabelle instead of as an inductively defined relation.) \[\begin{aligned} n \sim n' &= (n = n') \\ n \sim (C \to D) &= \mathit{false} \\ n \sim (C \sqcap D) &= n \sim C \text{ and } n \sim D \\ (A \to B) \sim n' &= \mathit{false} \\ (A \to B) \sim (C \to D) &= (A \sim C \text{ and } B \sim D) \text{ or } A \not\sim C \\ (A \to B) \sim (C \sqcap D) &= (A \to B) \sim C \text{ and } (A \to B) \sim D \\ (A \sqcap B) \sim n' &= A \sim n' \text{ and } B \sim n' \\ (A \sqcap B) \sim (C \sqcap D) &= A \sim C \text{ and } A \sim D \text{ and } B \sim C \text{ and } B \sim D\end{aligned}\]

Turning back to the Join Theorem, I restated it in terms of the intersection type system and rebranded it the Meet Theorem. Instead of using the letter \(\rho\) for environments, we shall switch to \(\Gamma\) because they now contain types instead of values.
Theorem (Meet)
If \(\Gamma \vdash e : A\), \(\Gamma' \vdash e : B\), and \(\Gamma \sim \Gamma'\), then \(\Gamma\sqcap\Gamma' \vdash e : A \sqcap B\).
By restating the theorem in terms of intersection types, we have essentially arrived at the rule for intersection introduction. In other words, if we can prove this theorem we will have shown that the intersection introduction rule is admissible in our system.

While the switch to intersection types and subtyping enabled this top-level proof to go through, I got stuck on one of the lemmas that it requires, which is an adaptation of Proposition 3 of the prior blog post.
Lemma (Consistency and Subtyping)

  1. If \(A \sim B\), \(A <: C\), and \(B <: D\), then \(C \sim D\).

  2. If \(A \not\sim B\), \(C <: A\), \(D <: B\), then \(C \not\sim D\).

In particular, I got stuck in the cases where the subtyping \(A <: C\) or \(B <: D\) was derived using the transitivity rule.

Subtyping and the Sub-formula Property

For a long time I’ve disliked definitions of subtyping in which transitivity is given as a rule instead of proved as a theorem. There are several reasons for this: a subtyping algorithm can’t directly implement a transitivity rule (or any rule that is not syntax directed), reasoning by induction or cases (inversion) is more difficult, and it is redundant. Furthermore, the presence of the transitivity rule means that subtyping does not satisfy the sub-formula property. The term sub-formula property comes from logic and means that a derivation (proof) of a formula mentions only propositions that are part of the formula to be proved. The transitivity rule breaks this property because the type \(B\) comes out of nowhere; it is not part of \(A\) or \(C\), the types in the conclusion of the rule.

So I removed the transitivity rule and tried to prove transitivity. For most type systems, proving the transitivity of subtyping is straightforward. But I soon realized that the addition of the distributivity rule makes it significantly more difficult. After trying and failing to prove transitivity for some time, I resorted to reading the literature. Unfortunately, it turns out that none of the published intersection type systems satisfied the sub-formula property, and the vast majority of them included the transitivity rule. However, there was one paper that offered some hope. In a 2012 article in Fundamenta Informaticae titled Intersection Types with Subtyping by Means of Cut Elimination, Olivier Laurent defined subtyping without transitivity and instead proved it, but his system still did not satisfy the sub-formula property because of an additional rule for function types. Nevertheless, Olivier indicated that he was interested in finding a version of the system that did, writing

“it would be much nicer and much more natural to go through a sub-formula property”

A lot of progress can happen in six years, so I sent an email to Olivier. He replied,

“Indeed! I now have two different sequent-calculus systems which are equivalent to BCD subtyping and satisfy the sub-formula property. I am currently writting a paper on this but it is not ready yet.”

and he attached the paper draft and the Coq mechanization. What great timing! Furthermore, Olivier would be presenting the paper, titled Intersection Subtyping with Constructors, at the Workshop on Intersection Types and Related System in Oxford on July 8, part of the Federated Logic Conference (FLOC). I was planning to attend FLOC anyways, for the DOMAINS workshop to celebrate Dana Scott’s 85th birthday.

Olivier’s system makes two important changes compared to prior work: he combines the distributivity rule and the usual arrow rule into a single elegant rule, and to enable this, he generalizes the form of subtyping from \(A <: B\) to \(A_1,\ldots,A_n \vdash B\), which should be interpreted as meaning \(A_1 \sqcap \cdots \sqcap A_n <: B\). Having a sequence of formulas (types) on the left is characteristic of proof systems in logic, including both natural deduction systems and sequent calculi. (Sequent calculi, in addition, typically have a sequence on the right that means the disjunction of the formulas.) Here is one of Olivier’s systems, adapted to my setting, which I’ll describe below. Let \(\Gamma\) range over sequences of types. \[\begin{gathered} \frac{\Gamma_1, \Gamma_2 \vdash A} {\Gamma_1 , n, \Gamma_2 \vdash A} \qquad \frac{\Gamma_1, \Gamma_2 \vdash A} {\Gamma_1 , B \to C, \Gamma_2 \vdash A} \\[2ex] \frac{\Gamma \vdash A \quad \Gamma \vdash B}{\Gamma \vdash A \sqcap B} \qquad \frac{\Gamma_1,B,C,\Gamma_2 \vdash A}{\Gamma_1,B\sqcap C,\Gamma_2 \vdash A} \\[2ex] \frac{}{n \vdash n} \qquad \frac{A \vdash C_1, \ldots, C_n \quad D_1, \ldots, D_n \vdash B} {C_1\to D_1,\ldots, C_n\to D_n \vdash A \to B}\end{gathered}\] The first two rules are weakening rules for singleton integers and function types. There is no weakening rule for intersections. The third and fourth rules are introduction and elimination rules for intersection. The fifth rule is reflexivity for integers, and the last is the combined rule for function types.

The combined rule for function types says that the intersection of a sequence of function types \({\sqcap}_{i=1\ldots n} (C_i\to D_i)\) is a subtype of \(A \to B\) if \[A <: {\sqcap}_{i\in\{1\ldots n\}} C_i \qquad \text{and}\qquad {\sqcap}_{i\in\{1\ldots n\}} D_i <: B\] Interestingly, the inversion principle for this rule is the \(\beta\)-sound property described in Chapter 14 of Lambda Calculus with Types by Barendregt et al., and is the key to proving \(\beta\)-equality. In Olivier’s system, \(\beta\)-soundness falls out immediately, instead of by a somewhat involved proof.

The regular subtyping rule for function types is simply an instance of the combined rule in which the sequence on the left contains just one function type.

The next step for me was to enter Olivier’s definitions into Isabelle and prove transitivity via cut elimination. That is, I needed to prove the following generalized statement via a sequence of lemmas laid out by Olivier in his draft.
Theorem (Cut Elimination)
If \(\Gamma_2 \vdash B\) and \(\Gamma_1,B,\Gamma_3 \vdash C\), then \(\Gamma_1,\Gamma_2,\Gamma_3 \vdash C\).
The transitivity rule is the instance of cut elimination where \(\Gamma_2 = A\) and both \(\Gamma_1\) and \(\Gamma_3\) are empty.

Unfortunately, I couldn’t resist making changes to Olivier’s subtyping system as I entered it into Isabelle, which cost me considerable time. Some of Olivier’s lemmas show that the collection of types on the left, that is, the \(A\)’s in \(A_1,\ldots, A_n \vdash B\), behaves like a set rather than a sequence. I figured that if the left-hand side were represented as a set, I could bypass several lemmas and obtain a shorter proof. I got stuck proving Lemma \(\cap L_e\), which states that \(\Gamma_1,A\sqcap B,\Gamma_2 \vdash C\) implies \(\Gamma_1,A, B,\Gamma_2 \vdash C\). Olivier’s subtyping rules are carefully designed to minimize the amount of overlap between the rules, and switching to a set representation increases the overlap, making the proof of this lemma more difficult (perhaps impossible?).

So after struggling with the set representation for some time, I went back to sequences and was able to complete the proof of cut elimination, with a little help from Olivier at FLOC. I proved the required lemmas in the following order.
Lemma (Weakening)
If \(\Gamma_1,\Gamma_2 \vdash A\), then \(\Gamma_1,B,\Gamma_2 \vdash A\).
(Proved by induction on \(B\).)
Lemma (Axiom)
\(A \vdash A\)
(Proved by induction on \(A\).)
Lemma (Permutation)
If \(\Gamma_1 \vdash A\) and \(\Gamma_2\) is a permutation of \(\Gamma_1\), then \(\Gamma_2 \vdash A\).
(Proved by induction on the derivation of \(\Gamma_1 \vdash A\), using many lemmas about permutations.)
Lemma (\(\cap L_e\))
If \(\Gamma_1,A\sqcap B,\Gamma_2 \vdash C\), then \(\Gamma_1,A, B,\Gamma_2 \vdash C\).
(Proved by induction on the derivation of \(\Gamma_1,A\sqcap B,\Gamma_2 \vdash C\).)
Lemma (Collapse Duplicates)
If \(\Gamma_1,A,A,\Gamma_2 \vdash C\), then \(\Gamma_1,A,\Gamma_2 \vdash C\).
(This is proved by well-founded induction on the lexicographical ordering of the pair \((n,k)\) where \(n\) is the size of \(A\) and \(k\) is the depth of the derivation of \(\Gamma_1,A,A,\Gamma_2 \vdash C\). Proof assistants such as Isabelle and Coq do not directly provide the depth of a derivation, but the depth can be manually encoded as an extra argument of the relation, as in \(\Gamma_1,A,A,\Gamma_2 \vdash_k C\).)
The Cut Elimination Theorem is then proved by well-founded induction on the triple \((n,k_1,k_2)\) where \(n\) is the size of \(B\), \(k_1\) is the depth of the derivation of \(\Gamma_2 \vdash B\), and \(k_2\) is the depth of the derivation of \(\Gamma_1,B,\Gamma_3 \vdash C\).

We define subtyping as follows. \[A <: B \quad = \quad A \vdash B\]

The BCD subtyping rules and other derived rules follow from the above lemmas.
Proposition (Properties of Subtyping)

  1. \(A <: A\).

  2. If \(A <: B\) and \(B <: C\), then \(A <: C\).

  3. If \(C <: A\) and \(B <: D\), then \(A \to B <: C \to D\).

  4. If \(A_1 <: B\), then \(A_1 \sqcap A_2 <: B\).

  5. If \(A_2 <: B\), then \(A_1 \sqcap A_2 <: B\).

  6. If \(B <: A_1\) and \(B <: A_2\), then \(B <: A_1 \sqcap A_2\).

  7. If \(A <: C\) and \(B <: D\), then \(A \sqcap B <: C \sqcap D\).

  8. \((A\to B) \sqcap (A \to C) <: A \to (B \sqcap C)\).

  9. \((A \to C) \sqcap (B \to D) <: (A\sqcap B) \to (C \sqcap D)\).

Consistency and Subtyping, Resolved

Recall that my switch to intersection types was motivated by my failure to prove the Consistency and Subtyping Lemma. We now return to the proof of that Lemma. We start with a handful of lemmas that are needed for that proof.
Lemma (Consistency is Symmetric and Reflexive)

  1. If \(A \sim B\), then \(B \sim A\).

  2. If \({\mathsf{wf}(A)}\), then \(A \sim A\).

It will often be convenient to decompose a type into its set of atoms, defined as follows. \[\begin{aligned} {\mathit{atoms}(n)} &= \{ n \} \\ {\mathit{atoms}(A\to B)} &= \{ A \to B \} \\ {\mathit{atoms}(A \sqcap B)} &= {\mathit{atoms}(A)} \cup {\mathit{atoms}(B)}\end{aligned}\]
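In code form (a small illustration of my own, representing singleton integer types as Python ints and using hypothetical `Arrow` and `Inter` constructors), the decomposition into atoms is a one-case recursion:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Arrow:
    dom: object
    cod: object

@dataclass(frozen=True)
class Inter:
    left: object
    right: object

def atoms(t):
    """atoms(n) = {n}; atoms(A -> B) = {A -> B};
    atoms(A /\\ B) = atoms(A) union atoms(B)."""
    if isinstance(t, Inter):
        return atoms(t.left) | atoms(t.right)
    return {t}
```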

The consistency of two types is determined by the consistency of their atoms.
Lemma (Atomic Consistency)

  1. If \(A \sim B\), \(C \in {\mathit{atoms}(A)}\), and \(D \in {\mathit{atoms}(B)}\), then \(C \sim D\).

  2. If (for any \(C \in {\mathit{atoms}(A)}\) and \(D \in {\mathit{atoms}(B)}\), \(C \sim D\)), then \(A \sim B\).

  3. If \(A \not\sim B\), then \(C \not\sim D\) for some \(C \in {\mathit{atoms}(A)}\) and \(D \in {\mathit{atoms}(B)}\).

  4. If \(C \not\sim D\), \(C \in {\mathit{atoms}(A)}\), and \(D \in {\mathit{atoms}(B)}\), then \(A \not\sim B\).

There are also several properties of subtyping and the atoms of a type.
Lemma (Atomic Subtyping)

  1. If \(A <: B\) and \(C \in {\mathit{atoms}(B)}\), then \(A <: C\).

  2. If \(A <: n\), then \(n \in {\mathit{atoms}(A)}\).

  3. \(n <: A\) if and only if \({\mathit{atoms}(A)} \subseteq \{ n \}\).

  4. If \(C <: A \to B\), then \(D\to E \in {\mathit{atoms}(C)}\) for some \(D,E\).

  5. If \(\Gamma \vdash A\) and every atom in \(\Gamma\) is a function type, then every atom of \(A\) is a function type.

And we have the following important inversion lemma for function types. We use the following abbreviations: \[\begin{aligned} \mathrm{dom}(\Gamma) &= \{ A \mid \exists B.\; A \to B \in \Gamma \}\\ \mathrm{cod}(\Gamma) &= \{ B \mid \exists A.\; A \to B \in \Gamma \}\end{aligned}\]

Lemma (Subtyping Inversion for Function Types)
If \(C <: A \to B\), then there is a sequence of function types \(\Gamma\) such that

  1. each element of \(\Gamma\) is an atom of \(C\),

  2. For every \(D\to E \in \Gamma\), we have \(A <: D\), and

  3. \({\sqcap}\mathrm{cod}(\Gamma) <: B\).

Note that item 2 above implies that \(A <: {\sqcap}\mathrm{dom}(\Gamma)\).

Lemma (Consistency and Subtyping)

  1. If \(A \sim B\), \(A <: C\), and \(B <: D\), then \(C \sim D\).

  2. If \(A \not\sim B\), \(C <: A\), \(D <: B\), then \(C \not\sim D\).

(1) The proof is by strong induction on the sum of the depths of \(A\), \(B\), \(C\), and \(D\). We define the depth of a type as follows. \[\begin{aligned} \mathit{depth}(n) &= 0 \\ \mathit{depth}(A \to B) &= 1 + \mathrm{max}(\mathit{depth}(A),\mathit{depth}(B)) \\ \mathit{depth}(A \sqcap B) &= \mathrm{max}(\mathit{depth}(A),\mathit{depth}(B)) \end{aligned}\] To show that \(C \sim D\) it suffices to show that all of their atoms are consistent. Suppose \(C' \in {\mathit{atoms}(C)}\) and \(D'\in{\mathit{atoms}(D)}\). So we need to show that \(C' \sim D'\). We proceed by cases on \(C'\).

  • Case \(C'=n_1\):
    We have \(A <: C'\) and therefore \(n_1 \in {\mathit{atoms}(A)}\). Then because \(A \sim B\), we have \({\mathit{atoms}(B)} \subseteq \{n_1\}\) and hence \(n_1 <: B\). We have \(B <: D'\), so by transitivity \(n_1 <: D'\) and thus \({\mathit{atoms}(D')} \subseteq \{n_1\}\), that is, \(D' = n_1\). Therefore \(C' \sim D'\).

  • Case \(C'=C_1\to C_2\):
    We have \(A <: C_1 \to C_2\), so by inversion we have some sequence of function types \(\Gamma_1\) such that every element of \(\Gamma_1\) is an atom of \(A\), \(C_1 <: {\sqcap}\mathrm{dom}(\Gamma_1)\), and \({\sqcap}\mathrm{cod}(\Gamma_1) <: C_2\).

    We also know that \(D'\) is a function type, say \(D'=D_1 \to D_2\). (This is because we have \(A <: C'\), so we know that \(A_1\to A_2 \in {\mathit{atoms}(A)}\) for some \(A_1,A_2\). Then because \(A \sim B\), we know that all the atoms in \(B\) are function types. Then because \(B <: D\) and \(D' \in {\mathit{atoms}(D)}\), we have that \(D'\) is a function type.) So by inversion on \(B <: D_1 \to D_2\), we have some sequence of function types \(\Gamma_2\) such that every element of \(\Gamma_2\) is an atom of \(B\), \(D_1 <: {\sqcap}\mathrm{dom}(\Gamma_2)\), and \({\sqcap}\mathrm{cod}(\Gamma_2) <: D_2\).

    It’s the case that either \(C_1 \sim D_1\) or \(C_1 \not\sim D_1\).

    • Sub-case \(C_1 \sim D_1\).
      It suffices to show that \(C_2 \sim D_2\). By the induction hypothesis, we have \({\sqcap}\mathrm{dom}(\Gamma_1) \sim {\sqcap}\mathrm{dom}(\Gamma_2)\).

      As an intermediate step, we shall prove that \({\sqcap}\mathrm{cod}(\Gamma_1) \sim {\sqcap}\mathrm{cod}(\Gamma_2)\), which we shall do by showing that all their atoms are consistent. Suppose \(A' \in {\mathit{atoms}({\sqcap}\mathrm{cod}(\Gamma_1))}\) and \(B' \in {\mathit{atoms}({\sqcap}\mathrm{cod}(\Gamma_2))}\). There is some \(A_1\to A_2 \in \Gamma_1\) where \(A' \in {\mathit{atoms}(A_2)}\). Similarly, there is \(B_1 \to B_2 \in \Gamma_2\) where \(B' \in {\mathit{atoms}(B_2)}\). Also, we have \(A_1 \to A_2 \in {\mathit{atoms}(A)}\) and \(B_1 \to B_2 \in {\mathit{atoms}(B)}\). Then because \(A \sim B\), we have \(A_1 \to A_2 \sim B_1 \to B_2\). Furthermore, we have \(A_1 \sim B_1\) because \({\sqcap}\mathrm{dom}(\Gamma_1) \sim {\sqcap}\mathrm{dom}(\Gamma_2)\), so it must be the case that \(A_2 \sim B_2\). Then because \(A' \in {\mathit{atoms}(A_2)}\) and \(B' \in {\mathit{atoms}(B_2)}\), we have \(A' \sim B'\). Thus concludes this intermediate step.

      By another use of the induction hypothesis, we have \(C_2 \sim D_2\), and this case is finished.

    • Sub-case \(C_1 \not\sim D_1\).
      Then we immediately have \(C_1 \to C_2 \sim D_1 \to D_2\).

  • Case \(C'=C_1\sqcap C_2\):
    We already know that \(C'\) is an atom, so we have a contradiction and this case is vacuously true.
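The depth measure used in this proof is easy to transcribe (again with types as Python ints plus hypothetical `Arrow`/`Inter` constructors). The important design point is that an intersection does not increase depth, so the types obtained by the inversion lemma are no deeper than the originals and the induction hypothesis still applies:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Arrow:
    dom: object
    cod: object

@dataclass(frozen=True)
class Inter:
    left: object
    right: object

def depth(t):
    """depth(n) = 0; depth(A -> B) = 1 + max of parts;
    depth(A /\\ B) = max of parts (note: no +1 for intersections)."""
    if isinstance(t, Inter):
        return max(depth(t.left), depth(t.right))
    if isinstance(t, Arrow):
        return 1 + max(depth(t.dom), depth(t.cod))
    return 0
```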

The next two lemmas follow from the Consistency and Subtyping Lemma and help prepare to prove the case for application in the Join Theorem.
Lemma (Application Consistency)
If \(A_1 \sim A_2\), \(B_1 \sim B_2\), \(A_1 <: B_1 \to C_1\), \(A_2 <: B_2 \to C_2\), and all these types are well formed, then \(C_1 \sim C_2\).
(This lemma is proved directly, without induction.)
Lemma (Application Intersection)
If \(A_1 <: B_1 \to C_1\), \(A_2 <: B_2 \to C_2\), \(A_1 \sim A_2\), \(B_1 \sim B_2\), and \(C_1 \sim C_2\), then \((A_1\sqcap A_2) <: (B_1 \sqcap B_2) \to (C_1 \sqcap C_2)\).
(This lemma is proved directly, without induction.)

Updating the Denotational Semantics

Armed with the Consistency and Subtyping Lemma, I turned back to the proof of the Join Theorem, but first I needed to update my denotational semantics to use intersection types instead of values. For this we’ll need the definition of well formed types that we alluded to earlier.

\[\begin{gathered} \frac{}{{\mathsf{wf}(n)}} \qquad \frac{{\mathsf{wf}(A)} \quad {\mathsf{wf}(B)}}{{\mathsf{wf}(A \to B)}} \qquad \frac{A \sim B \quad {\mathsf{wf}(A)} \quad {\mathsf{wf}(B)}}{{\mathsf{wf}(A \sqcap B)}}\end{gathered}\]

Here are some examples and non-examples of well-formed types. \[\begin{gathered} {\mathsf{wf}(4)} \qquad {\mathsf{wf}(3 \sqcap 3)} \qquad \neg {\mathsf{wf}(3 \sqcap 4)} \\ {\mathsf{wf}((0\to 1) \sqcap (2 \to 3))} \qquad \neg {\mathsf{wf}((0 \to 1) \sqcap (0 \to 2))}\end{gathered}\] It is sometimes helpful to think of well-formed types in terms of the equivalence classes determined by subtype equivalence: \[A \approx B \quad = \quad A <: B \text{ and } B <: A\] For example, we have \(3 \approx (3 \sqcap 3)\), so they are in the same equivalence class and \(3\) would be the representative.
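The well-formedness judgment is straightforward to execute. The sketch below (my own illustration, with types as Python ints plus hypothetical `Arrow`/`Inter` constructors) uses a plausible reconstruction of the consistency relation \(\sim\) defined earlier in the post: two integers are consistent iff they are equal, an integer is never consistent with a function type, two function types are consistent when their domains are inconsistent or their codomains are consistent, and intersections are handled atom-by-atom as in the Atomic Consistency lemma. It reproduces the examples above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Arrow:
    dom: object
    cod: object

@dataclass(frozen=True)
class Inter:
    left: object
    right: object

def atoms(t):
    if isinstance(t, Inter):
        return atoms(t.left) | atoms(t.right)
    return {t}

def atom_consistent(c, d):
    """Consistency on atoms (reconstruction): equal ints are consistent;
    an int and an arrow never are; two arrows are consistent when their
    domains are inconsistent or their codomains are consistent."""
    if isinstance(c, int) and isinstance(d, int):
        return c == d
    if isinstance(c, Arrow) and isinstance(d, Arrow):
        return (not consistent(c.dom, d.dom)) or consistent(c.cod, d.cod)
    return False

def consistent(a, b):
    # two types are consistent iff all of their atoms are (Atomic Consistency)
    return all(atom_consistent(c, d) for c in atoms(a) for d in atoms(b))

def wf(t):
    """wf(n); wf(A -> B) if both parts are wf; wf(A /\\ B) also needs A ~ B."""
    if isinstance(t, Inter):
        return consistent(t.left, t.right) and wf(t.left) and wf(t.right)
    if isinstance(t, Arrow):
        return wf(t.dom) and wf(t.cod)
    return True
```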

We also introduce the following notation for all the well-formed types that are super-types of a given type. \[{\mathord{\uparrow} A} \quad = \quad \{ B\mid A <: B \text{ and } {\mathsf{wf}(B)} \}\]

We shall represent variables with de Bruijn indices, so an environment \(\Gamma\) is a sequence of types. The denotational semantics of the CBV \(\lambda\)-calculus is defined as follows. \[\begin{aligned} {\mathcal{E}{[\![ n ]\!]}}\Gamma &= {\mathord{\uparrow} n} \\ {\mathcal{E}{[\![ x ]\!]}}\Gamma &= {\mathrm{if}\;}x < |\Gamma| {\;\mathrm{then}\;}{\mathord{\uparrow} \Gamma[x]} {\;\mathrm{else}\;}\emptyset \\ {\mathcal{E}{[\![ \lambda e ]\!]}}\Gamma &= \{ A \mid {\mathsf{wf}(A)} \text{ and } {\mathcal{F}{[\![ A ]\!]}}e\Gamma \} \\ {\mathcal{E}{[\![ e_1{\;}e_2 ]\!]}}\Gamma &= \left\{ C\, \middle| \begin{array}{l} \exists A,B.\; A \in {\mathcal{E}{[\![ e_1 ]\!]}}\Gamma, B \in {\mathcal{E}{[\![ e_2 ]\!]}}\Gamma,\\ A <: B \to C, \text{ and } {\mathsf{wf}(C)} \end{array} \right\} \\ {\mathcal{E}{[\![ f(e_1,e_2) ]\!]}}\Gamma &= \left\{ C\, \middle| \begin{array}{l} \exists A,B,n_1,n_2.\; A \in {\mathcal{E}{[\![ e_1 ]\!]}}\Gamma, B \in {\mathcal{E}{[\![ e_2 ]\!]}}\Gamma,\\ A <: n_1, B <: n_2, {[\![ f ]\!]}(n_1,n_2) <: C, {\mathsf{wf}(C)} \end{array} \right\} \\ {\mathcal{E}{[\![ {\mathrm{if}\;}e_1 {\;\mathrm{then}\;}e_2 {\;\mathrm{else}\;}e_3 ]\!]}}\Gamma &= \left\{ B\, \middle| \begin{array}{l} \exists A, n.\; A \in {\mathcal{E}{[\![ e_1 ]\!]}}\Gamma, A <: n,\\ n = 0 \Rightarrow B \in {\mathcal{E}{[\![ e_3 ]\!]}}\Gamma,\\ n \neq 0 \Rightarrow B \in {\mathcal{E}{[\![ e_2 ]\!]}}\Gamma \end{array} \right\} \\[2ex] {\mathcal{F}{[\![ n ]\!]}}e\Gamma &= \mathit{false} \\ {\mathcal{F}{[\![ A \sqcap B ]\!]}}e \Gamma &= {\mathcal{F}{[\![ A ]\!]}}e\Gamma \text{ and } {\mathcal{F}{[\![ B ]\!]}}e\Gamma\\ {\mathcal{F}{[\![ A \to B ]\!]}}e \Gamma &= B \in {\mathcal{E}{[\![ e ]\!]}} (A, \Gamma)\end{aligned}\]
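To make the mutual recursion between \(\mathcal{E}\) and \(\mathcal{F}\) concrete, here is a self-contained membership checker of my own for the numeral/variable/\(\lambda\) fragment. The application, primitive, and conditional clauses existentially quantify over witness types, so checking them would require extra search and is omitted. The snippet repeats a toy type representation (ints plus hypothetical `Arrow`/`Inter`), a naive subtyping procedure, and a reconstruction of consistency and well-formedness; terms are de Bruijn tuples like `("lam", ("var", 0))`.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class Arrow:
    dom: object
    cod: object

@dataclass(frozen=True)
class Inter:
    left: object
    right: object

def atoms(t):
    if isinstance(t, Inter):
        return atoms(t.left) + atoms(t.right)
    return [t]

def subtype(gamma, b):
    """Naive decision procedure for the intersection of gamma <: b."""
    if isinstance(b, Inter):
        return subtype(gamma, b.left) and subtype(gamma, b.right)
    ats = [a for g in gamma for a in atoms(g)]
    if isinstance(b, Arrow):
        arrows = [a for a in ats if isinstance(a, Arrow)]
        return any(all(subtype([b.dom], a.dom) for a in sel)
                   and subtype([a.cod for a in sel], b.cod)
                   for k in range(1, len(arrows) + 1)
                   for sel in combinations(arrows, k))
    return b in ats

def atom_consistent(c, d):
    if isinstance(c, int) and isinstance(d, int):
        return c == d
    if isinstance(c, Arrow) and isinstance(d, Arrow):
        return (not consistent(c.dom, d.dom)) or consistent(c.cod, d.cod)
    return False

def consistent(a, b):
    return all(atom_consistent(c, d) for c in atoms(a) for d in atoms(b))

def wf(t):
    if isinstance(t, Inter):
        return consistent(t.left, t.right) and wf(t.left) and wf(t.right)
    if isinstance(t, Arrow):
        return wf(t.dom) and wf(t.cod)
    return True

def in_E(A, e, env):
    """Check A in E[e]env for the num/var/lam fragment."""
    tag = e[0]
    if tag == "num":       # A in up(n)
        return subtype([e[1]], A) and wf(A)
    if tag == "var":       # A in up(env[x]) when x is in scope
        return e[1] < len(env) and subtype([env[e[1]]], A) and wf(A)
    if tag == "lam":       # wf(A) and F[A] e env
        return wf(A) and F(A, e[1], env)
    raise NotImplementedError(tag)  # app/prim/if need witness search

def F(A, body, env):
    if isinstance(A, Inter):
        return F(A.left, body, env) and F(A.right, body, env)
    if isinstance(A, Arrow):    # F[A -> B] e env = B in E[e](A, env)
        return in_E(A.cod, body, [A.dom] + list(env))
    return False                # a singleton integer is never a function
```

For instance, the identity `("lam", ("var", 0))` inhabits \(1 \to 1\) and also \((1\to 1)\sqcap(2\to 2)\), but not \(1 \to 2\).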

It is easy to show that swapping in a more precise (“sub”) environment preserves the semantics.

Lemma (Weakening)

  1. If \({\mathcal{F}{[\![ A ]\!]}}e \Gamma_1\), \(\Gamma_2 <: \Gamma_1\) and \((\forall B, \Gamma_1, \Gamma_2.\; B \in {\mathcal{E}{[\![ e ]\!]}}\Gamma_1, \Gamma_2 <: \Gamma_1 \Rightarrow B \in {\mathcal{E}{[\![ e ]\!]}}\Gamma_2)\), then \({\mathcal{F}{[\![ A ]\!]}}e \Gamma_2\).

  2. If \(A \in {\mathcal{E}{[\![ e ]\!]}}\Gamma_1\) and \(\Gamma_2 <: \Gamma_1\), then \(A \in {\mathcal{E}{[\![ e ]\!]}}\Gamma_2\).

(Part 1 is proved by induction on \(A\). Part 2 is proved by induction on \(e\) and uses part 1.)

The Home Stretch

Now for the main event, the proof of the Meet Theorem!
Theorem (Meet)
If \(A_1 \in {\mathcal{E}{[\![ e ]\!]}}\Gamma_1\), \(A_2 \in {\mathcal{E}{[\![ e ]\!]}}\Gamma_2\), both \(\Gamma_1\) and \(\Gamma_2\) are well formed, and \(\Gamma_1 \sim \Gamma_2\),
then \(A_1 \sqcap A_2 \in {\mathcal{E}{[\![ e ]\!]}}(\Gamma_1\sqcap\Gamma_2)\) and \({\mathsf{wf}(A_1 \sqcap A_2)}\).
Proof We proceed by induction on \(e\).

  • Case \(e=k\) (\(k\) is a de Bruijn index for a variable):
    We have \(\Gamma_1[k] <: A_1\) and \(\Gamma_2[k] <: A_2\), so \(\Gamma_1[k] \sqcap \Gamma_2[k] <: A_1 \sqcap A_2\). Also, because \(\Gamma_1 \sim \Gamma_2\) we have \(\Gamma_1[k] \sim \Gamma_2[k]\) and therefore \(A_1 \sim A_2\), by the Consistency and Subtyping Lemma. So we have \({\mathsf{wf}(A_1 \sqcap A_2)}\) and this case is finished.

  • Case \(e=n\):
    We have \(n <: A_1\) and \(n <: A_2\), so \(n <: A_1 \sqcap A_2\). Also, we have \(A_1 \sim A_2\) by the Consistency and Subtyping Lemma. So we have \({\mathsf{wf}(A_1 \sqcap A_2)}\) and this case is finished.

  • Case \(e=\lambda e\):
    We need to show that \({\mathsf{wf}(A_1 \sqcap A_2)}\) and \({\mathcal{F}{[\![ A_1 \sqcap A_2 ]\!]}}e(\Gamma_1\sqcap\Gamma_2)\). For the former, it suffices to show that \(A_1 \sim A_2\), which we shall do by showing that their atoms are consistent. Suppose \(A'_1 \in {\mathit{atoms}(A_1)}\) and \(A'_2 \in {\mathit{atoms}(A_2)}\). Because \({\mathcal{F}{[\![ A_1 ]\!]}}e\Gamma_1\) we have \(A'_1 = A'_{11} \to A'_{12}\) and \(A'_{12} \in {\mathcal{E}{[\![ e ]\!]}}(A'_{11},\Gamma_1)\). Similarly, from \({\mathcal{F}{[\![ A_2 ]\!]}}e\Gamma_2\) we have \(A'_2 = A'_{21} \to A'_{22}\) and \(A'_{22} \in {\mathcal{E}{[\![ e ]\!]}}(A'_{21},\Gamma_2)\). We proceed by cases on whether \(A'_{11} \sim A'_{21}\).

    • Sub-case \(A'_{11} \sim A'_{21}\):
      By the induction hypothesis, we have \({\mathsf{wf}(A'_{12} \sqcap A'_{22})}\) from which we have \(A'_{12} \sim A'_{22}\) and therefore \(A'_{11}\to A'_{12} \sim A'_{21} \to A'_{22}\).

    • Sub-case \(A'_{11} \not\sim A'_{21}\):
      It immediately follows that \(A'_{11}\to A'_{12} \sim A'_{21} \to A'_{22}\).

    It remains to show \({\mathcal{F}{[\![ A_1 \sqcap A_2 ]\!]}}e(\Gamma_1\sqcap\Gamma_2)\). This follows from two uses of the Weakening Lemma to obtain \({\mathcal{F}{[\![ A_1 ]\!]}}e(\Gamma_1\sqcap\Gamma_2)\) and \({\mathcal{F}{[\![ A_2 ]\!]}}e(\Gamma_1\sqcap\Gamma_2)\).

  • Case \(e = (e_1 {\;}e_2)\):
    We have \[B_1 \in {\mathcal{E}{[\![ e_1 ]\!]}}\Gamma_1 \quad C_1 \in {\mathcal{E}{[\![ e_2 ]\!]}}\Gamma_1 \quad B_1 <: C_1 \to A_1 \quad {\mathsf{wf}(A_1)}\] and \[B_2 \in {\mathcal{E}{[\![ e_1 ]\!]}}\Gamma_2 \quad C_2 \in {\mathcal{E}{[\![ e_2 ]\!]}}\Gamma_2 \quad B_2 <: C_2 \to A_2 \quad {\mathsf{wf}(A_2)}\] By the induction hypothesis, we have \[B_1 \sqcap B_2 \in {\mathcal{E}{[\![ e_1 ]\!]}}(\Gamma_1 \sqcap \Gamma_2) \quad {\mathsf{wf}(B_1 \sqcap B_2)}\] and \[C_1 \sqcap C_2 \in {\mathcal{E}{[\![ e_2 ]\!]}}(\Gamma_1 \sqcap \Gamma_2) \quad {\mathsf{wf}(C_1 \sqcap C_2)}\] We obtain \(A_1 \sim A_2\) by the Application Consistency Lemma, and then by the Application Intersection Lemma we have \[B_1 \sqcap B_2 <: (C_1 \sqcap C_2) \to (A_1 \sqcap A_2)\] So we have \(A_1 \sqcap A_2 \in {\mathcal{E}{[\![ e ]\!]}}(\Gamma_1 \sqcap \Gamma_2)\).

    Also, from \(A_1 \sim A_2\), \({\mathsf{wf}(A_1)}\), and \({\mathsf{wf}(A_2)}\), we conclude that \({\mathsf{wf}(A_1 \sqcap A_2)}\).

  • Case \(e= f(e_1,e_2)\):
    (This case is not very interesting. See the Isabelle proof for the details.)

  • Case \(e= {\mathrm{if}\;}e_1 {\;\mathrm{then}\;}e_2 {\;\mathrm{else}\;}e_3\):
    (This case is not very interesting. See the Isabelle proof for the details.)

I thought that the following Subsumption Theorem would be needed to prove the Meet Theorem, but it turned out not to be necessary, which is especially nice because the proof of the Subsumption Theorem turned out to depend on the Meet Theorem!
Theorem (Subsumption)
If \(A \in {\mathcal{E}{[\![ e ]\!]}}\Gamma\), \(A <: B\), and both \(B\) and \(\Gamma\) are well-formed, then \(B \in {\mathcal{E}{[\![ e ]\!]}}\Gamma\).
The proof is by induction on \(e\) and all but the case \(e=\lambda e'\) are straightforward. For that case, we use the following lemmas.
Lemma (Distributivity for \(\mathcal{F}\))
If \({\mathcal{F}{[\![ (A \to B)\sqcap (C \to D) ]\!]}} e \Gamma\), \(A \sim C\), and everything is well formed, then \({\mathcal{F}{[\![ (A\sqcap C) \to (B\sqcap D) ]\!]}} e \Gamma\).
(The proof is direct, using the Meet Theorem and the Weakening Lemma.)
Lemma (\(\mathcal{F}\) and Intersections)
Suppose \(\Gamma_1\) is a non-empty sequence of well-formed and consistent function types. If \({\mathcal{F}{[\![ {\sqcap}\Gamma_1 ]\!]}} e \Gamma_2\), then \({\mathcal{F}{[\![ {\sqcap}\mathrm{dom}(\Gamma_1) \to {\sqcap}\mathrm{cod}(\Gamma_1) ]\!]}} e \Gamma_2\).
(The proof is by induction on \(\Gamma_1\) and uses the previous lemma.)


This result can be viewed in a couple of ways. As discussed at the beginning of this post, establishing the Meet Theorem means that this call-by-value denotational semantics respects \(\beta\)-equality for any terminating argument expression. This is useful in proving the correctness of a function-inlining optimizer. Also, it would be straightforward to define a call-by-name (or call-by-need) version of the semantics that respects \(\beta\)-equality unconditionally.

Secondly, from the viewpoint of intersection type systems, this result shows that, once we require types to be well formed (i.e., self-consistent), we no longer need the intersection introduction rule, because it is a consequence of having the subtyping rule for distributing intersections through function types.