Gradually typed languages enable programmers to choose which parts of their programs are statically typed and which parts are dynamically typed. Thus, gradually typed languages perform some type checking at compile time and some type checking at run time. When specifying the semantics of a gradually typed language, we usually express the run-time checking in terms of casts, so the semantics of a gradually typed language depends crucially on the semantics of casts. This blog post tries to answer the question: "What is a cast?"
Syntax and Static Semantics of Casts
Syntactically, a cast is an expression of the form
$$
e : A \Rightarrow^{\ell} B
$$
where ℓ is a blame label that records the source location of the cast.
Regarding the static semantics (compile-time type checking), a cast enables the subexpression e of static type A to be used in a context expecting a different type B.
$$
\frac{\Gamma \vdash e : A}{\Gamma \vdash (e : A \Rightarrow^{\ell} B) : B}
$$
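To make this concrete, here is a small Haskell sketch of the cast calculus syntax (from the appendix) together with the typing rule for casts. All of the names here (Type, Expr, ECast, typeOf, and so on) are my own choices for illustration, not part of the calculus, and the op and if-then-else forms are omitted.

```haskell
type Label = String                          -- blame labels ℓ

data Type = TInt | TDyn | TArrow Type Type   -- Int, ⋆, and A → B
  deriving (Eq, Show)

data Expr
  = EInt Int
  | EVar String
  | ELam String Type Expr                    -- λx:A. e
  | EApp Expr Expr
  | ECast Expr Type Label Type               -- e : A ⇒^ℓ B
  deriving (Eq, Show)

type TyEnv = [(String, Type)]

-- Type checking; the cast case is the typing rule above, the rest is standard
-- (application is simplified to require an exact match on the argument type).
typeOf :: TyEnv -> Expr -> Maybe Type
typeOf _   (EInt _)         = Just TInt
typeOf env (EVar x)         = lookup x env
typeOf env (ELam x a e)     = TArrow a <$> typeOf ((x, a) : env) e
typeOf env (EApp e1 e2)     = do
  TArrow a b <- typeOf env e1
  a' <- typeOf env e2
  if a' == a then Just b else Nothing
typeOf env (ECast e a _l b) = do
  a' <- typeOf env e
  if a' == a then Just b else Nothing        -- Γ ⊢ e : A  entails  Γ ⊢ (e : A ⇒^ℓ B) : B
```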
Dynamic Semantics of Casts
The dynamic semantics of a cast is to check whether the value produced by the subexpression e is of the target type B; if so, return the value, and otherwise signal an error. The following is a strawman denotational semantics that expresses this basic intuition about casts. Suppose we have already defined the meaning of types, so that T[|A|] is the set of values of type A. The meaning function E[|e|] maps an expression to a result (either a value v or an error blame ℓ).
$$
\mathcal{E}[\![\, e : A \Rightarrow^{\ell} B \,]\!] =
\begin{cases}
v & \text{if } v \in \mathcal{T}[\![ B ]\!] \\
\mathsf{blame}\,\ell & \text{if } v \notin \mathcal{T}[\![ B ]\!]
\end{cases}
\qquad \text{where } v = \mathcal{E}[\![ e ]\!]
$$
If we restrict ourselves to first-order types such as Int, it is straightforward to define T and check whether a value is in the set.
$$
\mathcal{T}[\![ \mathtt{Int} ]\!] = \mathbb{Z}
$$
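Restricted to first-order types, the strawman semantics is easy to write down in code as well. Here is a sketch reusing the Type and Expr declarations above; Result, isOfType, and evalCast are illustrative names, and the evaluator for the rest of the language is passed in as a parameter.

```haskell
-- Results of the strawman semantics: an integer value or blame.
data Result = RInt Int | RBlame Label
  deriving (Eq, Show)

-- Membership in T[|A|], for first-order types only.
isOfType :: Type -> Result -> Bool
isOfType TInt (RInt _) = True
isOfType TDyn (RInt _) = True      -- every value has type ⋆
isOfType _    _        = False

-- E[| e : A ⇒^ℓ B |]: evaluate e, then check the result against B.
evalCast :: (Expr -> Result) -> Expr -> Type -> Label -> Type -> Result
evalCast eval e _a l b =
  case eval e of
    RBlame l'        -> RBlame l'          -- propagate an earlier error
    v | isOfType b v -> v                  -- the value has the target type
      | otherwise    -> RBlame l           -- otherwise, blame this cast
```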
Competing Operational Semantics for Casts
We'll come back to denotational semantics in a bit, but first let's turn to operational semantics, in particular reduction semantics, which is what the recent literature uses to explain casts and the type ⋆ (Gronski 2006, Siek 2007, Wadler 2009). In a reduction semantics, we give rewrite rules that say what happens when a syntactic value flows into a cast, that is, what expression the cast reduces to. Recall that a syntactic value is just an expression that cannot be further reduced. We can proceed by cases on the consistency of the source type A and target type B.
- Case (v : Int ⇒ℓ Int). This one is easy: the static type system ensures that v has type Int, so there is nothing to check and we can rewrite to v.
  $$
  v : \mathtt{Int} \Rightarrow^{\ell} \mathtt{Int} \;\longrightarrow\; v
  $$
- Case (v : ⋆ ⇒ℓ ⋆). This one is also easy.
  $$
  v : \star \Rightarrow^{\ell} \star \;\longrightarrow\; v
  $$
- Case (v : A→A′ ⇒ℓ B→B′). This one is more complicated. We'd like to check that the function v has type B→B′. Suppose B′ = Int. How can we determine whether a function returns an integer? In general, that's just as hard as the halting problem, which is undecidable. So instead of checking now, we delay the checking until the function is called. We can accomplish this by rewriting to a lambda expression that casts the input, calls v, and then casts the output.
  $$
  v : A \to A' \Rightarrow^{\ell} B \to B' \;\longrightarrow\; \lambda x{:}B.\ (v\,(x : B \Rightarrow^{\ell} A)) : A' \Rightarrow^{\ell} B'
  $$
  Here we see the importance of attaching blame labels to casts. Because of the delayed checking, the point of error can be far removed from the original source code location, but thanks to the blame label we can point back to the source location of the cast that ultimately failed (Findler and Felleisen 2002).
- Case (v : A ⇒ℓ ⋆). For this one there are multiple options in the literature. One option is to declare this expression a syntactic value (Siek 2009), so no rewrite rule is necessary. Another option is to factor all casts to ⋆ through the ground types G:
  $$
  G ::= \mathtt{Int} \mid \star \to \star
  $$
  Then we expand the cast from A to ⋆ into two casts that go through the unique ground type for A:
  $$
  v : A \Rightarrow^{\ell} \star \;\longrightarrow\; (v : A \Rightarrow^{\ell} G) : G \Rightarrow^{\ell} \star
  \qquad \text{where } A \sim G,\ A \neq G,\ A \neq \star
  $$
  and then declare that expressions of the form (v : G ⇒ℓ ⋆) are values (Wadler 2009).
- Case (v : ⋆ ⇒ℓ B). There are multiple options here as well, but the choice is linked to the above choice regarding casting from A to ⋆. If v = (v′ : A ⇒ℓ′ ⋆), then we need the following rewrite rules.
  $$
  \begin{array}{lcll}
  (v' : A \Rightarrow^{\ell'} \star) : \star \Rightarrow^{\ell} B & \longrightarrow & v' : A \Rightarrow^{\ell} B & \text{if } A \sim B \\
  (v' : A \Rightarrow^{\ell'} \star) : \star \Rightarrow^{\ell} B & \longrightarrow & \mathsf{blame}\,\ell & \text{if } A \nsim B
  \end{array}
  $$
  On the other hand, if we want to factor through the ground types, we have the following reduction rules.
  $$
  \begin{array}{lcll}
  v : \star \Rightarrow^{\ell} B & \longrightarrow & (v : \star \Rightarrow^{\ell} G) : G \Rightarrow^{\ell} B & \text{if } B \sim G,\ B \neq G,\ B \neq \star \\
  (v : G \Rightarrow^{\ell'} \star) : \star \Rightarrow^{\ell} G & \longrightarrow & v & \\
  (v : G \Rightarrow^{\ell'} \star) : \star \Rightarrow^{\ell} G' & \longrightarrow & \mathsf{blame}\,\ell & \text{if } G \neq G'
  \end{array}
  $$
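To summarize the case analysis above, here is a small executable sketch of cast reduction, reusing the Type, Expr, and Label declarations from the earlier sketch and assuming Expr is extended with a constructor EBlame Label for run-time cast errors. It implements the first variant, in which an injection (v : A ⇒ℓ ⋆) is itself a value; the groundOf helper is only needed for the factoring variant and is included for comparison. As before, the names are mine.

```haskell
-- Consistency of types, A ∼ B.
consistent :: Type -> Type -> Bool
consistent TDyn _                      = True
consistent _    TDyn                   = True
consistent TInt TInt                   = True
consistent (TArrow a a') (TArrow b b') = consistent a b && consistent a' b'
consistent _ _                         = False

-- The unique ground type consistent with a (non-⋆) type, for the factoring variant.
groundOf :: Type -> Maybe Type
groundOf TInt         = Just TInt
groundOf (TArrow _ _) = Just (TArrow TDyn TDyn)    -- the ground type ⋆ → ⋆
groundOf TDyn         = Nothing                    -- ⋆ itself is not factored

-- One reduction step for a cast (v : A ⇒^ℓ B) whose subject v is already a value.
-- Nothing means the cast expression is itself a value (e.g. an injection into ⋆).
stepCast :: Expr -> Type -> Label -> Type -> Maybe Expr
stepCast v TInt _ TInt = Just v                    -- v : Int ⇒^ℓ Int ⟶ v
stepCast v TDyn _ TDyn = Just v                    -- v : ⋆ ⇒^ℓ ⋆ ⟶ v
stepCast v (TArrow a a') l (TArrow b b') =         -- delay function casts by wrapping
  Just (ELam "x" b (ECast (EApp v (ECast (EVar "x") b l a)) a' l b'))
  -- λx:B. (v (x : B ⇒^ℓ A)) : A′ ⇒^ℓ B′   (ignoring variable capture in this sketch)
stepCast (ECast v' a _ TDyn) TDyn l b              -- a projection meets an injection
  | consistent a b = Just (ECast v' a l b)
  | otherwise      = Just (EBlame l)
stepCast _ _ _ _ = Nothing                         -- in particular, (v : A ⇒^ℓ ⋆) is a value
```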
Given that we have multiple options regarding the reduction semantics, an immediate question is whether it matters: can we actually observe different behaviors for some program? Yes. In the following example, which we will call (P0), we cast the identity function on integers to an incorrect type.
$$
\begin{array}{l}
\mathtt{let}\ id = (\lambda x{:}\mathtt{Int}.\ x)\ \mathtt{in} \\
\mathtt{let}\ f = (id : \mathtt{Int}\to\mathtt{Int} \Rightarrow^{\ell_1} \star)\ \mathtt{in} \\
\mathtt{let}\ g = (f : \star \Rightarrow^{\ell_2} (\mathtt{Int}\to\mathtt{Int})\to\mathtt{Int})\ \mathtt{in} \\
g\ id
\end{array}
$$
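To see the difference concretely, here is the key step for (P0) under the first variant, using the stepCast sketch above and writing the labels as the strings "l1" and "l2":

```haskell
-- (P0) under the injections-are-values variant: the projection at ℓ2 compares
-- the injected type Int→Int against the target (Int→Int)→Int, which are inconsistent.
p0Step :: Maybe Expr
p0Step = stepCast f TDyn "l2" (TArrow (TArrow TInt TInt) TInt)
  where
    idFun = ELam "x" TInt (EVar "x")                   -- λx:Int. x
    f     = ECast idFun (TArrow TInt TInt) "l1" TDyn   -- id : Int→Int ⇒^ℓ1 ⋆
-- p0Step evaluates to Just (EBlame "l2")
```

Working through the factoring rules by hand instead (a sketch, not shown in code), both casts factor through the ground type ⋆→⋆, the injection and projection at ⋆→⋆ cancel, and the failure only surfaces later, inside the wrapper built at ℓ1, when the ⋆-typed argument is projected to Int; that variant therefore blames ℓ1 rather than ℓ2.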
A Denotational Semantics of Functions and Casts
I've recently found out that there is a simple way to represent function values in a denotational semantics. The intuition is that, although a function may be able to deal with an infinite number of different inputs, the function only has to deal with a finite number of inputs on any one execution of the program. Thus, we can represent functions with finite tables of input-output pairs. An empty table is written ∅, and a single-entry table has the form v ↦ v′, where v is the input and v′ is the corresponding output. We build a larger table out of two smaller tables v1 and v2 with the notation v1 ⊔ v2. So, with the addition of integer values n ∈ ℤ, the following grammar specifies the values.
$$
v ::= n \mid \emptyset \mid v \mapsto v \mid v \sqcup v
$$
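Here is the same grammar of values as a Haskell datatype (the constructor names are mine). I include a constructor for blame, which the post adds to the grammar of values a little further down.

```haskell
-- Values as finite function tables (reusing Label from the earlier sketch).
data Value
  = VInt Int              -- n
  | VEmpty                -- ∅, the empty table
  | VMapsTo Value Value   -- v ↦ v′, a single-entry table
  | VJoin Value Value     -- v₁ ⊔ v₂, the union of two tables
  | VBlame Label          -- blame ℓ (added below)
  deriving (Eq, Show)
```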
Of course, we can't use just one fixed-size table as the denotation of a lambda expression. Depending on the context of the lambda, we may need a bigger table that handles more inputs. Therefore we map each lambda expression to the set of all finite tables that jive with that lambda. To be more precise, we shall define a meaning function E that maps an expression and an environment to a set of values, and an auxiliary function F that determines whether a table jives with a lambda expression in a given environment. Here's a first try at defining F.
$$
\begin{array}{lcl}
F(n,\ \lambda x{:}A.\,e,\ \rho) & = & \mathsf{false} \\
F(\emptyset,\ \lambda x{:}A.\,e,\ \rho) & = & \mathsf{true} \\
F(v \mapsto v',\ \lambda x{:}A.\,e,\ \rho) & = & T(A, v) \text{ and } v' \in \mathcal{E}[\![ e ]\!]\rho(x{:=}v) \\
F(v_1 \sqcup v_2,\ \lambda x{:}A.\,e,\ \rho) & = & F(v_1,\ \lambda x{:}A.\,e,\ \rho) \text{ and } F(v_2,\ \lambda x{:}A.\,e,\ \rho)
\end{array}
$$
Now that we have a good representation for function values, we can talk about giving meaning to higher-order casts, that is, casts from one function type to another. Recall that in our strawman semantics, we got stuck when trying to define the meaning of types in the form of a map T from a type to a set of values. Now we can proceed based on the above definition of values v. (To make the termination of T more obvious, we'll instead define T as a map from a type and a value to a Boolean. The measure is a lexicographic ordering on the size of the type and then the size of the value.)
$$
\begin{array}{lcl}
T(\mathtt{Int},\ v) & = & (\exists n.\ v = n) \\
T(\star,\ v) & = & \mathsf{true} \\
T(A \to B,\ n) & = & \mathsf{false} \\
T(A \to B,\ \emptyset) & = & \mathsf{true} \\
T(A \to B,\ v \mapsto v') & = & T(A, v) \text{ and } T(B, v') \\
T(A \to B,\ v_1 \sqcup v_2) & = & T(A \to B,\ v_1) \text{ and } T(A \to B,\ v_2)
\end{array}
$$
Of course, we need to extend values to include blame:
$$
v ::= n \mid \emptyset \mid v \mapsto v \mid v \sqcup v \mid \mathsf{blame}\,\ell
$$
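Here is T as a Haskell predicate over the Value datatype above, together with one way to read the meaning of a cast on a set of values (represented as a list): values of the target type pass through, and anything else is blamed eagerly, anticipating the discussion at the end of the post. The names hasType and castDen are illustrative.

```haskell
-- T(A, v): is the value v of type A?  (Reuses Type and Value from above.)
hasType :: Type -> Value -> Bool
hasType TInt (VInt _)               = True
hasType TInt _                      = False
hasType TDyn _                      = True      -- every value has type ⋆
hasType (TArrow _ _) VEmpty         = True
hasType (TArrow a b) (VMapsTo v v') = hasType a v && hasType b v'
hasType (TArrow a b) (VJoin v1 v2)  = hasType (TArrow a b) v1 && hasType (TArrow a b) v2
hasType (TArrow _ _) _              = False     -- integers and blame are not function tables

-- A sketch of the meaning of a cast (e : A ⇒^ℓ B) applied to the set of values of e.
castDen :: Label -> Type -> [Value] -> [Value]
castDen l b vs = map cast vs
  where
    cast e@(VBlame _)    = e          -- propagate earlier errors
    cast v | hasType b v = v          -- of the target type: let it through
           | otherwise   = VBlame l   -- otherwise: blame this cast
```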
Two Examples
Let us consider the ramifications of this semantics. The following example program creates a function f that returns 0 on non-zero input and returns the identity function when applied to 0. We cast this function to the type Int→Int on two separate occasions, cast ℓ3 and cast ℓ4, to create g and h. We apply g to 1 and h to its result.
$$
\begin{array}{l}
\mathtt{let}\ f = (\lambda x{:}\mathtt{Int}.\ \mathtt{if}\ x\ \mathtt{then}\ (0 : \mathtt{Int} \Rightarrow^{\ell_1} \star)\ \mathtt{else}\ ((\lambda y{:}\mathtt{Int}.\ y) : \mathtt{Int}\to\mathtt{Int} \Rightarrow^{\ell_2} \star))\ \mathtt{in} \\
\mathtt{let}\ g = (f : \mathtt{Int}\to\star \Rightarrow^{\ell_3} \mathtt{Int}\to\mathtt{Int})\ \mathtt{in} \\
\mathtt{let}\ h = (f : \mathtt{Int}\to\star \Rightarrow^{\ell_4} \mathtt{Int}\to\mathtt{Int})\ \mathtt{in} \\
\mathtt{let}\ z = (g\ 1)\ \mathtt{in} \\
(h\ z)
\end{array}
$$
Because cast ℓ3 signals an error, one might be tempted to have the meaning of g be just {blame ℓ3}. However, we want to allow implementations of this language that do not blame ℓ3 (g is never applied to 0, after all, so its guilt was not directly observable) and instead blame ℓ4, which was caught red-handed. So it is important for the meaning of g to include the subset of values from f that have type Int→Int, so that we can carry on and find other errors as well. We shall expect implementations of this language to be sound with respect to blame, that is, if execution results in blame, it should blame one of the labels that is in the denotation of the program (and not some other innocent cast).
Let us return to the example (P0). The denotation of that program is {blame ℓ2} because the cast at ℓ2 is a cast to (Int→Int)→Int and the identity function is not of that type. The other cast, at ℓ1, is innocent because it is a cast to ⋆ and all values are of that type, including the identity function.
Discussion
By giving the cast calculus a denotational semantics in terms of finite function tables, it became straightforward to define whether a function value is of a given type. This in turn made it easy to define the meaning of casts, even casts at function type. A cast succeeds if the input value is of the target type and it fails otherwise. With this semantics we assign blame to a cast in an eager fashion, without the need for the blame tracking machinery that is present in the operational semantics.
We saw an example program where the reduction semantics that factors through ground types attributes blame to a cast that the denotational semantics says is innocent. This lends some evidence to the conclusion that the factoring semantics is less desirable.
I plan to investigate whether the alternative reduction semantics is sound with respect to the denotational semantics in the sense that the reduction semantics only blames a cast if the denotational semantics says it is guilty.
Appendix
We give the full definition of the cast calculus here in the appendix. The relation ⊑ that we used to define table lookup is the dual of the subtyping relation for intersection types. The denotational semantics is a mild reformulation of the intersection type system that I discussed in previous blog posts.
Syntax
$$
\begin{array}{lcl}
A & ::= & \mathtt{Int} \mid A \to B \mid \star \\
e & ::= & n \mid \mathit{op}(e, e) \mid \mathtt{if}\ e\ \mathtt{then}\ e\ \mathtt{else}\ e \mid x \mid \lambda x{:}A.\ e \mid e\ e \mid e : A \Rightarrow^{\ell} B
\end{array}
$$
References
- (Findler 2002) Contracts for higher-order functions. R. B. Findler and M. Felleisen. International Conference on Functional Programming. 2002.
- (Gronski 2006) Sage: Hybrid Checking for Flexible Specifications. Jessica Gronski and Kenneth Knowles and Aaron Tomb and Stephen N. Freund and Cormac Flanagan. Scheme and Functional Programming Workshop, 2006.
- (Scott 1970) Outline of a Mathematical Theory of Computation. Dana Scott. Oxford University. 1970. Technical report PRG-2.
- (Scott 1976) Data Types as Lattices. Dana Scott. SIAM Journal on Computing. 1976. Volume 5, Number 3.
- (Siek 2009) Exploring the Design Space of Higher-Order Casts. Jeremy G. Siek and Ronald Garcia and Walid Taha. European Symposium on Programming. 2009.
- (Wadler 2009) Well-typed programs can't be blamed. Philip Wadler and Robert Bruce Findler. European Symposium on Programming. 2009.