I briefly mentioned in my previous post that there are implementation challenges regarding gradual typing. One of those challenges concerns space efficiency. In their paper Space-Efficient Gradual Typing, Herman, Tomb, and Flanagan observed two circumstances in which function casts can lead to unbounded space consumption. First, some programs repeatedly apply casts to the same function, resulting in a build-up of casts around the function. In the following example, each time the function bound to k is passed between even and odd, a cast is added, causing a space leak proportional to n.
    let rec even(n : Int, k : Dyn->Bool) : Bool =
          if (n = 0) then k(True : Bool => *)
          else odd(n - 1, k : *->Bool => Bool->Bool)
    and odd(n : Int, k : Bool->Bool) : Bool =
          if (n = 0) then k(False)
          else even(n - 1, k : Bool->Bool => *->Bool)
Second, some casts break proper tail recursion. Consider the following example, in which the return type of even is ⋆ and the return type of odd is Bool.
    let rec even(n : Int) : Dyn =
          if (n = 0) then True
          else odd(n - 1) : Bool => *
    and odd(n : Int) : Bool =
          if (n = 0) then False
          else even(n - 1) : * => Bool

Assuming tail call optimization, cast-free versions of the even and odd functions require only constant space. With the casts, however, the recursive calls are no longer tail calls, so the run-time stack grows with each call and space consumption is proportional to n.
Herman et al.'s solution to this space-efficiency problem relies on Henglein's Coercion Calculus. This calculus defines a set of combinators, called coercions, that can be used to express casts. The key advantage of coercions is that an arbitrarily long sequence of coercions reduces to a sequence of at most three coercions. The following defines the syntax and typing rules for coercions. The two most important coercions are the injection coercion I!, from I to ⋆, and the projection coercion I?ℓ, from ⋆ to I. The definition of the injectable types I depends on the blame tracking strategy:

    injectable types   I ::= B | ⋆ → ⋆                              (UD)
                       I ::= B | T → T                              (D)
    coercions          c ::= ι_T | I! | I?ℓ | c → c | c ; c | Failℓ

The typing rules assign each coercion a type T1 ⇒ T2, from a source type to a target type.

    ⊢ ι_T : T ⇒ T
    ⊢ T! : T ⇒ ⋆
    ⊢ T?ℓ : ⋆ ⇒ T
    ⊢ c1 → c2 : (T11 → T12) ⇒ (T21 → T22)   if ⊢ c1 : T21 ⇒ T11 and ⊢ c2 : T12 ⇒ T22
    ⊢ c1 ; c2 : T1 ⇒ T3                     if ⊢ c1 : T1 ⇒ T2 and ⊢ c2 : T2 ⇒ T3
    ⊢ Failℓ : T1 ⇒ T2

We sometimes drop the subscript on ι_T when the type T doesn't matter.
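To make the syntax concrete, here is a rough sketch of how one might represent coercions as an OCaml datatype. The type and constructor names (ty, crc, and so on) are my own, blame labels are plain strings, and base types are identified by name; none of this is fixed by the calculus itself.

    (* A sketch of types and coercions as OCaml datatypes; names are illustrative. *)
    type ty =
      | Dyn                    (* the dynamic type ⋆ *)
      | Base of string         (* base types B, e.g. "Int", "Bool" *)
      | Arrow of ty * ty       (* function types T → T *)

    type crc =
      | Id of ty               (* ι_T *)
      | Inj of ty              (* I! : I ⇒ ⋆ *)
      | Proj of ty * string    (* I?ℓ : ⋆ ⇒ I, carrying the blame label ℓ *)
      | Fun of crc * crc       (* c → c *)
      | Seq of crc * crc       (* c ; c *)
      | Fail of string         (* Failℓ *)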
The way in which casts are compiled to coercions depends on whether you're using the D or UD blame tracking strategy. Here's the compilation function for D.

    C_D(⋆ ⇒ℓ ⋆) = ι
    C_D(B ⇒ℓ B) = ι
    C_D(⋆ ⇒ℓ I) = I?ℓ
    C_D(I ⇒ℓ ⋆) = I!
    C_D(T1→T2 ⇒ℓ T3→T4) = c1 → c2   where c1 = C_D(T3 ⇒ℓ T1) and c2 = C_D(T2 ⇒ℓ T4)
    C_D(T1 ⇒ℓ T2) = Failℓ           if hd(T1) ≁ hd(T2)

The compilation function for UD is a bit more complicated because the definition of injectable type is more restrictive.

    C_UD(⋆ ⇒ℓ ⋆) = ι
    C_UD(B ⇒ℓ B) = ι
    C_UD(⋆ ⇒ℓ B) = B?ℓ
    C_UD(⋆ ⇒ℓ T3→T4) = (⋆→⋆)?ℓ ; (c1 → c2)   where c1 = C_UD(T3 ⇒ℓ ⋆) and c2 = C_UD(⋆ ⇒ℓ T4)
    C_UD(B ⇒ℓ ⋆) = B!
    C_UD(T1→T2 ⇒ℓ ⋆) = (c1 → c2) ; (⋆→⋆)!    where c1 = C_UD(⋆ ⇒ℓ T1) and c2 = C_UD(T2 ⇒ℓ ⋆)
    C_UD(T1→T2 ⇒ℓ T3→T4) = c1 → c2           where c1 = C_UD(T3 ⇒ℓ T1) and c2 = C_UD(T2 ⇒ℓ T4)
    C_UD(T1 ⇒ℓ T2) = Failℓ                   if hd(T1) ≁ hd(T2)
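The D compilation function translates almost directly into a recursive function. Below is a minimal sketch, reusing the ty and crc types from the previous sketch (repeated so the snippet stands alone); the fall-through case covers exactly the pairs whose head constructors are incompatible.

    (* A sketch of C_D: compile a cast T1 ⇒ℓ T2 to a coercion (D strategy). *)
    type ty = Dyn | Base of string | Arrow of ty * ty
    type crc = Id of ty | Inj of ty | Proj of ty * string
             | Fun of crc * crc | Seq of crc * crc | Fail of string

    let rec compile_d (t1 : ty) (l : string) (t2 : ty) : crc =
      match t1, t2 with
      | Dyn, Dyn -> Id Dyn
      | Base b1, Base b2 when b1 = b2 -> Id (Base b1)
      | Dyn, _ -> Proj (t2, l)                        (* ⋆ ⇒ℓ I   becomes I?ℓ *)
      | _, Dyn -> Inj t1                              (* I ⇒ℓ ⋆   becomes I!  *)
      | Arrow (s1, s2), Arrow (s3, s4) ->             (* contravariant domain *)
          Fun (compile_d s3 l s1, compile_d s2 l s4)
      | _, _ -> Fail l                                (* hd(T1) ≁ hd(T2)      *)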
We identify coercion sequencing up to associativity and identity: c1 ; (c2 ; c3) = (c1 ; c2) ; c3 and (c ; ι) = (ι ; c) = c. The following are the reduction rules for coercions. The compilation function C depends on the choice of blame tracking strategy (D or UD).

    I1! ; I2?ℓ ⟶ C(I1 ⇒ℓ I2)
    (c11 → c12) ; (c21 → c22) ⟶ (c21 ; c11) → (c12 ; c22)
    Failℓ ; c ⟶ Failℓ
    c̄ ; Failℓ ⟶ Failℓ
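For example, a successful round trip through ⋆ cancels: Bool! ; Bool?ℓ ⟶ C(Bool ⇒ℓ Bool) = ι, while a mismatched projection fails: Bool! ; Int?ℓ ⟶ C(Bool ⇒ℓ Int) = Failℓ. The function rule is what tames the build-up around k in the first even/odd example: under D (writing ℓ for the blame label), the two alternating casts compile to Bool! → ι and Bool?ℓ → ι, and their composition reduces to a fixed size, (Bool! → ι) ; (Bool?ℓ → ι) ⟶ (Bool?ℓ ; Bool!) → (ι ; ι) = (Bool?ℓ ; Bool!) → ι, which is already in normal form.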
The last reduction rule refers to c̄, a subset of the coercions that we refer to as wrapper coercions. We define wrapper coercions and coercions in normal form ĉ as follows.

    optional injections          inj  ::= ι | I!
    optional function coercions  fun  ::= ι | ĉ → ĉ
    optional projections         proj ::= ι | I?ℓ
    wrapper coercions            c̄    ::= fun ; inj
    normal coercions             ĉ    ::= proj ; fun ; inj | proj ; Failℓ

From this grammar we can easily see that a coercion in normal form always has length at most three.
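One way to make the length bound vivid is to bake the normal-form grammar into a datatype, so that only coercions of the shape proj ; fun ; inj (or proj ; Failℓ) are representable. The sketch below does this; the names are mine and the representation is only illustrative.

    (* A sketch of normal-form coercions as a datatype. *)
    type ty = Dyn | Base of string | Arrow of ty * ty

    type proj = NoProj | SomeProj of ty * string    (* ι or I?ℓ  *)
    type inj  = NoInj  | SomeInj  of ty             (* ι or I!   *)
    type fn   = NoFun  | FunC of norm * norm        (* ι or ĉ→ĉ  *)
    and  norm =
      | Coerce   of proj * fn * inj                 (* proj ; fun ; inj *)
      | ProjFail of proj * string                   (* proj ; Failℓ     *)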
Theorem (Strong Normalization for Coercions). For any coercion c, there exists a normal coercion ĉ such that c ⟶* ĉ.
With the Coercion Calculus in hand, we can define a space-efficient abstract machine. This machine is a variant of my favorite abstract machine for the lambda calculus, the ECD machine on terms in A-normal form. Herman et al. define an efficient reduction semantics that relies on mutually-recursive evaluation contexts to enable the simplification of coercions in tail position. Our choice of the ECD machine allows us to deal with coercions in tail position in a more straightforward way. Here's the syntax for our coercion-based calculus in A-normal form. The two main additions are the cast definition and the tail-call statement.

    expressions     e ::= k | x | λx:T. s
    definitions     d ::= x = op(ē) | x : T = e(e) | x = e : ĉ
    statements      s ::= d; s | return e | return e(e) : ĉ
    simple values   ṽ ::= k | ⟨λx:T. s, ρ⟩
    values          v ::= ṽ | ṽ : c̄
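Here is a sketch of this syntax as OCaml datatypes, again with my own constructor names and with the ty and crc types repeated so the snippet stands alone. Constants are limited to integers just to keep the sketch small.

    (* A sketch of the ANF syntax, simple values, values, and environments. *)
    type ty = Dyn | Base of string | Arrow of ty * ty
    type crc = Id of ty | Inj of ty | Proj of ty * string
             | Fun of crc * crc | Seq of crc * crc | Fail of string

    type expr =
      | Const of int                          (* constants k *)
      | Var of string                         (* variables x *)
      | Lam of string * ty * stmt             (* λx:T. s *)
    and dfn =
      | Prim of string * string * expr list   (* x = op(ē) *)
      | Call of string * ty * expr * expr     (* x : T = e(e) *)
      | CastDef of string * expr * crc        (* x = e : ĉ *)
    and stmt =
      | Def of dfn * stmt                     (* d; s *)
      | Return of expr                        (* return e *)
      | TailCall of expr * expr * crc         (* return e(e) : ĉ *)
    and simple =
      | IntV of int                           (* constants as values *)
      | Closure of string * ty * stmt * env   (* ⟨λx:T. s, ρ⟩ *)
    and value =
      | Simple of simple                      (* ṽ *)
      | Wrapped of simple * crc               (* ṽ : c̄ *)
    and env = (string * value) list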
The evaluation function V maps expressions and environments to values in the usual way.

    V(k, ρ) = k
    V(x, ρ) = ρ(x)
    V(λx:T. s, ρ) = ⟨λx:T. s, ρ⟩

The following auxiliary function applies a coercion to a value.

    cast(ṽ, ĉ)       = ṽ         if ĉ = ι_B
                     = blame ℓ   if ĉ = Failℓ
                     = ṽ : ĉ     otherwise

    cast(ṽ : c̄1, ĉ2) = ṽ         if (c̄1 ; ĉ2) = ι_B
                     = blame ℓ   if c̄1 ; ĉ2 ⟶* Failℓ
                     = ṽ : c̄3    if c̄1 ; ĉ2 ⟶* c̄3 and c̄3 ≠ Failℓ
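The cast function can be sketched as follows. I repeat compact versions of the earlier types so the snippet compiles on its own, elide the closure payload, and take the normalizer as a parameter; for uniformity this sketch normalizes in both cases, whereas in the definition above the coercion applied to an unwrapped value is already in normal form.

    (* A sketch of cast(v, c), parameterized over a coercion normalizer. *)
    type ty = Dyn | Base of string | Arrow of ty * ty
    type crc = Id of ty | Inj of ty | Proj of ty * string
             | Fun of crc * crc | Seq of crc * crc | Fail of string

    type simple  = IntV of int | Closure             (* closure payload elided *)
    type value   = Simple of simple | Wrapped of simple * crc
    type outcome = Value of value | Blame of string  (* blame ℓ *)

    let cast (normalize : crc -> crc) (v : value) (c : crc) : outcome =
      let sv, combined = match v with
        | Simple sv        -> sv, c                   (* cast(ṽ, ĉ)       *)
        | Wrapped (sv, c1) -> sv, Seq (c1, c)         (* cast(ṽ : c̄1, ĉ2) *)
      in
      match normalize combined with
      | Id (Base _) -> Value (Simple sv)              (* ι_B: drop the coercion *)
      | Fail l      -> Blame l                        (* blame ℓ                *)
      | c'          -> Value (Wrapped (sv, c'))       (* keep a single wrapper  *)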
The machine state has the form (s, ρ, κ), where the stack κ is essentially a list of frames. The frames are somewhat unusual in that they include a pending coercion. We also need a pending coercion for the empty stack, so we define stacks as follows.

    σ ::= [] | (x, s, ρ) :: κ
    stacks    κ ::= (c, σ)

The machine has seven transition rules to handle normal execution and two rules to handle cast errors.

    (x : T = e1(e2); s, ρ, κ) ⟼ (s′, ρ′[y := v2], (ι_T, (x, s, ρ) :: κ))
        where V(e1, ρ) = ⟨λy:T. s′, ρ′⟩ and V(e2, ρ) = v2

    (x : T1 = e1(e2); s, ρ, κ) ⟼ (s′, ρ′[y := v2′], (c2, (x, s, ρ) :: κ))
        where V(e1, ρ) = ⟨λy:T2. s′, ρ′⟩ : (c1 → c2), V(e2, ρ) = v2, and cast(v2, c1) = v2′

    (x = op(ē); s, ρ, κ) ⟼ (s, ρ[x := v], κ)
        where v = δ(op, V(ē, ρ))

    (x = e : c; s, ρ, κ) ⟼ (s, ρ[x := v′], κ)
        where V(e, ρ) = v and cast(v, c) = v′

    (return e, ρ, (c, (x, s, ρ′) :: κ)) ⟼ (s, ρ′[x := v′], κ)
        where V(e, ρ) = v and cast(v, c) = v′

    (return e1(e2) : c1, ρ, (c2, σ)) ⟼ (s, ρ′[y := v2], (ĉ3, σ))
        where V(e1, ρ) = ⟨λy:T. s, ρ′⟩, V(e2, ρ) = v2, and (c1 ; c2) ⟶* ĉ3

    (return e1(e2) : c1, ρ, (c2, σ)) ⟼ (s, ρ′[y := v2′], (ĉ5, σ))
        where V(e1, ρ) = ⟨λy:T. s, ρ′⟩ : (c3 → c4), V(e2, ρ) = v2, cast(v2, c3) = v2′, and (c4 ; c1 ; c2) ⟶* ĉ5

    (x = e : c; s, ρ, κ) ⟼ blame ℓ
        where V(e, ρ) = v and cast(v, c) = blame ℓ

    (return e, ρ, (c, σ)) ⟼ blame ℓ
        where V(e, ρ) = v and cast(v, c) = blame ℓ
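To highlight how the stack threads a pending coercion, here is a sketch of just the stack structure and the two call transitions, with frame contents left abstract and the normalizer taken as a parameter; everything else about the machine is omitted.

    (* A sketch of stacks κ ::= (c, σ) with σ ::= [] | frame :: κ. *)
    type ty = Dyn | Base of string | Arrow of ty * ty
    type crc = Id of ty | Inj of ty | Proj of ty * string
             | Fun of crc * crc | Seq of crc * crc | Fail of string

    type 'frame stack = crc * 'frame rest
    and  'frame rest  = Empty | Push of 'frame * 'frame stack

    (* A non-tail call saves the caller in a new frame and starts the callee
       with the identity coercion ι_T as its pending coercion. *)
    let push_call (frame : 'f) (t : ty) (kappa : 'f stack) : 'f stack =
      (Id t, Push (frame, kappa))

    (* A tail call pushes nothing: it sequences the tail coercion c1 with the
       pending coercion c2, normalizes, and reuses the rest of the stack. *)
    let tail_call (normalize : crc -> crc) (c1 : crc) ((c2, rest) : 'f stack)
        : 'f stack =
      (normalize (Seq (c1, c2)), rest)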
The space efficiency of this machine comes from two places. First, a value only ever includes a single coercion wrapped around a simple value; the cast function maintains this invariant by normalizing whenever a cast is applied to an already-casted value. The second place is the transition rule for tail calls: the coercion c1 in tail position is sequenced with the pending coercion c2 and normalized to ĉ3, which becomes the new pending coercion.
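To see the tail-call rule at work, consider the second even/odd example from the beginning of the post (this is my own hand-evaluation under the D strategy, writing ℓ for the blame label). The cast on the call in even compiles to Bool! and the cast on the call in odd compiles to Bool?ℓ. Starting with pending coercion ι, the tail call in even updates it to Bool! ; ι = Bool!; the tail call in odd updates it to Bool?ℓ ; Bool!, which is already normal; the next tail call in even yields Bool! ; Bool?ℓ ; Bool! ⟶* Bool!, and so on. The pending coercion just alternates between Bool! and Bool?ℓ ; Bool!, and since no frames are pushed, the machine runs in constant space.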
While this machine is space efficient, it is not efficient with respect to time. The reason is that naive coercion reduction is an expensive process: one must search for a redex, reduce it, and plug the contractum back in. Furthermore, the associativity of coercion sequencing makes searching for a redex more challenging. In the next post I'll discuss a recursive function that directly maps a pair of coercions in normal form to the normal form of their composition, using ideas from the paper Threesomes, With and Without Blame, which I coauthored with Philip Wadler.