Loading [MathJax]/jax/output/HTML-CSS/jax.js

Sunday, October 15, 2017

New revision of the semantics paper (POPL rejection, ESOP submission)

My submission about declarative semantics to POPL was rejected. It's been a few weeks now, so I'm not so angry about it anymore. I've revised the paper and will be submitting it to ESOP this week.

The main reason for rejection according to the reviewers was a lack of technical novelty, but I think the real reasons were that 1) the paper came across as too grandiose and as a result, it accidentally annoyed the reviewer who is an expert in denotational semantics, and 2) the paper did not do a good job of comparing to the related set-theoretic models of Plotkin and Engeler.

Regarding 1), in the paper I use the term "declarative semantics" to try and distance this new semantics from the standard lattice-based denotational semantics. However, the reviewer took it to claim that the new semantics is not a denotational semantics, which is clearly false. In the new version of the paper I've removed the term "declarative semantics" and instead refer to the new semantics as a denotational semantics of the "elementary" variety. Also, I've toned down the sales pitch to better acknowledge that this new semantics is not the first elementary denotational semantics.

Regarding 2), I've revised the paper to include a new section at the beginning that gives background on the elementary semantics of Plotkin, Engeler, and Coppo et al. This should help put the contributions of the paper in context.

Other than that, I've added a section with a counter example to full abstraction. A big thanks to the POPL reviewers for the counter example! (Also thanks to Max New, who sent me the counter example a couple months ago.)

Unfortunately, the ESOP page limit is a bit shorter, so I removed the relational version of the semantics and also the part about mutable references.

A draft of the revision is available on arXiv. Feedback is most welcome, especially from experts in denotational semantics! I really hope that this version is no longer annoying, but if it is, please tell me!

Tuesday, October 03, 2017

Comparing to Plotkin and Engeler's Set-theoretic Models of the Lambda Calculus

On the plane ride back from ICFP last month I had a chance to re-read and better understand Plotkin’s Set-theoretical and other elementary models of the λ-calculus (Technical Report 1972, Theoretical Computer Science 1993) and to read, for the first time, Engeler’s Algebras and combinators (Algebra Universalis 1981). As I wrote in my draft paper Declarative semantics for functional languages: compositional, extensional, and elementary, the main intuitions behind my simple semantics are present in these earlier papers, but until now I did not understand these other semantics deeply enough to give a crisp explanation of the similarities and differences. (The main intuitions are also present in the early work on intersection type systems, and my semantics is more closely related to those systems. A detailed explanation of that relationship is given in the draft paper.)

I should note that Engeler’s work was in the context of combinators (S and K), not the λ-calculus, but of course the λ-calculus can be encoded into combinators. I’ve ported his definitions to the λ-calculus, along the lines suggested by Plotkin (1993), to make for easier comparison. In addition, I’ll extend both Engeler and Plotkin’s semantics to include integers and integer arithmetic in addition to the λ-calculus. Here’s the syntax for the λ-calculus that we consider here: nZxX(program variables)::=+×÷Ee::=neexλx.eeeifetheneelsee

Values

Perhaps the best place to start the comparison is in the definition of what I’ll call values. All three semantics give an inductive definition of values and all three involve finite sets, but in different ways. I’ll write VS for my definition, VP for Plotkin’s, and VE for Engeler’s. VS=Z+Pf(VS×VS)VP=Z+Pf(VP)×Pf(VP)VE=Z+Pf(VE)×VE In VS, a function is represented as a finite graph, that is, a finite set of input-output pairs. For example, the graph {(0,1),(1,2),(2,3)} is one of the meanings for the term (λx.x+1).

Plotkin’s values VP include only a single input-output pair from a function’s graph. For example, ({0},{1}) is one of the meanings for the term (λx.x+1). Engeler’s values also include just a single entry. For example, ({0},1) is one of the meanings for the term (λx.x+1). In this example we have not made use of the finite sets in the input and output of Plotkin’s values. To do so, let us consider a higher-order example, such as the term (λf.f1+f2). For Plotkin, the following value is one of its meanings: ({({1},{3}),({2},{4})},{7}) That is, in case f is the function that adds 2 to its input, the result is 7. We see that the presence of finite sets in the input is needed to accomodate functions-as-input. The corresponding value in VS is {({(1,3),(2,4)},7)}

The difference between Plotkin and Engeler’s values can be seen in functions that return functions. Consider the K combinator (λx.λy.x). For Plotkin, the following value is one of its meanings: ({1},{({0},{1}),({2},{1})}) That is, when applied to 1 it returns a function that returns 1 when applied to either 0 or 2. The corresponding value in VS is {(1,{(0,1),(2,1)})} For Engeler, there is not a single value corresponding to the above value. Instead it requires two values to represent the same information. ({1},({0},1))and({1},({2},1)) We’ll see later that it doesn’t matter that Engeler requires more values to represent the same information.

The Domains

The semantics of Plotkin, Engeler, and myself does not use values for the domain, but instead a set of values. That is P(VS)P(VP)P(VE)

The role of the outer P is intimately tied to the meaning of functions in Plotkin and Engeler’s semantics because the values themselves only record a single input-output pair. The outer P is needed to represent all of the input-output pairs for a given function. While the P is also necessary for functions in my semantics, one can view it generically as providing non-determinism and therefore somewhat orthogonal to the meaning of functions per se. Next let’s take a look at the semantics.

Comparing the Semantics

Here is Plotkin’s semantics EP. Let V,V range over finite sets of values. EP[[n]]ρ={n}EP[[e1e2]]ρ={n1n2n1EP[[e1]]ρn2EP[[e2]]ρ}EP[[x]]ρ=ρ(x)EP[[λx.e]]ρ={(V,V)VEP[[e]]ρ(x:=V)}EP[[e1e2]]ρ={V|V.(V,V)EP[[e1]]ρVEP[[e2]]ρ}EP[[ife1thene2elsee3]]ρ={v|n.nEP[[e1]]ρ(n0vEP[[e2]]ρ)(n=0vEP[[e3]]ρ)} For Plotkin, the environment ρ maps variables to finite sets of values. In the case for application, the input set V must be a subset of the meaning of the argument, which is critical for enabling self application and, using the Y combinator, general recursion. The flattens the set-of-finite-sets into a set.

Next we consider Engeler’s semantics EE. EE[[n]]ρ={n}EE[[e1e2]]ρ={n1n2n1EE[[e1]]ρn2EE[[e2]]ρ}EE[[x]]ρ=ρ(x)EE[[λx.e]]ρ={(V,v)vEE[[e]]ρ(x:=V)}EE[[e1e2]]ρ={v|V.(V,v)EE[[e1]]ρVEE[[e2]]ρ}EE[[ife1thene2elsee3]]ρ={v|n.nEE[[e1]]ρ(n0vEE[[e2]]ρ)(n=0vEE[[e3]]ρ)} The semantics is quite similar to Plotkin’s, as again we see the use of in the case for application. Because the output v is just a value, and not a finite set of values as for Plotkin, there is no need for the .

Finally we review my semantics ES. For it we need to define an ordering on values that is just equality for integers and on function graphs. Let t range over Pf(V×V). nnt1t2t1t2 Then we define ES as follows. ES[[n]]ρ={n}ES[[e1e2]]ρ={n1n2n1ES[[e1]]ρn2ES[[e2]]ρ}ES[[x]]ρ={vvρ(x)}ES[[λx.e]]ρ={t(v,v)t.vES[[e]]ρ(x:=v)}ES[[e1e2]]ρ={v|tv2v3v3.tES[[e1]]ρv2ES[[e2]]ρ(v3,v3)tv3v2vv3}ES[[ife1thene2elsee3]]ρ={v|n.nES[[e1]]ρ(n0vES[[e2]]ρ)(n=0vES[[e3]]ρ)} In my semantics, ρ maps a variable to a single value. The v3v2 in my semantics corresponds to the uses of in Plotkin and Engeler’s. One can view this as a kind of subsumption, allowing the use of a larger approximation of a function in places where a smaller approximation is needed. I’m not sure whether all the other uses of are necessary, but the semantics needs to be downward closed, and the above placement of ’s makes this easy to prove.

Relational Semantics

For people like myself with a background in operational semantics, there is another view of the semantics that is helpful to look at. We can turn the above dentoational semantics into a relational semantics (like a big-step semantics) that hides the P by making use of the following isomorphism (where V is one of VS, VP, or VE). E(XV)P(V)E×(XV)×V Let v range over V. We can define the semantic relation ρSev that corresponds to ES as follows. Note that in the rule for lambda abstraction, the table t comes out of thin air (it is existentially quantified), and that there is one premise in the rule per entry in the table, that is, we have the quantification (v,v)t. ρSnnρSe1n1ρSe2n2ρSe1e2n1n2vρ(x)ρSxv(v,v)t.ρ(x:=v)SevρSλx.etρSe1tρSe2v2(v3,v3)tv3v2vv3ρS(e1e2)vρSe1nn0ρSe2vρSife1thene2elsee3vρSe10ρSe3vρSife1thene2elsee3v

For comparison, let us also turn Plotkin’s semantics into a relation. ρPnnρPe1n1ρPe2n2ρPe1e2n1n2vρ(x)ρPxvvV.ρ(x:=V)PevρPλx.e(V,V)ρPe1(V,V)v2V.ρPe2v2vVρP(e1e2)vρPe1nn0ρPe2vρPife1thene2elsee3vρPe10ρPe3vρPife1thene2elsee3v Recall that in Plotkin’s semantics, the environment maps variables to finite sets of values. The “set” is needed to handle the case of a function bound to a variable, but is just extra baggage when we have an integer bound to a variable. So in the variable rule we have vρ(x), which either extracts a singleton integer from ρ(x), or extracts one input-output entry from a function’s graph. Moving on to the lambda rule, it only produces one input-output entry, but to handle the case when the output V is representing a function, we must build it up one entry at a time with the quantification vV and a finite but arbitrary number of premises. In the application rule we again have a finite number of premises, with v2V, and also the premise vV.

The relational version of Engeler’s semantics removes the need for quantification in the lambda rule, but the application rule still has v2V. ρ(x:=V)EevρEλx.e(V,v)ρEe1(V,v)v2V.ρEe2v2ρE(e1e2)v

Conclusion

My semantics is similar to Plotkin and Engeler’s in that

  1. The domain is a set of values, and values are inductively defined and involve finite sets.

  2. Self application is enabled by allowing a kind of subsumption on functions.

The really nice thing about all three semantics is that they are simple; very little mathematics is necessary to understand them, which is important pedagogically, practically (easier for practitioners to apply such semantics), and aesthetically (Occam’s razor!).

My semantics is different to Plotkin and Engeler’s in that

  1. the definition of values places Pf so that functions are literally represented by finite graphs, and

  2. environments map each variable to a single value, and

  3. is used instead of to enable self application.

The upshot of these (relatively minor) differences is that my semantics may be easier to understand.