Sets, Tuples, Relations, and Definition by RulesI suspect many readers will already be familiar with sets, tuples, and relations, but if you are not familiar with inductive definitions, then please make sure to read the subsection below titled Definition by Rules.
SetsThe main building block that we use in PL theory is the set, a collection of objects (also called elements), such as the set containing the first three natural numbers: The only thing that matters is whether an object is in the set or not; it doesn't make sense to ask if there are duplicates of an object or in what order the objects appear in the set. For example, the set is the same set as the set listed above. The notation means "in", so is true and is false. Sets may have an infinite number of elements, such as the set of all natural numbers (non-negative integers), written .
TuplesAnother building block is the tuple, which is an ordered collection of objects. So is a tuple of three elements and it is different from the tuple . The subscript notation retrieves the ith element of tuple . For example, if , then . Tuples contain only a finite number of elements and usually less than a handful. Sometimes angle brackets are used for tuples instead of parentheses, such as .
RelationsPutting tuples and sets together we get relations. That is, a relation is a set of tuples. We often use relations to represent a mapping from input to output. For example, the above relation can be thought of as mapping a natural number to its successor, that is, to the next greater natural number. The above definition is rather imprecise because of the elipses (). Fortunately, there are more precise notations for describing infinite sets and relations.
Definition by RulesThe main way that we define infinite sets in PL theory is by giving a list of rules for what is in the set. Let's use the name for the above relation. Then the following two rules give a precise definition of . Notice that the second rule is recursive in that it refers to itself. That's ok and quite common.
- For any natural numbers and , if , then .
Some sets of rules are nonsensical and do not define a set. For example, rules should not be contradictory as in the following.
A common notation for rules such as the above uses a horizontal line in place of "if" and "then". For example, an equivalent definition of is given by the following. We have dropped the "For any natural numbers and " part of rule 2. The convention is that variables such as and that appear in a rule can be replaced by any object of the correct type, in this case, a natural number. Often the "correct type" is something you can deduce from the context of the discussion, in this case, the natural numbers.
Suppose that I claim that a particular element is in , say . You might respond by saying that you don't believe me. To convince you, I need to show you how the rules justify that , that is, I need to show you a derivation. A derivation is a chaining together of rules, replacing variables such as and with particular objects and replacing premises such as with sub-derivations. I've labelled each step in the derivation with the rule number.
The fancy name for what I'm calling Definition by Rules is inductive definition. (My daughter love the Fancy Nancy series of books.)
Language Syntax and GrammarsIt turns out that using rules to defines sets, as we did above, is how we define the syntax of a programming language. Suppose we'd like to define a simple language of integer arithmetic, call it , including expressions such as 1 + 3 and -(5 + 2). Recall that is the set of all integers. Then here's a bunch of rules that we might use to define :
- For any , .
- For any , if , then -.
- For any and , if and , then .
- For any , if , then .
Backus-Naur Form (BNF) is another common notation for writing rules that define the syntax of a language, but the meaning is the same. (There are several variations on BNF; I forget which one I'm using here.) The bunch of rules is referred to as a grammar.
Arith ::= integer Arith ::= "-" Arith Arith ::= Arith "+" Arith Arith ::= "(" Arith ")"A vertical bar (meaning "or") is often used to make such syntax definitions more concise, as follows.
Arith ::= integer | "-" Arith | Arith "+" Arith | "(" Arith ")"In PL theory, we use a peculiar variation on BNF that replaces the name of the language being defined, in this case , with the variable that is used to range over elements of . So suppose we are using the variable as a placeholder for any integer and as a placeholder for elements of . Then we would write the above grammar as Note that I've dropped the parentheses. It's generally understood that parentheses are allowed in any language.
The notion of derivation coincides with that of a parse tree, they both demonstrate why a particular element is in a set.
Operational SemanticsA language is brough to life by defining what it means to run a program in the language, that is, the operational semantics of a language. In the case of , we just need to specify the integer output of each program. As discussed above, relations can be used to map inputs to outputs, and indeed, we typically use relations for this purpose in PL theory. There's several different styles of relations, the first we'll discuss is a big-step style of semantics that maps a program directly to its output.
Big-step SemanticsLet's define a relation that maps elements of to integers. For example, we'd like to have . This relation will be infinite (because there are an infinite number of programs in ), so again we'll use a bunch of rules to define . But before we start, it's common to introduce some shorthand: means . Below we state the rules that define using the horizontal notation. To make sure we don't leave out any programs, we create one rule for each syntactic rule of (there are three). We say that the rules are syntax-directed when there is one rule for each syntactic rule in the language. It may seem a little odd that I'm defining - in terms of , and similarly for +. Isn't that circular? No, the and are the usual arithmetic operators for integers that everyone learned in grade school. In this way, the language is rather odd in not using 32 or 64-bit arithmetic. An implementor of would have to use a big-integer package to properly handle the arithmetic.
Small-step SemanticsThe second, and perhaps more common, style of operational semantics is small-step semantics. In this style, the relation doesn't map a program to its output, but instead it maps a program to a slightly simplified program in which one subexpression has been computed. This style of semantics can be thought of as textual rewriting. To give an example of this style, let's define a relation named . We'll want this relation to have the following elements, among many others: Again, we'll introduce shorthand: means . Also, we'll chain together steps, so means and . The term reduce is a synonym for step. The above example of two steps can now be written as OK, on to the rules that define the relation. There are five rules, which we explain below. Rules (1) and (2) are the most interesting; they perform the arithmetic. We call them computational reduction rules. Rules (3-5) allow us to reach inside of sub-expressions to perform computation. They are often called congruence rules for reasons we won't go into. The use of the variable in rule (5) means that reduction proceeds from left to right. In particular, we're not allowed to reduce the right-hand expression of a plus until we've already reduced the left-hand side to an integer.
Aside: This left-to-right ordering is a choice that I made as the designer of this example language. I could have not specified an ordering, letting it be non-deterministic. However, this example language doesn't have side-effects, so the ordering doesn't matter! However, most language do have side-effects and they do specify an ordering (but not all!), so I thought to include an example of how ordering is typically specified.
Time for an example: let's see the derivation of the step .
We've defined a single step of computation, the relation, but we're not quite done. We still need to specify what it means to run a program to completion. We'll do this by defining a relation in terms of the relation as follows. In plain Engilish, the relation will contain any pair if expression reduces to integer in zero or more steps. Some of the notation here is new and is explained below. The notation is the set builder or set comprehension notation for defining a set. The stuff to the left of the vertical bar is a template for a typical element of the set and the stuff to the right of the vertical bar places restrictions on the elements in the set. The notation means zero or more steps. I like to define this multi-step relation with the following rules: (My brain is good at reasoning about Lisp-style lists, so I think of the first rule as nil and the second rule as cons.)
Type Systems (with the Lambda Calculus as an example)Many programming languages are statically typed, that is, the compiler performs some sanity checking before proceeding with the actual work of compiling. The checking usually involves make sure that objects are only used as intended, for example, not trying to treat an integer as if it were a function. The way a programming language designer (PL theorist) specifies what kind of sanity checking should be performed is by defining a type system for the language. The language is so simple that there is no interesting type checking to be performed. Let's consider a slightly larger language that also happens to be used over and over again in PL theory, the lambda calculus (technically, the simply-typed lambda calculus). The lambda calculus just consists of first-class anonymous functions. Here we'll extend the lambda calculus to also include our arithmetic expressions. So now our example language is defined by the following grammar. The variable ranges over parameter names, such as foo and g. Two expressions right next to each other denote function application (i.e., function call). So if you're familiar with the C language, read as . In the lambda calculus, functions only take one parameter, so function calls only require one argument. The syntax creates a function with one parameter named of type (types will be defined shortly) and whose body is the expression . (A common point of confusion is to think that is the name of the function. It instead is the parameter name. The function is anonymous, i.e. it doesn't have a name.) The return value of the function will be whatever the expression evaluates to.
Now let's consider what kind of objects will be alive when we run the program: there's integers and functions. We'll create a set of types to describe the kinds of objects, using to range over the set of types. In a function type , the is the type of the parameter and is the return type.
The job of a type system is to predict which type of value will be produced by an expression. For example, the expression -(5 + 2) should have the type Int because the result of -(5 + 2) is -3, which is an integer. As with the syntax and operational semantics of a language, PL theorists use relations and rules to define a type system. We'll define a relation named that, as a first approximation, maps expressions to types, so for example, we'll have .
However, because the lambda calculus includes variables, we'll need something analogous to a symbol table, a relation called a type environment, to keep track of which variables have which types. The Greek letter (gamma) is traditionally used for this purpose. We'll need to be able to create new type environments from old ones, potentially overshadowing variable definitions from outer scopes. To set up the mathematical machinery for that, we define to be the relation just like except that any tuple starting with is removed. (The way the type system will be defined, there may be 0 or 1 tuple that starts with , making the type environment a special kind of relation called a partial function.) We'll write for the operation of extending a type environment with variable x, possibly overriding a previous definition, and define it as follows: Suppose we have Then
One way in which type environments are different from the global symbol table in a compiler is that there isn't just one type environment, there will be lots of them, one for each scope. Further, we won't ever update a type environment in place, we'll keep creating new ones that differ a little bit from the old ones. From a programming perspective, the mathematical metalanguage we're using here is pure functional, that is, it doesn't use state or side effects. The reader might worry that this might lead to inefficiency, but remember, we're not writing a program here, we're writing a specification! Clarity is what matters most in this setting, and staying pure helps to make things clear.
Getting back to the relation, instead of containing 2-tuples (pairs) it will contain 3-tuples (triples) of the form , so we'll be assigning types to expressions in the context of a type environment. As yet more shorthand (PL theorists love shorthand!), we'll write instead of . We're now ready to write down the rules that define . To sum up the above rules, the arithmetic operators work on integers, variables get their types from the environment, lambdas are given function types based on their parameter type and their deduced return type, the body of a lambda is checked using the environment from the point of creation (this is lexical scoping) extended with the lambda's parameter, and function application is sane so long as the argument's type is the same as the parameter type.