Operational Semantics
Operational semantics is an alternative style of semantics that emphasises the steps taken during the execution of a program. We will construct an operational semantics for While programs, building on the denotational semantics for arithmetic and Boolean expressions.
Recall that programs (i.e. statements) are given by the following grammar:
\[S \to \mathsf{skip} \mid x \leftarrow A \mid S; S \mid \mathsf{if}\ B\ \mathsf{then}\ S\ \mathsf{else}\ S \mid \mathsf{while}\ B\ \mathsf{do}\ S\]where $A$ stands for any arithmetic expression and $B$ stands for any Boolean expression.
As with denotational semantics, we will work with the abstract syntax trees of statements rather than the strings that produced them, using $\mathcal{S}$ to refer to the set of statements. Therefore, we needn’t consider parentheses or braces as part of this grammar.
The type of operational semantics that we will employ is sometimes called big-step operational semantics. This framework involves defining a relation $S,\, \sigma \Downarrow \sigma'$ which says that the program $S \in \mathcal{S}$, started in the initial state $\sigma$, executes and leaves a final state $\sigma'$. We are using an infix notation for this relation ${\Downarrow} \subseteq \mathcal{S} \times \mathsf{State} \times \mathsf{State}$; equally, we could have written $(S,\, \sigma,\, \sigma') \in {\Downarrow}$. The name big-step semantics refers to the fact that this relation doesn’t expose any intermediate states of the computation - only the initial state $\sigma \in \mathsf{State}$ and the final state $\sigma' \in \mathsf{State}$.
If you squint, it may seem that this is quite similar to the denotational setup except we’re using a relation instead of a function and the result is a state rather than a value. The difference lies in the way these relations/functions are defined. Recall that the denotational semantics was given by recursion over the expression. For the operational semantics, we will define the relation inductively.
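To make these ideas concrete, it can help to fix a representation of statements and states. The following is a minimal Haskell sketch, not part of the course material: the names `Stmt`, `State`, `aeval` and `beval` are choices made here for illustration, with `aeval` and `beval` standing in for the denotational semantics $\llbracket \cdot \rrbracket_A$ and $\llbracket \cdot \rrbracket_B$ of arithmetic and Boolean expressions.

```haskell
import qualified Data.Map as Map

type Var   = String
type State = Map.Map Var Integer     -- variables not in the map are treated as 0

-- Abstract syntax trees for a fragment of the expression languages (illustrative).
data AExp = Lit Integer | Ref Var | Add AExp AExp
data BExp = Leq AExp AExp | Not BExp

-- Statements, mirroring the grammar of While programs.
data Stmt
  = Skip                 -- skip
  | Assign Var AExp      -- x <- e
  | Seq Stmt Stmt        -- S1; S2
  | If BExp Stmt Stmt    -- if B then S1 else S2
  | While BExp Stmt      -- while B do S

-- Denotational semantics of expressions, standing in for [[e]]_A and [[e]]_B.
aeval :: AExp -> State -> Integer
aeval (Lit n)   _     = n
aeval (Ref x)   sigma = Map.findWithDefault 0 x sigma
aeval (Add a b) sigma = aeval a sigma + aeval b sigma

beval :: BExp -> State -> Bool
beval (Leq a b) sigma = aeval a sigma <= aeval b sigma
beval (Not b)   sigma = not (beval b sigma)
```

Later in these notes, the big-step rules for each construct are mirrored as clauses of a function over these types.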
Inference Systems & Derivations
Already in this course, we have met some inductively defined sets. In particular, the natural numbers are inductively defined, as are the sets of arithmetic and Boolean expressions.
Recall the grammar defining the natural numbers:
\[N \to 0 \mid N + 1\]Another way of describing this set is as the least set $N$ such that: (1) $0 \in N$; and, (2) if $n \in N$ is a natural number, then $n + 1 \in N$ is also a natural number. Both of these clauses can be thought of as rules for generating the inhabitants of our set. In particular, the first rule says that $0$ is included, and the second rule allows us to generate new instances from existing ones.
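For comparison, this inductive definition can be transcribed directly as a data type; the following Haskell fragment is only an illustration, and the names `Nat`, `Zero` and `Succ` are not taken from the notes. The two constructors correspond exactly to the two generating rules.

```haskell
-- The least set generated by the two rules: Zero plays the role of 0,
-- and Succ n plays the role of n + 1.
data Nat = Zero | Succ Nat

-- The element 0 + 1 + 1, built by applying the second rule twice to the first.
two :: Nat
two = Succ (Succ Zero)
```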
An Aside
You might ask why it has to be the least set in particular. Actually, if this were not specified, then you could include quite unnatural numbers such as $-1$. Although these values aren’t specifically generated by the grammar, they don’t violate the previous description as long as we also include $-1 + 1$ and $-1 + 1 + 1$ etc. to satisfy the last clause.
Other inductively defined sets work similarly, consisting of a set of rules for building up the set of instances by combining and modifying existing instances. However, not all inductively defined sets are given as grammars. More generally, an inductively defined set is given via a set of inference rules from which the set is generated.
An inference rule is a logical statement of the form:
\[\dfrac {P_1 \quad \cdots \quad P_n} {Q}\]This is interpreted as the statement: if $P_1$ and $P_2$ and … and $P_n$ are all true, then $Q$ is also true. We refer to each of $P_1$, …, $P_n$ as a premise and to $Q$ as the conclusion.
An inductively defined set $S$ is the least set to satisfy a set of inference rules with premises and conclusion of the form $e \in S$.
Using this notation for inference rules, we express the definition of the natural numbers $N$ as follows:
\[\begin{array}{cc} \dfrac {} {0 \in N} & \dfrac {n \in N} {n + 1 \in N} \end{array}\]When undefined symbols (e.g. “$n$”) appear in the premises and conclusion of an inference rule, they are universally quantified. That means the rule applies for all concrete instances. For example, the second inference rule allows us to conclude that $0 + 1 \in N$ given that $0 \in N$, where we have instantiated the symbol $n$ with the concrete value $0$. These symbols are referred to as metavariables. To write down these concrete instances of inference rules - referred to as inferences - we use the same notation, e.g.:
\[\begin{array}{cc} \dfrac {0 \in N} {0 + 1 \in N} & \dfrac {0 + 1 \in N} {0 + 1 + 1 \in N} \end{array}\]One of the nice features of this notation is that multiple inferences can be glued together to present a consistent argument. For example, we can combine the above two inferences and the inference rule for $0 \in N$ to create the following argument:
\[\dfrac { \dfrac { \dfrac {} {0 \in N} } {0 + 1 \in N} } {0 + 1 + 1 \in N}\]These compound inferences are referred to as derivations. In general, derivations have the structure of a tree, rather than a sequence of rules, to account for the possibility of inference rules with multiple premises.
Informally, a derivation tree is a tree where each node is associated with an instance of an inference rule such that, for each premise, there is a child node with that premise as its conclusion. In particular, the leaves of the tree are associated with inference rules that have no premises.
The idea of a derivation is common between inference systems and grammars. When talking about a grammar, the derivations are used to define the language: a string is an element of the language if, and only if, there exists a derivation witnessing this. Derivations work analogously with inference systems: an inductively defined set is inhabited by a particular element if, and only if, there exists a derivation witnessing this.
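One way to make the tree structure of derivations concrete is as a data type in which each node records the conclusion of the inference used and carries one sub-derivation per premise. The sketch below is purely illustrative (the names `Derivation` and `Node` are not from the notes), with judgements represented as strings.

```haskell
-- A derivation tree: a conclusion together with one sub-derivation per premise.
-- Leaves are nodes whose list of premises is empty.
data Derivation j = Node j [Derivation j]

-- The derivation of 0 + 1 + 1 ∈ N from above, written as such a tree.
exampleDerivation :: Derivation String
exampleDerivation =
  Node "0 + 1 + 1 ∈ N"
    [ Node "0 + 1 ∈ N"
        [ Node "0 ∈ N" [] ]   -- a leaf: the rule for 0 has no premises
    ]
```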
Skip, Assignment, and Composition
In the same way that the two aforementioned rules are used to define the natural numbers inductively, we will use similar rules to define the operational semantics of While programs. That is, we will define the relation ${\Downarrow} \subseteq \mathcal{S} \times \mathsf{State} \times \mathsf{State}$ as the least set to satisfy certain rules that describe the behaviour of each language construct. Let’s start with the rules for skip, assignment, and composition:
-
The skip statement $\mathsf{skip}$ does not affect the program state. Therefore, if the program $\mathsf{skip}$ starts in the state $\sigma$, it will end in the state $\sigma$. We characterise this behaviour as the inference rule:
\[\dfrac {} {\mathsf{skip},\, \sigma \Downarrow \sigma}\]Here we are using our convenient infix notation for the relation, but this rule should be read as the logical statement: for all states $\sigma \in \mathsf{State}$, $(\mathsf{skip},\, \sigma,\, \sigma) \in {\Downarrow}$.
Notice the metavariable $\sigma$, which is a placeholder for an arbitrary state rather than any particular state. It can be thought of as a parameter of the rule that can be instantiated with a particular state of interest to get concrete instances. For example, the rule allows us to conclude that $\mathsf{skip},\, [x \mapsto 2] \Downarrow [x \mapsto 2]$ where the metavariable $\sigma$ is instantiated with the concrete state $[x \mapsto 2]$.
-
The assignment statement $x \leftarrow e$ executes in a given state $\sigma$ by evaluating the arithmetic expression $e$ in this state, and updating the state so that $x$ receives this value.
As with the $\mathsf{skip}$ statement, the behaviour of this statement can be described without making reference to other statements as it is not a compound statement (i.e. it is a base case). Therefore, we give the inference rule for its execution directly.
\[\dfrac {} {x \leftarrow e,\, \sigma \Downarrow \sigma[x \mapsto \llbracket e \rrbracket_A(\sigma)]}\]where the notation $\sigma[x \mapsto n]$ refers to the state that results from updating the value assigned to $x$ to be $n$.
Here, the variable $x$, the arithmetic expression $e$, and the state $\sigma$ are metavariables that are universally quantified by this rule. Logically, it corresponds to the statement: for all variables $x \in \mathsf{Var}$, arithmetic expressions $e \in A$, and states $\sigma \in \mathsf{State}$, $x \leftarrow e,\, \sigma \Downarrow \sigma[x \mapsto \llbracket e \rrbracket_A(\sigma)]$.
As with the rule for skip statements, we can instantiate these metavariables with a concrete variable, a concrete arithmetic expression, and a concrete state to get concrete instances of this rule. For example, the rule implies that $x \leftarrow 1,\, [x \mapsto 2] \Downarrow [x \mapsto 1]$ where we have instantiated the rule with the variable $x$, the arithmetic expression $1$, and the state $[x \mapsto 2]$.
-
The next example we will look at is the rule governing the operational semantics of composition $S_1 ; S_2$. In this case, the behaviour of the compound statement is precisely the behaviour of $S_1$ followed by the behaviour of the statement $S_2$. Therefore, the rule for this construct depends on the operational semantics of the two sub-statements $S_1$ and $S_2$. The rule is as follows:
\[\dfrac { S_1,\, \sigma \Downarrow \sigma' \quad S_2,\, \sigma' \Downarrow \sigma'' } {S_1; S_2,\, \sigma \Downarrow \sigma''}\]This rule says that if the relation $\Downarrow$ already includes the triples $(S_1,\, \sigma,\, \sigma')$ and $(S_2,\, \sigma',\, \sigma'')$, then it must also include the triple $(S_1; S_2,\, \sigma,\, \sigma'')$. Notice that the form of this rule is very similar to the inductive rule for natural numbers in that new instances are generated from existing instances, i.e. if $n \in N$, then $n+1 \in N$.
As with the two previous rules, the rule applies for all statements $S_1,\, S_2 \in \mathcal{S}$ and all states $\sigma,\, \sigma',\, \sigma'' \in \mathsf{State}$ - these are the rule’s metavariables. In order to use this rule, however, we need not only to instantiate the metavariables but also to find inhabitants of the relation $\Downarrow$ that determine the behaviour of the statements $S_1$ and $S_2$. Once we have done so, we can derive a new inhabitant describing the behaviour of the compound statement $S_1 ; S_2$. For example, if we already know that $x \leftarrow 2,\, [x \mapsto 1] \Downarrow [x \mapsto 2]$ and $x \leftarrow 3,\, [x \mapsto 2] \Downarrow [x \mapsto 3]$, then we can combine these facts to conclude that: $x \leftarrow 2; x \leftarrow 3,\, [x \mapsto 1] \Downarrow [x \mapsto 3]$
In order to justify our assumptions $x \leftarrow 2,\, [x \mapsto 1] \Downarrow [x \mapsto 2]$ and $x \leftarrow 3,\, [x \mapsto 2] \Downarrow [x \mapsto 3]$, we will have to apply more rules; in this case, the assignment rule.
As we can see, some rules, such as the rules for skip and assignment, can be used in isolation and do not depend on the behaviour of other statements. Other rules require us to already know the behaviour of some statements in order to characterise the behaviour of a compound statement. These two kinds of rules behave much like the base case and inductive case of the natural numbers. And, as with the natural numbers or other grammars, we can apply the inductive cases multiple times to derive more complex instances.
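These three rules can be mirrored as clauses of a function over the Haskell sketch given earlier. The name `bigStep` and the choice to present $\Downarrow$ as a function (rather than a relation) are assumptions of the sketch; this is reasonable here because, for a given statement and initial state, the rules determine at most one final state. The clauses for if and while statements are added further below.

```haskell
-- bigStep s sigma computes a state sigma' such that  s, sigma ⇓ sigma'.
-- Only the rules seen so far are covered; the If and While cases follow later.
bigStep :: Stmt -> State -> State
bigStep Skip         sigma = sigma                               -- skip leaves the state unchanged
bigStep (Assign x e) sigma = Map.insert x (aeval e sigma) sigma  -- sigma[x |-> [[e]]_A(sigma)]
bigStep (Seq s1 s2)  sigma =
  let sigma'  = bigStep s1 sigma    -- premise  S1, sigma  ⇓ sigma'
      sigma'' = bigStep s2 sigma'   -- premise  S2, sigma' ⇓ sigma''
  in  sigma''
```

For instance, under this sketch, `bigStep (Seq (Assign "x" (Lit 1)) (Assign "y" (Lit 2))) Map.empty` produces the state mapping $x$ to $1$ and $y$ to $2$, mirroring the derivation constructed in the next subsection.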
Computing with Inference Rules
We can use these inference rules to compute the behaviour of a given program. What we mean by this is: given a program $S \in \mathcal{S}$ and a state $\sigma \in \mathsf{State}$, find a state $\sigma’ \in \mathsf{State}$ such that $S,\, \sigma \Downarrow \sigma’$. We refer to $\sigma$ as the initial state and $\sigma’$ as the final state in such cases.
For example, to compute the behaviour of the program $x \leftarrow 1; y \leftarrow 2$ according to our operational semantics starting in the initial state $[]$ (i.e. where all variables are zero), we are asking which final states $\sigma$ there are (if any) such that $x \leftarrow 1; y \leftarrow 2,\, [] \Downarrow \sigma$.
Rather than trialling final states at random, we can approach such a question in a more strategic way. Ultimately, we are aiming for the statement $x \leftarrow 1; y \leftarrow 2,\, [] \Downarrow \sigma$ to be true for some $\sigma$. As the $\Downarrow$ relation is defined inductively, each instance must come with a derivation. Therefore, we can reframe the problem as looking for a derivation of $x \leftarrow 1; y \leftarrow 2,\, [] \Downarrow \sigma$ for some $\sigma$.
Key Idea
This process is the essence of big-step operational semantics: computation as modelled through the construction of a derivation.
A derivation of the statement $x \leftarrow 1; y \leftarrow 2,\, [] \Downarrow \sigma$ for some $\sigma$ must have the rule for composition at its root, as no other inference rule could apply. That is, $x \leftarrow 1; y \leftarrow 2,\, [] \Downarrow \sigma$ could only be true if we found another state $\sigma'$ such that $x \leftarrow 1,\, [] \Downarrow \sigma'$ and $y \leftarrow 2,\, \sigma' \Downarrow \sigma$. This observation allows us to break down our problem into sub-problems. So far, we have constructed the partial derivation:
\[\dfrac { x \leftarrow 1,\, [] \Downarrow \sigma' \quad y \leftarrow 2,\, \sigma' \Downarrow \sigma } {x \leftarrow 1; y \leftarrow 2,\, [] \Downarrow \sigma}\]where $\sigma$ and $\sigma'$ are yet to be determined.
Next, we turn to the sub-problems, i.e. the premises of our partial derivation, and see which inference rules can be used to solve them. The first sub-problem requires us to find a state $\sigma'$ such that $x \leftarrow 1,\, [] \Downarrow \sigma'$. As before, we can see that only one rule is applicable - the rule for assignment. The rule for assignment tells us that $\sigma'$ must be the state $[x \mapsto 1]$. Now we may finally solve for $\sigma$ by considering the second sub-problem: $y \leftarrow 2,\, \sigma' \Downarrow \sigma$. We know that $\sigma'$ is $[x \mapsto 1]$ and thus the rule for assignment tells us that $\sigma$ is $[x \mapsto 1,\, y \mapsto 2]$. This gives us the final state of the program and the corresponding derivation:
\[\dfrac { \dfrac {} {x \leftarrow 1,\, [] \Downarrow [x \mapsto 1]} \quad \dfrac {} {y \leftarrow 2,\, [x \mapsto 1] \Downarrow [x \mapsto 1,\, y \mapsto 2]} } {x \leftarrow 1; y \leftarrow 2,\, [] \Downarrow [x \mapsto 1,\, y \mapsto 2]}\]
Conditional Rules
If Statement
Let’s now turn to the rules for conditional statements. The behaviour of an if statement naturally depends on the value of its branching condition. If the condition is true, the first statement will be executed; otherwise, the second statement will be executed. Correspondingly, there are two inference rules that cover each case.
\[\begin{array}{cc} \dfrac {S_1,\, \sigma \Downarrow \sigma'} {\mathsf{if}\ e\ \mathsf{then}\ S_1\ \mathsf{else}\ S_2,\, \sigma \Downarrow \sigma'} \llbracket e \rrbracket_B(\sigma) = \top & \dfrac {S_2,\, \sigma \Downarrow \sigma'} {\mathsf{if}\ e\ \mathsf{then}\ S_1\ \mathsf{else}\ S_2,\, \sigma \Downarrow \sigma'} \llbracket e \rrbracket_B(\sigma) = \bot \end{array}\]Written beside each of these inference rules is a condition known as a side condition. The side condition is a caveat to the rule that restricts the instances in which it applies. For example, the first rule should be read as: for any Boolean expression $e$, statements $S_1$ and $S_2$, and states $\sigma$ and $\sigma'$ such that $\llbracket e \rrbracket_B(\sigma) = \top$, if $S_1,\, \sigma \Downarrow \sigma'$ then $\mathsf{if}\ e\ \mathsf{then}\ S_1\ \mathsf{else}\ S_2,\, \sigma \Downarrow \sigma'$. So, for instance, this rule can be used to derive the final state when the branch condition is $x \leq 2$ and the initial state is $[x \mapsto 0]$, since $\llbracket x \leq 2 \rrbracket_B([x \mapsto 0]) = \top$.
Side conditions are treated differently from premises: they constrain the instances to which the rule applies, rather than referring to other instances of the inductively defined relation.
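In the evaluator sketched earlier (again, an illustration rather than part of the notes), the two rules for if statements become a single clause in which the side condition appears as a guard on $\llbracket e \rrbracket_B(\sigma)$; this clause is intended to sit alongside the clauses for skip, assignment, and composition.

```haskell
-- The two if rules: evaluate the side condition [[e]]_B(sigma) and then
-- pursue the premise of whichever rule applies.
bigStep (If e s1 s2) sigma
  | beval e sigma = bigStep s1 sigma   -- side condition [[e]]_B(sigma) = ⊤
  | otherwise     = bigStep s2 sigma   -- side condition [[e]]_B(sigma) = ⊥
```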
While Statement
Finally, let’s look at while statements. As with conditionals, the behaviour of a while statement $\mathsf{while}\ e\ \mathsf{do}\ S$ is characterised by two cases:
-
If the branch condition $e$ is not true, then the statement does nothing.
-
If, on the other hand, the branch condition $e$ is true, then the statement executes the sub-statement $S$ before repeating the process.
Correspondingly, there are two inference rules:
\[\begin{array}{cc} \dfrac {} {\mathsf{while}\ e\ \mathsf{do}\ S,\, \sigma \Downarrow \sigma} \llbracket e \rrbracket_B(\sigma) = \bot & \dfrac {S,\, \sigma \Downarrow \sigma' \quad \mathsf{while}\ e\ \mathsf{do}\ S,\, \sigma' \Downarrow \sigma''} {\mathsf{while}\ e\ \mathsf{do}\ S,\, \sigma \Downarrow \sigma''} \llbracket e \rrbracket_B(\sigma) = \top \end{array}\]The side condition on the first rule ensures that it is only applicable when the branch condition evaluates to false in the initial state. Additionally, this rule does not change the program’s state.
Conversely, the side condition on the second rule ensures that it is only applicable when the branch condition evaluates to true in the initial state. In this case, the body of the $\mathsf{while}$ loop is evaluated to produce a new intermediary state, before the loop itself is re-evaluated. This rule encodes the iterative nature of the $\mathsf{while}$ construct by defining its behaviour in a recursive manner, i.e. if the branch condition is true, the loop behaves as its body followed by the loop itself.
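As a final clause of the illustrative evaluator, the two while rules take the following form. Note that if the branch condition never becomes false, this clause recurses forever, matching the fact that no finite derivation (and hence no final state) exists for a non-terminating loop.

```haskell
-- The two while rules: if the side condition fails, the state is returned
-- unchanged; if it holds, the body runs and the whole loop is evaluated
-- again in the intermediate state.
bigStep (While e s) sigma
  | beval e sigma = let sigma' = bigStep s sigma        -- S, sigma ⇓ sigma'
                    in  bigStep (While e s) sigma'      -- while e do S, sigma' ⇓ sigma''
  | otherwise     = sigma                               -- [[e]]_B(sigma) = ⊥
```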
Computing with Conditional Inference Rules
The new inference rules can be used to compute the behaviour of While programs using the same principle - for a given initial state $\sigma \in \mathsf{State}$ and a program $S \in \mathcal{S}$, find a final state $\sigma' \in \mathsf{State}$ such that $S,\, \sigma \Downarrow \sigma'$. The only difference with the conditional language constructs is that the applicable inference rule depends not just on the structure of the statement but also on the value of some branching condition in the given state.
For example, let’s consider evaluating the program $\mathsf{while}\ x \leq 1\ \mathsf{do}\ x \leftarrow x + 2$ in the initial state $[]$ (i.e. where $x$ is zero). By the form of the program, we know that one of the inference rules for $\mathsf{while}$ must apply. In this case, as $\llbracket x \leq 1 \rrbracket_B([])$ is true, the inference rule in question will be:
\[\dfrac {S,\, \sigma \Downarrow \sigma' \quad \mathsf{while}\ e\ \mathsf{do}\ S,\, \sigma' \Downarrow \sigma''} {\mathsf{while}\ e\ \mathsf{do}\ S,\, \sigma \Downarrow \sigma''} \llbracket e \rrbracket_B(\sigma) = \top\]This tells us that the final state is computed in two steps: first, we must evaluate the statement $x \leftarrow x + 2$ in our initial state $[]$ to compute an intermediary state, then we evaluate the statement $\mathsf{while}\ x \leq 1\ \mathsf{do}\ x \leftarrow x + 2$ in the resulting state. According to the rule for assignment, the intermediary state will be $[x \mapsto 2]$. Thus far we have constructed the following partial derivation:
\[\dfrac { \dfrac {} {x \leftarrow x + 2,\, [] \Downarrow [x \mapsto 2]} \quad \dfrac {\vdots} {\mathsf{while}\ x \leq 1\ \mathsf{do}\ x \leftarrow x + 2,\, [x \mapsto 2] \Downarrow \sigma'} } {\mathsf{while}\ x \leq 1\ \mathsf{do}\ x \leftarrow x + 2,\, [] \Downarrow \sigma'}\]To complete our derivation, we must evaluate the same statement $\mathsf{while}\ x \leq 1\ \mathsf{do}\ x \leftarrow x + 2$ in the new state $[x \mapsto 2]$. In this case, the branch condition is false and therefore the other rule for $\mathsf{while}$ statements applies. Therefore, we may determine the final state to be $[x \mapsto 2]$, and we complete our derivation as follows:
\[\dfrac {\dfrac {} {x \leftarrow x + 2,\, [] \Downarrow [x \mapsto 2]} \quad \dfrac {} {\mathsf{while}\ x \leq 1\ \mathsf{do}\ x \leftarrow x + 2,\, [x \mapsto 2] \Downarrow [x \mapsto 2]} } {\mathsf{while}\ x \leq 1\ \mathsf{do}\ x \leftarrow x + 2,\, [] \Downarrow [x \mapsto 2]}\]
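As a quick sanity check on the sketch assembled over the previous sections, the worked example can be run directly; the encoding of the program below and the use of `Map.toList` for printing are illustration choices.

```haskell
-- while (x <= 1) do x <- x + 2, started from the empty state.
loop :: Stmt
loop = While (Leq (Ref "x") (Lit 1)) (Assign "x" (Add (Ref "x") (Lit 2)))

main :: IO ()
main = print (Map.toList (bigStep loop Map.empty))
-- prints [("x",2)], matching the derivation above
```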