\[ \newcommand{\tr}{\Rightarrow} \newcommand{\trs}{\tr^{\!\ast}} \newcommand{\rlnm}[1]{\mathsf{(#1)}} \newcommand{\rred}[1]{\xrightarrow{#1}} \newcommand{\rreds}[1]{\mathrel{\xrightarrow{#1}\!\!^*}} \newcommand{\cl}{\mathsf{Cl}} \newcommand{\pow}{\mathcal{P}} \newcommand{\matches}{\mathrel{\mathsf{matches}}} \newcommand{\kw}[1]{\mathsf{#1}} \newcommand{\andop}{\mathrel{\&\!\&}} \newcommand{\orop}{\mathrel{\|}} \newcommand{\ff}{\mathsf{false}} \newcommand{\tt}{\mathsf{true}} \newcommand{\abra}[1]{\langle #1 \rangle} \newcommand{\bnfnt}[1]{\abra{\small \textsf{#1}}} \newcommand{\llbracket}{[\![} \newcommand{\rrbracket}{]\!]} \newcommand{\first}{\mathsf{First}} \newcommand{\nullable}{\mathsf{Nullable}} \newcommand{\follow}{\mathsf{Follow}} \newcommand{\tm}[1]{\mathsf{#1}} \]

Encoding first-order data

The main way to encode a set $S$ of data as natural numbers is to construct a bijection $S \xrightarrow{\cong} \mathbb{N}$.

We have already seen how to construct a bijection $\beta : \mathbb{Z} \xrightarrow{\cong} \mathbb{N}$, so that any integer may be encoded as a natural number. We will now see how to encode other kinds of first-order data (e.g. numbers, lists, trees, etc.) as natural numbers.

Encoding pairs of numbers

A pairing function is a bijection $\mathbb{N} \times \mathbb{N} \xrightarrow{\cong} \mathbb{N}$.

It is not immediately evident that a pairing function might exist: why should it be possible to put $\mathbb{N}$ and $\mathbb{N} \times \mathbb{N}$ in exact correspondence with each other? Does not the product set $\mathbb{N} \times \mathbb{N}$ have somehow ‘infinitely more’ elements than $\mathbb{N}$?

Nevertheless, a variety of pairing functions exists.

From this point onwards we assume that we have a fixed pairing function

\[\langle -, - \rangle : \mathbb{N} \times \mathbb{N} \xrightarrow{\cong} \mathbb{N}\]

Moreover, we write

\[\textsf{split} : \mathbb{N} \xrightarrow{\cong} \mathbb{N} \times \mathbb{N}\]

for its inverse.

Encoding lists

Once we have a pairing function $\langle -, - \rangle$ and its inverse $\textsf{split}$ we can use them to encode all sorts of other data.

For example, we may encode lists of natural numbers as numbers. The set of all lists of natural numbers, $\mathbb{N}^\ast$, is defined by the following rules.

\[\frac{}{[] \in \mathbb{N}^\ast} \quad \frac{ n \in \mathbb{N} \quad ns \in \mathbb{N}^\ast }{ (n : ns) \in \mathbb{N}^\ast }\]

To encode such a list as a number we define a function $\phi_\ast : \mathbb{N}^\ast \to \mathbb{N}$ by induction:

\[\begin{aligned} \phi_\ast([]) &= 0 \\ \phi_\ast(n:ns) &= 1 + \langle n, \phi_\ast(ns) \rangle \end{aligned}\]

This function is a bijection. They key to writing down the inverse is to notice that all operations used in its definition ($\phi$, $+1$) are invertible. We define $\phi_\ast^{-1} : \mathbb{N} \to \mathbb{N}^\ast$ by

\[\begin{aligned} \phi_\ast^{-1}(n) = \begin{cases} [] & \text{ if $n = 0$} \\ x : \phi_\ast^{-1}(m) & \text{ if $n > 0$ and $(x, m) = \textsf{split}(n - 1)$} \end{cases} \end{aligned}\]

Other algebraic data types (e.g. binary trees) may be encoded as natural numbers by following the same pattern.