This site is supported by donations to The OEIS Foundation.

# S-expressions

Jump to: navigation, search

This article needs more work.

Please help by expanding it!

S-expression (or sexp, for "symbolic expression") is a data structure introduced by John McCarthy in the programming language Lisp[1], which is also used in Scheme[2] [3] [4] and certain other programming languages.

## S-expressions

There are essentially two different syntaxes for representing S-expressions: internal, so-called "dotted-pair notation", which mirrors the way the S-expressions are stored in computer's memory, and external notation, based on nested lists, which is usually used when S-expressions are either input or output from/to the file system or programmer's console. Below, context-free grammars are given for slightly restricted versions of both syntaxes.

First, for the internal "dotted pair"-syntax:

${\displaystyle {\begin{array}{rll}{\textit {S-expression}}&\rightarrow &{\mathit {ATOM}}~~\vert ~~{\mathit {PAIR}}\\{\mathit {PAIR}}&\rightarrow &({\textit {S-expression}}~.~{\mathit {PAIR}})~~\vert ~~({\textit {S-expression}}~.~\mathbf {(~)} )\\\end{array}}}$

and for the external "nested list"-syntax:

${\displaystyle {\begin{array}{rll}{\textit {S-expression}}&\rightarrow &{\mathit {ATOM}}~~\vert ~~({\textit {S-expression}}^{*})\\\end{array}}}$

Here ${\displaystyle \scriptstyle {\mathit {ATOM}}\,}$ may stand for any atomic item, e.g. for numbers and symbols, including also an empty list ${\displaystyle \scriptstyle \mathbf {(~)} \,}$ (which is called NIL in Lisp parlance).

The above definitions produce isomorphic structures. The isomorphism between the two variant syntaxes is realised by the function ${\displaystyle \scriptstyle f\,}$ which maps from the dotted pair to list syntax as follows:

${\displaystyle f(s)={\begin{cases}s,&{\mbox{if }}s~{\mbox{is }}{\mathit {ATOM}}~{\mbox{(including}}\mathbf {~(~)} {\mbox{)}};\\(f({\textit {S-expression}})),&{\mbox{if }}s{\mbox{ is of the form }}({\textit {S-expression}}~.~\mathbf {(~)} );\\{\mbox{a list constructed by inserting }}f({\textit {S-expression}})\\{\mbox{to the front of a list obtained with }}f({\mathit {PAIR}}),&{\mbox{if }}s~{\mbox{is of the form }}({\textit {S-expression}}~.~{\mathit {PAIR}}).\\\end{cases}}}$

As an example, function ${\displaystyle \scriptstyle f\,}$ maps the internal, dotted S-expression ${\displaystyle \scriptstyle (\mathbf {(~)} ~.~({\rm {{1}~.~({\rm {{x}~.~({\rm {{2}~.~\mathbf {(~)} ))))\,}}}}}}}$ to an external, dotless list ${\displaystyle \scriptstyle (\mathbf {(~)} ~{\rm {{1}~{\rm {{x}~{\rm {{2})\,}}}}}}}$ and the internal, dotted S-expression ${\displaystyle \scriptstyle ({\rm {{a}~.~(({\rm {{b}~.~\mathbf {(~)} )~.~\mathbf {(~)} ))\,}}}}}$ to an external, dotless list ${\displaystyle \scriptstyle ({\rm {{a}~({\rm {{b}))\,}}}}}$ of two elements.

## S-expressions with only ( )'s occurring as their terminals

If we restrict our attention to that subset of S-expressions where the only allowed atoms (i.e. terminal nodes) are ${\displaystyle \scriptstyle \mathbf {(~)} \,}$'s, then the above grammars are reduced to:

${\displaystyle {\begin{array}{rll}{\textit {NS-expression}}&\rightarrow &\mathbf {(~)} ~~\vert ~~({\textit {NS-expression}}~.~{\textit {NS-expression}})\end{array}}}$

for the internal syntax, and

${\displaystyle {\begin{array}{rll}{\textit {NS-expression}}&\rightarrow &({\textit {NS-expression}}^{*})\\\end{array}}}$

for the external syntax.

The first grammar thus generates plane binary trees while the second grammar generates balanced parenthesizations, which are two different combinatorial interpretations of Catalan numbers.

## Representing functions whose domains are S-expressions as integer sequences

To represent as an integer sequence a function whose domain (and possibly also its range) is a set of all S-expressions, and whose behaviour does not depend on the particular identity of the atomic elements of S-expression, it is sufficient to record how it acts on the subset defined above, which consists of S-expressions having only ${\displaystyle \scriptstyle \mathbf {(~)} \,}$'s as their terminal nodes. For this, one should fix a total ordering for all such S-expressions (that is, plane binary trees or parenthesizations) so that each can be referred with an unique integer in range ${\displaystyle \scriptstyle [0,\,\infty )\,}$. The sequence A014486 provides us with a very natural ordering, where such structures are ordered in size-wise and lexicographic order.

Thus, to create an integer sequence from a function whose domain is a set of all S-expressions, and whose range is a set of integers or natural numbers, one applies the function to each S-expression constructed from A014486${\displaystyle \scriptstyle (n)\,}$, and records the integer returned by the function as the ${\displaystyle \scriptstyle n\,}$th term of the sequence. See e.g. A057514, A057515, A126306, A126307 and A127284.

There are also several functions, whose range is also a subset of S-expressions, in which case the corresponding integer sequences record the positions of their results in A014486 after the same mapping.