This site is supported by donations to The OEIS Foundation.

# S-expressions

Jump to: navigation, search

This article needs more work.

Please help by expanding it!

S-expression (or sexp, for "symbolic expression") is a data structure introduced by John McCarthy in the programming language Lisp, which is also used in Scheme   and certain other programming languages.

## S-expressions

There are essentially two different syntaxes for representing S-expressions: internal, so-called "dotted-pair notation", which mirrors the way the S-expressions are stored in computer's memory, and external notation, based on nested lists, which is usually used when S-expressions are either input or output from/to the file system or programmer's console. Below, context-free grammars are given for slightly restricted versions of both syntaxes.

First, for the internal "dotted pair"-syntax:

${\begin{array}{rll}{\textit {S-expression}}&\rightarrow &{\mathit {ATOM}}~~\vert ~~{\mathit {PAIR}}\\{\mathit {PAIR}}&\rightarrow &({\textit {S-expression}}~.~{\mathit {PAIR}})~~\vert ~~({\textit {S-expression}}~.~\mathbf {(~)} )\\\end{array}}$ and for the external "nested list"-syntax:

${\begin{array}{rll}{\textit {S-expression}}&\rightarrow &{\mathit {ATOM}}~~\vert ~~({\textit {S-expression}}^{*})\\\end{array}}$ Here ${\mathit {ATOM}}\,$ may stand for any atomic item, e.g. for numbers and symbols, including also an empty list $\mathbf {(~)} \,$ (which is called NIL in Lisp parlance).

The above definitions produce isomorphic structures. The isomorphism between the two variant syntaxes is realised by the function $f\,$ which maps from the dotted pair to list syntax as follows:

$f(s)={\begin{cases}s,&{\mbox{if }}s~{\mbox{is }}{\mathit {ATOM}}~{\mbox{(including}}\mathbf {~(~)} {\mbox{)}};\\(f({\textit {S-expression}})),&{\mbox{if }}s{\mbox{ is of the form }}({\textit {S-expression}}~.~\mathbf {(~)} );\\{\mbox{a list constructed by inserting }}f({\textit {S-expression}})\\{\mbox{to the front of a list obtained with }}f({\mathit {PAIR}}),&{\mbox{if }}s~{\mbox{is of the form }}({\textit {S-expression}}~.~{\mathit {PAIR}}).\\\end{cases}}$ As an example, function $f\,$ maps the internal, dotted S-expression $(\mathbf {(~)} ~.~({\rm {{1}~.~({\rm {{x}~.~({\rm {{2}~.~\mathbf {(~)} ))))\,}}}}}}$ to an external, dotless list $(\mathbf {(~)} ~{\rm {{1}~{\rm {{x}~{\rm {{2})\,}}}}}}$ and the internal, dotted S-expression $({\rm {{a}~.~(({\rm {{b}~.~\mathbf {(~)} )~.~\mathbf {(~)} ))\,}}}}$ to an external, dotless list $({\rm {{a}~({\rm {{b}))\,}}}}$ of two elements.

## S-expressions with only ( )'s occurring as their terminals

If we restrict our attention to that subset of S-expressions where the only allowed atoms (i.e. terminal nodes) are $\mathbf {(~)} \,$ 's, then the above grammars are reduced to:

${\begin{array}{rll}{\textit {NS-expression}}&\rightarrow &\mathbf {(~)} ~~\vert ~~({\textit {NS-expression}}~.~{\textit {NS-expression}})\end{array}}$ for the internal syntax, and

${\begin{array}{rll}{\textit {NS-expression}}&\rightarrow &({\textit {NS-expression}}^{*})\\\end{array}}$ for the external syntax.

The first grammar thus generates plane binary trees while the second grammar generates balanced parenthesizations, which are two different combinatorial interpretations of Catalan numbers.

## Representing functions whose domains are S-expressions as integer sequences

To represent as an integer sequence a function whose domain (and possibly also its range) is a set of all S-expressions, and whose behaviour does not depend on the particular identity of the atomic elements of S-expression, it is sufficient to record how it acts on the subset defined above, which consists of S-expressions having only $\mathbf {(~)} \,$ 's as their terminal nodes. For this, one should fix a total ordering for all such S-expressions (that is, plane binary trees or parenthesizations) so that each can be referred with an unique integer in range $[0,\,\infty )\,$ . The sequence A014486 provides us with a very natural ordering, where such structures are ordered in size-wise and lexicographic order.

Thus, to create an integer sequence from a function whose domain is a set of all S-expressions, and whose range is a set of integers or natural numbers, one applies the function to each S-expression constructed from A014486$(n)\,$ , and records the integer returned by the function as the $n\,$ th term of the sequence. See e.g. A057514, A057515, A126306, A126307 and A127284.

There are also several functions, whose range is also a subset of S-expressions, in which case the corresponding integer sequences record the positions of their results in A014486 after the same mapping.