In 1971, J.H. Conway introduced the notion of the factors of a language and the factor matrix of a regular language. Shortly afterwards, the author extended Conway's factor theory in several ways. The existence of a unique, minimal starth root of the factor matrix, dubbed the factor graph of the language, was established. It was also proved that the factor matrix of a factor of a language is a submatrix of the factor matrix of the language, and the factor graph of a factor of a language is pathwise homomomorphic to the factor graph of the language. The latter result was used to derive an algorithm for computing the factor matrix of a language from its factor graph with the property that the resulting regular expressions have star-height at most the cycle-rank of the factor graph, and could be strictly smaller. Using the simple device of naming the factor operators, the current paper revisits Conway's and our own work on factor theory in a calculational style. We give explicit constructions of the factor matrix, the factor graph, submatrices of the factor matrix defined by subfactors and pathwise homomorphisms of factor graphs. We also extend factor theory beyond this earlier work. We formulate the theory more abstractly so that we can rigorously justify the use of the syntactic monoid of a language in calculations involving factors. We also present Conway's theory of approximations of regular languages but extended to this more general abstract context. When specialised to regular languages, we prove the existence of a unique minimal approximating function and compare this with Conway's maximal constant+linear approximating function. The closure algorithm we present does not always construct regular expressions of minimal star-height. However, we speculate in the conclusions on how the theory of approximations might be exploited to develop a novel, effective approach to solving the star-height problem.

The theory of factors of a regular language is used to illustrate the unity-of-opposites theorem of Galois connections. Left and right factors of a language are characterised as unions of right- and left-invariant equivalence classes, respectively, and this characterisation is exploited in the construction of the factor graph. The factor graph is a representation of the poset of left factors and, isomorphically by the unity of opposites, the poset of right factors. Two illustrative examples are given, one of which is the so-called failure function used in the Knuth-Morris-Pratt pattern matching algorithm.

The torch problem (also known as the bridge problem or the flashlight problem) is about getting a number of people across a bridge as quickly as possible under certain constraints. Although a very simply stated problem, the solution is surprisingly non-trivial. The case in which there are just four people and the capacity of the bridge is two is a well-known puzzle, widely publicised on the internet. We consider the general problem where the number of people, their individual crossing times and the capacity of the bridge are all input parameters. We present two methods to determine the shortest total crossing time: the first expresses the problem as an integer-programming problem that can be solved by a standard linear-programming package, and the second expresses the problem as a shortest-path problem in an acyclic directed graph, i.e.\ a dynamic-programming solution. The complexity of the integer-programming solution is difficult to predict; its main purpose is to act as an independent test of the correctness of the results returned by the second solution method. The dynamic-programming solution has best- and worst-case time complexity proportional to the square of the number of people. An empirical comparison of the efficiency of both methods is also presented. This manuscript has been accepted for publication in Science of Computer Programming. The manuscript has undergone copyediting, typesetting, and review of the resulting proof before being published in its final form. Please note that during the production process errors may have been discovered which could affect the content, and all disclaimers that apply to the journal apply to this manuscript.

Informally, a first-past-the-post game is a (probabilistic) game where the winner is the person who predicts the event that occurs first among a set of events. Examples of first-past-the-post games include so-called block and hidden patterns and the Penney-Ante game invented by Walter Penney. We formalise the abstract notion of a first-past-the-post game, and the process of extending a probability distribution on symbols of an alphabet to the plays of a game. We establish a number of properties of such games, for example , the property that an incomplete first-past-the-post game is also a first-past-the-post game. Penney-Ante games are multi-player games characterised by a collection of regular, prefix-free languages. Analysis of such games is facilitated by a collection of simultaneous (non-linear) equations in languages. Essentially, the equations are due to Guibas and Odlyzko. However, they did not formulate them as equations in languages but as equations in generating functions detailing lengths of words. For such games, we show how to use the equations in languages to calculate the probability of winning and how to calculate the expected length of a game for a given outcome. We also exploit the properties of first-past-the-post games to show how to calculate the probability of winning in the course of a play of the game. In this way, we avoid the construction of a deterministic finite-state machine or the use of generating functions, the two methods traditionally used for the task. We observe that Aho and Corasick's generalisation of the Knuth-Morris-Pratt pattern-matching algorithm can be used to construct the deterministic finite-state machine that recognises the language underlying a Penney-Ante game. The two methods of calculating the probabilities and expected values, one based on the finite-state machine and the other based on the non-linear equations in languages, have been implemented and verified to yield the same results.

See here for the paper.

One-person solitaire-like games are explored with a view to using them in teaching algorithmic problem solving. The key to understanding solutions to such games is the identification of invariant properties of polynomial arithmetic. We demonstrate this via three case studies: solitaire itself, tiling problems and a novel class of one-person games.

The known classification of states of the game of (peg) solitaire into 16 equivalence classes is used to introduce the relevance of polynomial arithmetic. Then we give a novel algebraic formulation of the solution to a class of tiling problems. Finally, we introduce an infinite class of challenging one-person games, which we call ``replacement-set games'', inspired by earlier work by Chen and Backhouse on the relation between cyclotomic polynomials and generalisations of the seven-trees-in-one type isomorphism. We present an algorithm to solve arbitrary instances of replacement-set games and we show various ways of constructing infinite (solvable) classes of replacement-set games.

See here for the paper.

We show a new and constructive proof of the two-squares theorem, based on a somewhat unusual, but very effective, way of rewriting the so-called extended Euclid's algorithm. Rather than simply verifying the result -- as it is usually done in the mathematical community --- we use Euclid's algorithm as an interface to investigate which numbers can be written as sums of two positive squares. The precise formulation of the problem as an algorithmic problem is the key, since it allows us to use algorithmic techniques and to avoid guessing. The notion of invariance, in particular, plays a central role in our development: it is used initially to observe that Euclid's algorithm can actually be used to represent a given number as a sum of two positive squares, and then it is used throughout the argument to prove other relevant properties. We also show how the use of program inversion techniques can make mathematical arguments more precise.

See here for the paper.

Algorithms can be used to prove and to discover new theorems. This paper shows how algorithmic skills in general, and the notion of invariance in particular, can be used to derive many results from Euclid's algorithm. We illustrate how to use the algorithm as a verification interface (i.e., how to verify theorems) and as a construction interface (i.e., how to investigate and derive new theorems).

The theorems that we verify are well-known and most of them are included in standard number-theory books. The new results concern distributivity properties of the greatest common divisor and a new algorithm for efficiently enumerating the positive rationals in two different ways. One way is known and is due to Moshe Newman. The second is new and corresponds to a deforestation of the Stern-Brocot tree of rationals. We show that both enumerations stem from the same simple algorithm. In this way, we construct a Stern-Brocot enumeration algorithm with the same time and space complexity as Newman's algorithm. A short review of the original papers by Stern and Brocot is also included.

See here for the paper.

Algorithmic problem solving provides a radically new way of approaching and solving problems in general by using the advances that have been made in the basic principles of correct-by-construction algorithm design. The aim of this thesis is to provide educational material that shows how these advances can be used to support the teaching of mathematics and computing.

We rewrite material on elementary number theory and we show how the focus on the algorithmic content of the theory allows the systematisation of existing proofs and, more importantly, the construction of new knowledge in a practical and elegant way. For example, based on Euclid's algorithm, we derive a new and efficient algorithm to enumerate the positive rational numbers in two different ways, and we develop a new and constructive proof of the two-squares theorem.

Because the teaching of any subject can only be effective if the teacher has access to abundant and sufficiently varied educational material, we also include a catalogue of teaching scenarios. Teaching scenarios are fully worked out solutions to algorithmic problems together with detailed guidelines on the principles captured by the problem, how the problem is tackled, and how it is solved. Most of the scenarios have a recreational flavour i and are designed to promote self-discovery by the students.

Based on the material developed, we are convinced that goal-oriented, calculational algorithmic skills can be used to enrich and reinvigorate the teaching of mathematics and computing.

Puzzles and games have been used for centuries to nurture problem-solving skills. Although often presented as isolated brain-teasers, the desire to know how to win makes games ideal examples for teaching algorithmic problem solving. With this in mind, this paper explores one-person solitaire-like games.

The key to understanding solutions to solitaire-like games is the identification of invariant properties of polynomial arithmetic. We demonstrate this via three case studies: solitaire itself, tiling problems and a collection of novel one-person games. The known classification of states of the game of (peg) solitaire into $16$ equivalence classes is used to introduce the relevance of polynomial arithmetic. Then we give a novel algebraic formulation of the solution to a class of tiling problems. Finally, we introduce an infinite class of challenging one-person games inspired by earlier work by Chen and Backhouse on the relation between cyclotomic polynomials and generalisations of the seven-trees-in-one type isomorphism. We show how to derive algorithms to solve these games.

See here for the paper.

MathIS is a new project that aims to reinvigorate secondary-school mathematics by exploiting insights of the dynamics of algorithmic problem solving. This paper describes the main ideas that underpin the project. In summary, we propose a central role for formal logic, the development of a calculational style of reasoning, the emphasis on the algorithmic nature of mathematics, and the promotion of self-discovery by the students. These ideas are discussed and the case is made, through a number of examples that show the teaching style that we want to introduce, for their relevance in shaping mathematics training for the years to come. In our opinion, the education of software engineers that work effectively with formal methods and mathematical abstractions should start before university and would benefit from the ideas discussed here.

See here for the paper.

Informally, a first-past-the-post game is a (probabilistic) game where the winner is the person who predicts the event that occurs first among a set of events. Examples of first-past-the-post games include so-called block and hidden patterns and the Penney-Ante game invented by Walter Penney. We formalise the abstract notion of a first-past-the-post game, and the process of extending a probability distribution on symbols of an alphabet to the plays of a game.

Analysis of first-past-the-post games depends on a collection of simultaneous (non-linear) equations in languages. Essentially, the equations are due to Guibas and Odlyzko but they did not formulate them as equations in languages but as equations in generating functions detailing lengths of words.

Penney-Ante games are two-player games characterised by a collection of regular, prefix-free languages. For such two-player games, we show how to use the equations in languages to calculate the probability of winning. The formula generalises a formula due to John H.\ Conway for the original Penney-Ante game. At no point in our analysis do we use generating functions. Even so, we are able to calculate probabilities and expected values. Generating functions do appear to become necessary when higher-order cumulatives (for example, the standard deviation) are also required.

See here for the paper.

The ``fan'' of a datatype F is a relation that holds between a value x and an arbitrary F structure in which the only stored value is x. Fans are used to make precise the notion of the shape of a data structure.

For arbitrary datatypes F, G and H, we consider six different ways of composing their fans in order to construct F structures of G structures of H structures; each of the six imposes a different requirement on the shape of the substructures. We catalogue the relation between different combinations of the constructi ons. We apply the result to the shape properties of a natural transformation (aka polymorphic relation) from G structures to FG structures.

The torch problem (also known as the bridge problem or the flashlight problem) is about getting a number of people across a bridge as quickly as possible under certain constraints. Although a very simply stated problem, the solution is surprisingly non-trivial. The case in which there are just four people and the capacity of the bridge is two is a well-known puzzle, widely publicised on the internet. We consider the general problem where the number of people, their individual crossing times and the capacity of the bridge are all input parameters. We present an algorithm that determines the shortest total crossing time; the number of primitive computations executed by the algorithm (i.e. the worst-case time complexity of the algorithm) is proportional to the square of the number of people.

Note added 14/02/2012. The statement of theorem 3 in the published conference paper is incorrect. It has been corrected in this version of the paper. Thanks to Ole Rummel for pointing out the error.

See here for the paper.

Datatype-generic programs are programs that are parameterised by a datatype. We review the allegorical foundations of a methodology of designing datatype-generic programs. The notion of F-reductivity, where F parametrises a datatype, is reviewed and a number of its properties are presented. The properties are used to give concise, effective proofs of termination of a number of datatype-generic programming schemas. The paper concludes with a concise proof of the well-foundedness of a datatype-generic occurs-in relation.

See here for the paper.

We derive an algorithm that enables the rationals to be enumerated in two different ways. One way is known, and is called Calkin-Wilf-Newman enumeration; the second is new and corresponds to a flattening of the Stern-Brocot tree of rationals. We show that both enumerations stem from the same simple algorithm. In this way, we construct a Stern-Brocot enumeration algorithm with the same time and space complexity as Calkin-Wilf-Newman enumeration.

See here for the paper.

Datatype-generic programs are programs that are parameterised by a datatype. Designing datatype-generic programs brings new challenges and new opportunities. We review the allegorical foundations of a methodology of designing datatype-generic programs. The effectiveness of the methodology is demonstrated by an extraordinarily concise proof of the well-foundedness of a datatype-generic occurs-in relation.

See here for the paper.

The Eindhoven quantifier notation is systematic, unlike standard mathematicial notation. This has the major advantage that calculations with quantified expressions become more straightforward because the calculational rules need be given just once for a great variety of different quantifiers.

We demonstrate the ease of calculation with finite quantifications by considering a number of examples. Two are simple warm-up exercises, using boolean equality as the quantifier. Three are taken from books of challenging mathematical problems, and one is a problem concocted by the authors to demonstrate the techniques.

See here for the paper.

Conway (Regular algebra and Finite Machines, Chapman and Hall, 1971) introduced the notion of the factor matrix of a regular language. This document introduces the notion of the factor graph (the transitive reduction of the factor matrix) and shows how it is constructed from the machine, anti-machine and semigroup of a language.

The factors and factor matrix of a language are defined and their properties stated. It is shown that the factor matrix of a language Q has a unique starth root --- called the factor graph of Q. The Knuth, Morris, Pratt pattern-matching algorithm, its extensions and Weiner's substring identifier algorithm are all shown to correspond to finding the factor graph of some regular language.

We develop the theory of two-person impartial games in a calculational style making clear and precise all the logical principles involved.

In an earlier paper, one of the authors presented an algebra for formulating and solving extremal path problems. There are striking similarities between that algebra and the algebra of regular languages, which lead one to consider whether the previous results can be generalized --for instance to path enumeration problems-- and whether the algebra of regular languages can itself be profitably used for the general study of path-finding problems. This paper gives affirmative answers to both these questions.

Many functions on context-free languages can be expressed in the form of theleast fixed point of a function whose definition mimics the grammar of the given language. Examples include the function returning the length of the shortest word in a language, and the function returning the smallest number of edit operations required to transform a given word into a word in a language.

This paper presents the basic theory that explains when a function on a context-free language can be defined in this way. It is shown how the theory can be applied in a methodology for programming the evaluation of such functions.

Specific results include a novel definition of a regular algebra focussing on the existence of so-called ``factors'', and several constructions of non-trivial regular algebras. Several challenging problems are given as examples, some of which are old and some of which are new.

**Erratum** Page 75. Interchange c and d on the right side of the third displayed equation.

See here for the paper.

We use fixed-point calculus to characterise winning strategies in impartial, two-person games. A byproduct is the fixed-point characterisation of winning, losing and stalemate positions. We expect the results to be most useful in teaching calculational reasoning about least and greatest fixed points.

See here for the paper.

Algebraic properties of logical relations on partially ordered sets are studied. It is shown how to construct a logical relation that extends a collection of base Galois connections to a Galois connection of arbitrary higher-order type. ``Theorems-for-free'' is used to show that the construction ensures safe abstract interpretation of parametrically polymorphic functions. The properties are used to show how abstract interpretations of program libraries can be constructed.

Generic programming adds a new dimension to the parametrisation of programs by allowing programs to be dependent on the structure of the data that they manipulate. Apart from the practical advantages of improved productivity that this offers, a major potential advantage is a substantial reduction in the burden of proof -- by making programs more general we can also make them more robust and more reliable. These lectures discuss a theory of datatypes based on the algebra of relations which forms a basis for understanding datatype-generic programs. We review the notion of parametric polymorphism much exploited in conventional functional programming languages and show how it is extended to the higher-order notions of polymorphism relevant to generic programming.

See here for the paper.

Algebraic properties of logical relations on partially ordered sets are studied. It is shown how to construct a logical relation that extends a collection of base Galois connections to a Galois connection of arbitrary higher-order type. ``Theorems-for-free'' is used to show that the construction ensures safe abstract interpretation of parametrically polymorphic functions.

See here for the paper.

The modern world is highly dependent on the reliable functioning of computer software. The sheer scale of software systems makes their design and implementation a highly demanding intellectual activity. Meeting these demands has inspired a revolution in the way that mathematics, the art of effective reasoning, is conducted and presented. Continued effort is needed in education and research in the mathematical construction of programs, based on the controlled manipulation of mathematical formulae.

A discipline for the design of recursive programs is presented. The core concept of the discipline is a new induction principle. The principle is based on a property of relations, called reductivity, that generalises the property of admitting induction to one relative to a given datatype. The principle is used to characterise a broad class of recursive equations that have a unique solution and is also related to standard techniques for proving termination of programs. Methods based on the structure of the given datatype for constructing reductive relations are discussed.

A new implementation of Hoare's Find program is developed. The program determines the Kth smallest element in an (unordered) array of values and how often it occurs.

We argue that Find provides an excellent example of the design of an algorithm that is non-trivial, correct by construction, and yet within the scope of an introductory module on (formal) program design.

Many functions on context-free languages can be expressed in the form of the least fixed point of a function whose definition mimics the grammar of the given language. This paper presents the basic theory that explains when a function on a context-free language can be defined in this way. The contributions are: a novel definition of a regular algebra capturing division properties, several theorems showing how complex regular algebras are built from simpler ones, and the application of fixed point theory and Galois connections to practical programming problems.

See here for the paper.

Dijkstra and Scholten have argued that greater use should be made of the
associativity of equivalence. This note shows how the property is used in specifying the
rotation of the disks in the well-known Towers of Hanoi problem.

Keywords: propositional calculus, program proof.

Fixed point calculus is about the solution of recursive equations defined by a monotonic endofunction on a partially ordered set. This tutorial presents the basic theory of fixed point calculus together with a number of applications of direct relevance to the construction of computer programs. The tutorial also presents the theory and application of Galois connections between partially ordered sets. In particular, the intimate relation between Galois connections and fixed point equations is amply demonstrated.

A shortened version of this paper appears in the published proceedings of the Summer School Algebraic and Coalgebraic Methods in the Mathematics of Program Construction. LNCS 2297, 2002, pp. 89--148. © Springer-Verlag.

Fixed point calculus is about the solution of recursive equations defined by a monotonic endofunction on a partially ordered set. This tutorial discusses applications of fixed point calculus in the construction of computer programs, beginning with standard applications and progressing to recent research.

The basic properties of least and greatest fixed points are presented. Well-foundedness and inductive properties of relations are expressed in terms of fixed points. A class of fixed point equations, called ``hylo'' equations, is introduced. A methodology of recursive program design based on the use of hylo equations is presented. Current research on generalisations of well-foundedness and inductive properties of relations, making these properties relative to a datatype, is introduced.

See here for the paper.

Proceedings of the Summer School Algebraic and Coalgebraic Methods in the Mathematics of Program Construction. LNCS 2297, 2002, pp. 89--148. © Springer-Verlag.

Well-foundedness and inductive properties of relations are expressed in terms of fixed points. A class of fixed point equations, called ``hylo'' equations, is introduced. A methodology of recursive program design based on the use of hylo equations is presented. Current research on generalisations of well-foundedness and inductive properties of relations, making these properties relative to a datatype, is introduced.

See here for the paper.

MathSpad is a document preparation system designed and developed by the authors and oriented towards the calculational construction of programs . PVS (Prototype Verification System) is a theorem checker developed at SRI that has been extensively used for verifying software, in particular in safety-critical applications. This paper describes how these two systems have been combined into one. We discuss the potential benefits of the combination seen from the viewpoint of someone wanting to use formal methods for the construction of computer programs, and we discuss the architecture of the combined system for the benefit of anyone wanting to investigate combining the MathSpad system with other programming tools.

(From the introduction)

Inductive datatypes, datatypes
where elements of the type occur as ``subcomponents'' of other elements of the type, are an
essential feature of all modern programming languages. Commonly used examples of such types are
for example binary trees where, a tree can have other binary trees as subtrees, or cons-lists, where the
tail of a cons-list is another cons-list. A standard mathematical method for reasoning about such
datatypes and programs operating with these types was developed by Malcolm.
He constructed
an elegant generic theory of free inductive datatypes using category theory based on the concepts
of functors and initial algebras. By generic we mean parameterised by the shape of the datatype.

A limitation of this theory is that it only deals with free datatypes, types without rules specifying equality
of elements or restrictions on the construction of elements. In practice there are many common datatypes
that are not free. For example, join-lists have associativity laws on the join operation, and height-balanced
trees can not be constructed using arbitrary subtrees. Fokkinga extended Malcolm's theory
to datatypes with laws, but was not able to handle restrictions on the construction of elements (subtyping).

Other, set-theoretical, theories about inductive datatypes can handle both laws and subtyping but
have as disadvantage that they treat laws and subtyping as dual concepts. This complicates reasoning
about datatypes that combine both laws and subtyping. An example of a type combining both concepts
is the AVL-tree, where different trees can be used to represent the same set of values (law), but where
it is not allowed to join two arbitrary AVL-trees to construct a new valid AVL tree (restriction).

The goal of this thesis is to develop a theory about inductive datatypes that can handle laws and
subtyping in a uniform way. The theory should predict when (recursively defined) operations are
well-defined and when they are uniquely defined. The theory should also provide a sound basis
for the construction and verification of generic programs.

The theory of inductive datatypes presented in this thesis was inspired
by the category-theoretical approach but uses a point-free relational calculus to model both datatypes
and programs. One of the
main advantages of using the relational calculus is that it opens up the possibility of working with
lattices where
extreme solutions to equations are uniquely defined. Category theory always gives solutions
``up to isomorphism'' that are often less suitable for direct manipulation. The extreme solutions of
lattice equations provide unique, canonical representations of the concepts that are being modelled.
Datatypes and programs are usually specified as solutions to equations

Another advantage of the lattice structures that are available when working with relations is the
abundant possibility for using Galois connections. Identifying Galois connections and using their
calculational properties is a recurring theme throughout the thesis. We prefer a calculational
style for constructing and presenting proofs and Galois connections are a great tool for this purpose.

We identify a special class of relations that can be used as representatives for datatypes. These
datatypes are the elements of a complete lattice where the ordering represents (the combination of)
subtyping and quotient formation. Combining these aspects in a single ordering allows us to
find solutions for specifications involving both restrictions (subtyping) and laws (quotients). Combining
these features is often difficult in other formalisms for datatypes. This ordering is a vital tool for
achieving our goal of a uniform treatment of laws and subtyping.

Our datatype construction methods are inspired by categorical datatype theories and we will construct
a category where objects and arrows are relations. Categorical notions like functors, natural transformations
and *F*-algebras lead to relational constructions that are useful for the construction of datatypes and
programs.

A variant of *F*-algebras is used for the introduction of inductive datatypes and structural recursion.
An important aspect of datatype construction is simulation, implementing one datatype using
another datatype. The notion of simulation can easily be formulated in our theory.

Inductive types that simulate each other form equivalence classes. We prove the remarkable result
that every equivalence class contains one special representative.The special representatives form a complete lattice, using our special ordering of datatypes. The elements
of the lattice represent all inductively defined datatypes for a given induction structure. Using this lattice,
we can describe inductive datatypes with both laws and restrictions as
an extreme fixpoint. We will give an equivalent characterization of the extreme fixpoint using a Galois
connection. This Galois connection, which defines a closure operation, turns out to be very convenient
for proving properties of inductive datatypes.

Laws and restrictions can be specified with equations, which can be combined
to a single specification of the datatype. Not only are datatypes described as solutions of equations, but
recursively defined operations on these inductive datatypes are also specified as solutions of equations.
We will show that a large class of ``recursion structure'' equations for operations on inductive datatypes
have at most one solution, so they are suitable as a specification.

Another subject investigated in this thesis is conditions under which parameterisation of inductive datatypes
with laws and restrictions is possible. Here we demonstrate that, if the law and restriction equations satisfy
certain naturality (``polymorphy'') criteria, parameterisation is possible.

The study of inductive and coinductive types (like finite lists and streams, respectively) is usually conducted within the framework of category theory, which to all intents and purposes is a theory of sets and functions between sets. Allegory theory, an extension of category theory due to Freyd, is better suited to modelling relations between sets as opposed to functions between sets. The question thus arises of how to extend the standard categorical results on the existence of final objects in categories (for example, coalgebras and products) to their existence in allegories. The motivation is to streamline current work on generic programming, in which the use of a relational theory rather than a functional theory has proved to be desirable.

In this paper, we define the notion of a relational final dialgebra and prove, for an important class of dialgebras, that a relational final dialgebra exists in an allegory if and only a final dialgebra exists in the underlying category of maps. Instances subsumed by the class we consider include coalgebras and products. An important lemma expresses bisimulations in allegorical terms and proves this equivalent to Aczel and Mendler's categorical definition.

Lecture notes for an advanced course on functional programming focusing
on generic (polytypic) programming. Contents as follows:

Introduction.

Algebras, Functors and Datatypes,

PolyP

Generic Unification

From Functions to Relations

Solutions to Exercises

Bibliography

Glossary of Symbols

Index

Often, theorem checkers like PVS are used to check an existing proof, which is part of some document. Since there is a large difference between the notations used in the documents and the notations used in the theorem checkers, it is usually a laborious task to convert an existing proof into a format which can be checked by a machine. In the system that we propose, the author is assisted in the process of converting an existing proof into the PVS language and having it checked by PVS.

Most studies of Galois
connections begin with a *function* and ask the question: when is there a second function
that is connected to the first? In possibly the very first application of Galois connections
directly related to the modern digital computer, Hartmanis and Stearns posed a subtly
different question, namely: when
does a *relation* define two functions that are Galois connected? Such a relation they
called a ``pair algebra''.
In this paper we derive a general, necessary and
sufficient condition for a relation between complete lattices to define a Galois connection
between the lattices. We also give several examples of pair algebras illustrating why this
seemingly forgotten notion is relevant to the science of computing.

Kropf has collected together a number of problems that can
be used as benchmarks for hardware verification, one of which is a bus arbiter.
The function of the bus arbiter is to assign the use of
a shared resource on each clock cycle to one out of *N* subsystems that may want to use it,
and in such a way
that no subsystem is denied access forever. The most significant component in its
implementation is a round-robin scheduler.
Rather than *verify* the
existing implementation of the scheduler,
this paper shows how to *construct* a correct implementation from the given
requirements. We make use of
both point-free and pointwise relation algebra.

This work is about the calculational construction of circuits. By
``construction'' we mean obtaining a design by means of formal
manipulation of a formal specification. ``Calculational'' refers to the
style which we employ, that is to try to reduce all program derivations to
straightforward calculation.

In other words, in this thesis we aim to develop a style of calculus
that aids the designer in the derivation (and the presentation) of
circuits.

We start from an established notation called *Ruby*, developed by
Sheeran and Jones, based on the algebra of binary relations. We
then develop our own style of Ruby, through a number of derivation
case studies.

The main results in this thesis concerns the derivation of regular
language recognizing circuits. We show how an existing design, due to
Foster and Kung can be derived and explained; we point out a
limitation in Foster and Kung's design and show how it can be
partially overcome.

Other case studies we present are a new derivation of a circuit that solves the
carre problem, and a derivation of a round-robin scheduler.

We then show how our designs can be implemented in practice. First we
show how to simulate the circuits using Hutton's Ruby
simulator; then we go all the way to implement a
compiler from Ruby circuits to a CSP-like notation called Tangram.

Additional tool support is provided by proof checkers. Even though
our method is supposed to be mainly used with pencil and paper, it is
possible and sometimes useful to check the proofs by machine. We use
PVS to verify part of our theory and one of the case studies.

\mplogo{} is a structure editor designed to facilitate the writing of documents of a mathematical nature. The principal element of \mplogo{} is a system for defining and using ``stencils'' which define the visual and logical structure of notational elements in a document.

This paper overviews the main features of the editor from a user's viewpoint. Particular attention is given to a discussion of the priorities that guided the system's design --- (on screen) readability, flexibility, writability and the integration of prose with mathematical calculation.

*Polytypic* programs are programs that are parameterised by type constructors (like
*List*), unlike *polymorphic* programs which are parameterised by types (like *Int*).
In this paper we formulate precisely the polytypic
programming problem of ``commuting'' two datatypes. The precise
formulation involves a novel notion of higher order polymorphism. We demonstrate
via a number of examples the relevance and interest of the problem, and we show that
all ``regular datatypes'' (the sort of datatypes that one can define in a functional
programming language) do indeed commute according to our specification. The
framework we use is the theory of allegories, a combination of category theory with the
point-free relation calculus.

No abstract available

See here for the paper.

We present a derivation of a regular language recognizing circuit originally developed by Foster and Kung. We make use of point-free relation algebra, in a style combining elements from earlier work in the Eindhoven Mathematics of Program Construction group, and from Ruby. First we derive non-systolic recognizers, much in the same way as functional programs are derived. Then we make use of standard circuit transformation techniques, recast in the relation algebra framework, to obtain circuits that are very close to the ones presented by Foster and Kung

The notion of reductivity of a relation with respect to a datatype is introduced and related to inductivity and initiality. The use of reductivity in constructing terminating programs is discussed. A calculus of reductivity, discussed in more detail in a companion paper, is introduced.

According to the ``proofs as programs'' paradigm of constructive type theory, programs can be derived as a by-product of the process of proving statements about type structures. Inspired by this paradigm, we show in this document how many theorems in category theory can be viewed as ``constructive'' generalisations of theorems in lattice theory, the added value of category theory being that the constructions are ``coherent''. Several applications are presented concerning the construction of isomorphisms between data structures as a by-product of proofs of equality of two languages. The basis for these applications is a categorical fixed point calculus generalising fixed point calculus in lattice theory. In addition several (to our knowledge) novel results in category theory are derived.

A brief outline of each chapter follows:

1. Basic Definitions. An introduction to category theory aimed at making the document self-contained and accessible to those with no knowledge of category theory.

2. Illustrations. Two illustrations of the idea of ``category theory as coherently constructive lattice theory''. The second is ``Yoneda's lemma'': we show how to understand this lemma as a generalisation of an elementary characterisation of the notion of monotonic function in lattice theory.

3. Adjunctions. The categorical notion of an adjunction is introduced as the ``coherently constructive'' counterpart of the notion of a Galois connection. Several (to our knowledge) novel results on adjunctions are derived.

4. Algebras. The notion of an algebra is introduced as the categorical counterpart to the notion of a prefix point of a monotonic function. The map functor is introduced.

5. Fixed Point Calculus. Categorical fixed point rules are derived generalising the rules of the fixed point calculus (rolling rule, diagonal rule, fusion rule etc.)

6. Monads. Properties of closure operators are generalised to monads. In particular the monad decomposition theorem is presented.

7. Applications. Applications of the former to the derivation of isomorphims between data structures as the by-product of proofs of equality of languages.

A program derivation is said to be *polytypic* if some of
its parameters are data types.
Polytypic program derivations necessitate a general, non-inductive
definition of `data type'. Here we propose such a definition:
a data type is a relator that has membership. It is shown how
this definition implies various other properties that are shared
by all data types. In particular, all data types have a unique
strength, and all natural transformations between data types are
strong.

This work is a contribution to the unification of the imperative and functional styles of programming. Specifications and data types are modelled as relations, and recursive programs as equations in relations. This modelling gives the possibility to develop a style of programming in which programs are constructed from specifications in a calculational way, using the calculus of relations. The programming style incorporates both the fold-unfold techniques of Burstall and Darlington and the refinement calculus of Back and Morgan.

To be able to prove properties of programs and data types, concise formulations of ``well-foundedness'' and ``admitting induction'' are given. Because the formulations are compact and are given in terms of relations, properties of programs can be proved in a compact, yet precise, way and in a purely calculational style. The power of the developed techniques is demonstrated by proving various non-trivial theorems.

The property ``well-foundedness'' of relations is generalised by explicitly introducing a data type in the definition. The generalised notion is called ``reductivity'': well-founded with respect to a data type. Reductivity is used to formalise the concept of terminating program.

A calculus of reductive relations is developed. This calculus can be used to construct recursive programs that are guaranteed to terminate. Because reductivity mentions the structure of the data on which the program operates, the influence of data structure on program structure can be made precise.

MathSpad is a structure editor designed to facilitate the integration of doing mathematics and writing about it. The principal element of MathSpad is a system for defining and using ``stencils'' which define the visual and logical structure of notational elements in a document. Structure-editing features enable the use of MathSpad to perform symbolic calculations, in non-standard algebras, directly on-screen.

This paper discusses the considerations that influenced the design of MathSpad. Particular attention is given to a discussion of the priorities that guided the system's design. A brief review of the stencil mechanism is presented.

A number of lattice-theoretic fixed point rules are generalised to category theory and applied to the construction of isomorphisms between list structures.

A new induction principle is introduced. The principle is based on a property of relations, called reductivity, that generalises the property of admitting induction to one relative to a given datatype. The principle is used to characterise a broad class of recursive equations that have a unique solution and is also related to standard techniques for proving termination of programs. Methods based on the structure of the given datatype for constructing reductive relations are discussed. The principle is formulated in the point-free calculus of relations guaranteeing concision without loss of precision and thus ensuring its amenability to calculation.

No abstract.

The research areas of database programming languages and formal program development have a lot to offer to each other. In particular, both areas share a great interest in data types -- in more theoretical approaches, both have reached into category theory, with its ``functor=datatype'' motto. So-called bulk- or collection types have been investigated in both research communities.

This paper sketches an approach to calculi for databases which provide multiple data types. The setup has a categorical flavour, but no more than some intuitive understanding of a few basic concepts of category theory is expected of the reader.

The first layer of the calculus consists of operations on basic values that correspond to the traditional tuples of relational databases. These operations are provided by a category with special (viz. associative and semi-commutative) products. The labels of the tuples are provided by the categorical typing.

In the second layer, these operations are lifted to operations on sets, resulting in the traditional relational algebra operators. The collection of operators that allows us to do that turns out to represent an algebra with sets as ``first-class citizens'', the nested relational algebra. The resulting algebra is, if we abstract from sets, one of strong monads with some additional requirements.

The final section discusses the possibilities of combining and interfacing several such monads.

Dijkstra and Scholten have formulated a theorem stating that all disjunctivity properties of a predicate transformer are preserved by the construction of least prefix points. An alternative proof of their theorem is presented based on two fundamental fixed point theorems, the abstraction theorem and the fusion theorem, and the fact that suprema in a lattice are defined by a Galois connection. The abstraction theorem seems to be new; the fusion theorem is known but its importance does not seem to be fully recognised.

The abstraction theorem, the fusion theorem, and Dijkstra and Scholten's theorem are then generalised to the context of category theory and shown to be valid. None of the theorems in this context seems to be known, although specific instances of Dijkstra and Scholten's theorem are known.

The main point of the paper is to discuss the process of drawing inspiration from lattice theory to formulate theorems in category theory (first advocated by Lambek in 1968). We advance the view that, in order to contribute to the development of programming methodology, category theory may be profitably regarded as ``constructive'' lattice theory in which the added value lies in establishing that the constructions are ``coherent''.

This paper was specially prepared for presentation at the meeting of IFIP Working Group 2.3 (Programming Methodology), June 1994. Knowledge of (elementary) lattice theory is assumed. Knowledge of category theory is not.

Several concise formulations of mathematical induction are presented and proven equivalent. The formulations are expressed in variable-free relation algebra and thus are in terms of relations only, without mentioning the related objects. It is shown that the induction principle in this form, when combined with the explicit use of Galois connections, lends itself very well for use in calculational proofs. Two non-trivial examples are presented. The first is a proof of a Newman's lemma. The second is a calculation of a condition under which the union of two well-founded relations is well-founded. In both cases the calculations lead to generalisations of the known results. In the case of the latter example, one lemma generalises three different conditions.

A calculational derivation is given of two abstract path algorithms. The first is an all-pairs algorithm, two well-known instances of which are Warshall's (reachability) algorithm and Floyd's shortest-path algorithm; instances of the second are Dijkstra's shortest-path algorithm and breadth-first/depth-first search of a directed graph. The basis for the derivations is the algebra of regular languages.

A calculational derivation is given of an abstract path algorithm, one instance of the algorithm being Dijkstra's shortest-path algorithm, another being breadth-first/depth-first search of a directed graph. The basis for the derivation is the algebra of regular languages.

See here for the paper.

The ``Boom hierarchy'' is a hierarchy of types that begins at the level of trees and includes lists, bags and sets. This hierarchy forms the basis for the calculus of total functions developed by Bird and Meertens, and which has become known as the ``Bird-Meertens formalism''.

This paper describes a hierarchy of types that logically precedes the Boom hierarchy. We show how the basic operators of the Bird-Meertens formalism (map, reduce and filter) can be introduced in a logical sequence by beginning with a very simple structure and successively refining that structure.

The context of this work is a relational theory of datatypes, rather than a calculus of total functions. Elements of the theory necessary to the later discussion are summarised at the beginning of the paper.

We describe a framework for the derivation of programs for arbitrary (in
particular, parallel) architectures, motivated by a generalization of
the derivation process for sequential algorithms. The central concept
in this approach is that of a *skeleton*: on the one hand, a
higher-order function for targeting transformational derivations at, on the
other hand representing an elementary computation on the architecture
aimed at. Skeletons thus form a basis for intermediate languages, that can be
implemented once and for all, as a process separate from individual program
developments. The available knowledge on the derivation of (higher-order)
functional programs can be used for deriving parallel ones.

In this paper we demonstrate that the basic rules and calculational techniques used in two extensively documented program derivation methods can be expressed, and, indeed, can be generalised within a relational theory of datatypes. The two methods to which we refer are the so-called ``Bird-Meertens formalism'' and the ``Dijkstra-Feijen calculus''.

The current paper forms an abridged, though representative, version of a complete account of the algebraic properties of the Boom hierarchy of types. Missing is an account of extensionality and the so-called cross-product.

See here for the paper.

This paper makes a start on a research programme the goal of which is to reformulate theorems in category theory as constructive theorems in lattice theory. It is argued that monads should be formulated as constructive closure operators. In order to support this thesis it is shown how, by adapting the relevant theorem in lattice theory, every unary relator defines a monad. The minimality of the constructed monad is captured by a monad simulation theorem, which is then used to establish a monad decomposition theorem by adapting proofs of the closure decomposition theorem in lattice theory.

This paper reports ongoing research into a theory of datatypes based on
the calculus of relations. A fundamental concept introduced here is the notion
of ``relator'' which is an adaption of the categorical notion of functor.
Axiomatisations of polynomial relators (that is relators built from the unit type
and the disjoint sum and cartesian product relators) are given, following which
the general class of initial datatypes is studied. Among the topics discussed
are natural polymorphism, junctivity and continuity properties.

The current paper is an incomplete draft and will be supplemented at
later dates.

The zip function is well known to functional programmers.
Informally, zip maps two lists of the same length into a
single list of pairs whereby ( [*a*_{1}, *b*_{1}, ... ] , [*a*_{2},
*b*_{2}, ... ] ) --> [ ( *a*_{1}, *a*_{2}) , (*b*_{1},
*b*_{2}), ... ].
Zip is just one of a whole family of functions that *commute* the
order of two data structures. Whilst zip commutes a pair of lists into a list
of pairs, it is not difficult to imagine generalising zip to a function that
commutes *m* lists each of length *n* into *n* lists each of length *m*.
Indeed, this latter function is also well known under the name *matrix
transposition*. With a little bit more imagination several other members of
the family suggest themselves. For example, there is a function that
commutes a tree of lists all of the same length into a list of trees all of
the same shape. There is also a function that commutes a rose of trees all
of the same shape into a tree of roses all of the same shape.

This paper considers the construction of such functions. We specify the class of so-called ``zippable'' data-structures and consider in detail a large and significant subclass of this class. For this subclass we show how to construct ``zips'' and identify several consequences of their being ``zippable''.

With relational transformational programming in mind, an extension of a ``lawless'' relational theory of datatypes is proposed in order to study and manipulate quotient types within a Tarski-like calculus of relations. The extended notion of type, pertype (from partial equivalence relation), is shown to admit a complete lattice structure by constructing the order via a Galois connection. A pertyping of relations is developed and inductive pertypes generated by equations are discussed. Pertypes do occur in model theory for lambda-calculus but we are unaware of manipulations with inductive ``lawful'' types based on a simple relational calculus.

This paper tackles the problem of constructing a compact, point-free proof of the associativity of demonic composition of binary relations and its distributivity through demonic choice. In order to achieve this goal a definition of demonic composition is proposed in which angelic composition is restricted by means of a so-called ``monotype factor''. Monotype factors are characterised by a Galois connection similar to the Galois connection between composition and factorisation of binary relations. The identification of such a connection is argued to be highly conducive to the desired compactness of calculation.

The spec calculus is a programming algebra that supports and even encourages
the derivation of *relational* programs in a clear and *calculational* style.

After a brief introduction of the spec calculus two particular uses are investigated.
First, the notion of program inversion is introduced. This notion from imperative
programming allows one to construct from a given program *R* that satisfies *{P} R {Q}*
a program *R ^{inv}* that satisfies

The second topic is the derivation of precedence parsing algorithms. A simple algorithm for the calculation of the inorder traversal of a tree is transformed in two steps into an algorithm for precedence parsing. The first transformation is the use of program inversion to obtain an algorithm that constructs a tree from its inorder traversal. The second step is a restriction of the non-determinism of the algorithm. It is shown how the resulting algorithm corresponds to precedence parsing algorithms.

A notion of simulation of one datatype by another is defined as a *constructive*
preorder. A calculus of datatype simulation is then developed by formulating constructive
versions of least-fixed-point theorems in lattice theory. The calculus is applied to
the construction of several isomorphisms between classes of datatypes. In
particular constructive adaptations of theorems in lattice theory about closure operators
are shown to yield simulations and isomorphisms between monad structures, and
constructive adaptations of theorems in regular algebra are shown to yield
isomorphisms between list structures.

The notions of domain operator and domain kind are introduced. Several examples are presented. In particular, it is shown that the partial equivalence relations form a domain kind. The proof involves the construction of a Galois connection demonstrating that the partial equivalence relations form a complete lattice under the so-called domain ordering, thus providing another illustration of the importance of the early recognition of Galois connections.

(No abstract)

The notion of relational catamorphism is used to give a necessary and sufficiend condition for a relation to be a congruence relation on an abstract data type. The condition is shown to be a generalisation of the homomorphism theorem of universal algebra.

In this note we present a proof of a theorem to be found in Ehrig and Mahr [Fundamentals of Algebraic Specification 1. EATCS Monographs on Theoretical Computer Science, 1985]. The theorem states that a relation constructed from a given function is a congruence relation iff the function is a homomorphism; we go on to generalise the result to the relational homomorphisms treated by the first author in [Naturality of homomorphisms. Lecture Notes, International Summer School on Constructive Algorithmics, vol.3, 1989].

This paper reports ongoing research into a theory of datatypes based on the calculus of relations. A fundamental concept introduced here is the notion of ``relator'' which is an adaptation of the categorical notion of functor. Relational catamorphisms are then introduced and shown to satisfy a unique extension property. Several further properties are discussed including so-called fusion properties. The paper is concluded by showing how new relators can be constructed by an appropriate choice of relational catamorphism.

Formal reasoning is notoriously long and arduous; in order to use it to reason effectively in the construction of programs it is, therefore, paramount that we design our notations to be both clear and economical. Taking examples from AI, from imperative programming, from the use of the Bird-Meertens formalism and from category theory we demonstrate how the right choice of what to denote and how it is denoted can make significant improvements to formal calculations. Brief mention is also made of the connection between economical notation and properties of type.

This paper provides a tutorial introduction to a constructive theory of types based on, but incorporating some extensions to, that originally developed by Per Martin-L\"of. The emphasis is on the relevance of the theory to the construction of computer programs and, in particular, on the formal relationship between program and data structure. Topics discussed include the principle of propositions as types, free types, congruence types, types with information loss and mutually recursive types. Several examples of program development within the theory are also discussed in detail.

Two formalisms that have been used extensively in the last few years for the calculation of programs are the Eindhoven quantifier notation and the formalism developed by Bird and Meertens. Although the former has always been applied with ultimate goal the derivation of imperative programs and the latter with ultimate goal the derivation of functional programs there is a remarkable similarity in the formal games that are played. This paper explores the Bird-Meertens formalism by expressing and deriving within it the basic rules applicable in the Eindhoven quantifier notation.

See here for the paper.

In recent years it has become evident that there is a small
core of path-finding algorithms consisting almost
exclusively of algorithms known to numerical analysts
which solve a system of linear equations Ax = b where A is a
real nxn matrix, x is an nx1 vector of variables and b is an
nx1 vector or real constants.

The connection of path-finding problems with the solution of
linear equations has been a fruitful one but it still cannot be
claimed that the connection is completely understood. Some
authors have attempted to find a minimal axiom system for
which the classical techniques will work. An alternative
approach, initiated by Backhouse and Carr\a'{e} [1975] and
further developed by Tarjan [Tech.Rep. STAN-CS-79-729,
April 1979] is to use the algebra of regular expressions,
expressing path problems via homomorphisms which map
all paths through the graph to the given problem domain.

Tarjan relates the problem of solving the linear system of
equations Ax = b in real arithmetic to the problem of finding
\emph{non-redundant} regular expressions. Tarjan's results are
based on a lemma (lemma 2, ``The hardest result in this
paper'') whose proof is incomplete and unnecessarily
hedged with qualifications. The present paper gives a
complete and accurate proof of Tarjan's lemma. Our proof
consists of a reworking of the well-known technique for
establishing whether two regular expressions are equal, but
with the additional handicap of establishing whether they
are equal when viewed as functions of real numbers.

[Note: Subsequent to sending this report to Tarjan, he amended the statement and
proof of lemma 2, removing the qualifications I said were unnecessary and adding
an additional lemma so that the lemma became lemma 3. The revised paper was
published in the Journal of the ACM, vol.\ 28, No.\ 3 (July 1981), pp.\ 577--593. The
revised proof uses an argument about multivariate polynomials that are identical on a
sufficiently small sphere. I have never understood the proof.]

The factors and factor matrix of a regular language are defined and their properties stated. It is then shown that the factor matrix of a language Q has a unique starth root --- called the factor graph of Q --- which is a recogniser of Q. An algorithm to calculate the factor graph is presented. The Knuth, Morris and Pratt pattern matching algorithm, its extensions and Weiner's substring identifier algorithm are outline and are all shown to be equivalent to finding the factor graph of some regular language. Key words and phrases: regular language, factor, factor matrix, pattern matching, string-matching.