Substitution is also pushout

rewriting

Author

Published

2025-08-06

Abstract

How we can think about pushouts as applying rules via substitution, featuring examples in categorical databases and Datalog.

“Substitution is pullback” is a common slogan in categorical logic. The intuition behind it is found in Andrej Bauer’s brief blog post, substitution is pullback.

1 Unrelatedly, substitution is sometimes pushout

We’ll demonstrate this phenomenon by focusing on two categories, \mathsf{Grph} and \mathsf{Dat}, as examples.

1.1 First example category: graphs

The objects of \mathsf{Grph} are directed multigraphs (hereafter: graphs). A graph G consists in a set of vertices V_G and a set of edges E_G, where each edge has designated source and target vertex given by functions {\rm src}_G and {\rm tgt}_G.¹ Morphisms in \mathsf{Grph} are graph homomorphisms.²

The path graph of length two, \boxed{\bullet\rightarrow\bullet\rightarrow\bullet}, can be thought of as a piece of data.
- We can informally put labels on the vertices to help us refer to parts of the graph.³
- E.g. this (very same) graph could also have been drawn as \boxed{\textcircled{\footnotesize 1}\rightarrow\textcircled{\footnotesize 2}\rightarrow\textcircled{\footnotesize 3}}.
However, we can also think of this same graph, \boxed{\textcircled{\footnotesize x}\xrightarrow{a}\textcircled{\footnotesize y}\xrightarrow{b}\textcircled{\footnotesize z}}, as representing a conjunctive query, one which could be applied to other graphs: “Find me an edge (a) and an edge (b) such that a’s target equals b’s source.”
- In this case, the labels are like variable names.
- In this post, we mark this attitudinal distinction by labeling graphs we are thinking as queries with letters. Graphs thought of as data will be labeled with numbers.

Above are two attitudes one can take towards a graph. An attitude one can take towards a morphism from a query X to data Y is that of a pattern match, i.e. an answer to the query X. If we take a different attitude to this morphism X\rightarrow Y, thinking of it as between two different queries, we would call the morphism a rewrite rule. An example rewrite rule is {\boxed{\textcircled{\footnotesize x}}\rightarrow\boxed{\textcircled{\footnotesize x}\circlearrowleft}}. This rule is saying “if you give me a vertex, x, I can turn it into a vertex with a loop.”⁴

Applying this rewrite rule morphism to some graph G:=\boxed{\textcircled{\footnotesize 1}\rightarrow\textcircled{\footnotesize 2}\rightarrow\textcircled{\footnotesize 3}}, requires choosing whether x\mapsto 1 or x\mapsto 2 or x\mapsto 3. This is precisely the data of a match morphism from {\boxed{\textcircled{\footnotesize x}}} into G. Suppose we take the map which represents x\mapsto 2: taking the pushout has the effect of substituting 2 for x, yielding {\boxed{\textcircled{\footnotesize 2}}\rightarrow\boxed{\textcircled{\footnotesize 2}\circlearrowleft}}. There is no such loop in G, but applying the rewrite rule makes the minimal change to G required to make the answer true.

Let’s look at one more example: the rule “if you gave me an edge, I could give you path of length two which has the same source and target”. Applying this rule requires a match morphism assigning two variable vertices and one variable edge:

Note because this rule didn’t bind anything to z, it’s not obvious what label we should informally give the newly introduced vertex. If we applied the rule multiple times, we would continue to generate a new vertex (and two new edges) each time.

1.2 Second example category: Datalog

Let’s attempt to understand the logic programming language Datalog categorically. The key concepts here are relation, fact, constant, variable, rule, and query. For example one can write some facts about a binary edge relation and constants v1, v2, and v3.

edge(v1,v1). edge(v3,v3). edge(v1,v2). edge(v3,v2).

When someone writes a Datalog program, there is a starting set of ground facts (which are facts about constants). There are also rules, which have two parts, known as the head and the body (written HEAD :- BODY.⁵). We can think of the body as the preconditions of the rule and the head as the postcondition(s).⁶ An example of a rule (which involves both variables and constants) is:

edge(v1,X) :- edge(X,X), edge(X,v2).

This says that: “if you give me any vertex X with a self loop and edge into v2, I’ll give you an edge from v1 into X”. If we ran the Datalog program with this rule and the ground facts above, we would derive the fact that edge(v1,v3) because, if one queries the preconditions, one can get an answer X=v3 which then gets plugged into the postcondition. One can imagine making these moves until there are no further changes in the set of facts: this is called the Herbrand model of the Datalog program. The sets of facts which we encounter along the way to the Herbrand model are called Herbrand interpretations (hereafter: interpretations).

We want to think of the initial collection of ground facts as an interpretation, an update that brings us closer to the Herbrand model as a morphism of interpretations, the rule as a morphism preconditions -> postconditions of interpretations, and the finding of a set of terms that satisfies the preconditions as a morphism of interpretations preconditions -> current_interpretation. Furthermore, we want this update morphism to shake out as the result of taking the pushout in this category. Let’s craft a definition to make this happen!

\mathsf{Dat} definition (concrete)

Fix some signature \Sigma: R \rightarrow \mathbb{N} assigning arities to a set of relation symbols. Let K be some set of constants. We define \mathsf{Dat}_{\Sigma, K} (hereafter just \Sigma and K will be implicit) to be the category of interpretations and interpretation morphisms, where:

An interpretation I consists of a set I_V of ‘variables’, as well as relations on K+I_V for every relation symbol r \in R of arity \Sigma(r).
A morphism of interpretations I\rightarrow J is a function I_V \rightarrow K + J_V, assigning variables to either constants or variables,⁷ that preserves whatever relations are in I.

So to interpret the example rule above as a morphism I\rightarrow J, we must take I to be edge(X,X), edge(X,v2) and J to be edge(v1,X), edge(X,X), edge(X,v2). Note that, because these rules tell us how to add facts (never removing them), we should always think of the preconditions as implicitly included in the postconditions. We have {K=\{v_1,v_2,v_3\}} and {I_V=J_V=\{X\}} with the morphism data the function sending X\mapsto X, which preserves the relations in I because J extends the set of facts of I.

It will be easier to show that this category has pushouts when defined in a more abstract way.

\mathsf{Dat} definition (abstract)

Given a signature \Sigma we define the category \mathsf{C}_\Sigma to contain an object X as well as morphisms r_1,...,r_{\Sigma(r)}: A_r\rightarrow X for each r \in R. There is an associated finite limit sketch T_\Sigma which adds limit cones asserting that its models M: C_\Sigma \rightarrow \mathsf{Set} have, for each r \in R, that M(r_1),...,M(r_{\Sigma(r)}) are jointly monic.⁸

We define \mathsf{Dat}_\Sigma to be the the category of models of T_\Sigma, which is subcategory of \mathsf{Set^{C_\Sigma}}. As a category of limit sketch models, \mathsf{Dat}_\Sigma is a reflective subcategory whose reflector quotients each M(A_r), merging together elements that agree on all values of M(r_1), ..., M(r_{\Sigma(r)}). Reflective subcategories of cocomplete categories are cocomplete, with colimits computed by first including into the larger category, performing the colimit there, and then applying the reflector to coerce the result back into the subcategory.

Constants are handled via coslicing: let \mathsf{Dat}_{\Sigma,K} be {\rm Const}/\mathsf{Dat}_\Sigma, where {\rm Const} is the functor with {\rm Const}(X):=K and {\rm Const}(A_r)=\varnothing for all r \in R. Note that coslices of categories with pushouts have pushouts, computed in the base category.

Consider interpretations that have just a binary edge relation. These are pretty similar to graphs as defined in the previous section; however, there is no notion of multiple edges. Also the vertices can have fixed, stable identities: although the graph \boxed{\textcircled{\footnotesize a}\rightarrow\textcircled{\footnotesize b}} is isomorphic / not meaningfully different from \boxed{\textcircled{\footnotesize b}\rightarrow\textcircled{\footnotesize a}} because the labels are not meaningful, an interpretation whose only fact is edge(v1,v2) will not be isomorphic in \mathsf{Dat} to one whose only fact is edge(v2,v1), if v1 and v2 are constants.

To see these pushouts in action, consider applying the rule above, using the depicted match morphism (i.e. answer X=v3 to the query of the rule’s preconditions). Note we distinguish variables from constants by using red for variables:

Note that X=v1 was also a valid answer to the query. However, in \mathsf{Dat}, the pushout does nothing, rather than adding another loop to v1:

Each answer to the query of a Datalog rule’s preconditions is a potential way to ground the rule, which can potentially add a new fact and bring one’s interpretation closer to the Datalog program’s Herbrand model.

A common extension, Datalog^\exists, allows for the heads of rules to introduce new terms. For example, the rule “if you gave me an edge, I could give you path of length two which has the same source and target” is expressed as:

edge(X,Z), edge(Z,Y) :- edge(X,Y). # note "Z" not in body of rule!

This is actually the natural interpretation of \mathsf{Dat}, though we could restrict to a subcategory which has no ability to introduce terms: to do this we would require all rule morphisms to have their underlying function I_V\rightarrow J_V+K be of the form {I_V \overset{\iota}\rightarrowtail I_V+K}.

2 Takeaway

We considered categories where objects play dual roles of being queries as well as data. In such categories, it is natural to think of elements of {\rm Hom}(X,Y) as answers to the query of shape X in data of shape Y. We can also think of an element of {\rm Hom}(X,Y) as a query of shape Y that depends on results to a query of shape X. Taking an answer to X and using it to construct a canonical answer to Y is a kind of substitution (or grounding), which can be characterized as a pushout.

3 Special thanks to Minnowbrook logic programming seminar

Making explicit this connection between graph rewrite rules and datalog rules was one of many thoughts which came about talking to Michael Arntzenius and others while hiking in Adirondacks with a great group of logic programmers. This workshop was organized by Kristopher Micinski in May 2025.

Footnotes

This is precisely the data of a functor {{\rm SchGraph}\rightarrow\mathsf{Set}}, where {{\rm SchGraph}:=\boxed{{\rm Edge}\rightrightarrows {\rm Vertex}}}. Morphisms are natural transformations of these functors. Everything in this section applies to \mathsf{C}-sets generally.↩︎
A graph homomorphism h:G\rightarrow H is a pair of functions, h_V: V_G\rightarrow V_H and h_E: E_G\rightarrow E_H such that h_V\circ s_G=s_H\circ h_E and h_E\circ t_G=t_H\circ h_E.↩︎
It’s possible to have multiple edges between the same pair of vertices, but the graphs appearing in this post will happen to not have such parallel edges. Hence we will usually avoid labeling edges.↩︎
I.e. if you give me an answer to the domain query of the morphism, I can give you an answer to the codomain query of the morphism.↩︎
The :- is meant to be an arrow of sorts pointing to the left.↩︎
Standard Datalog syntax has a single fact as the postcondition of a rule, but some Datalog interfaces allow providing multiple facts as the postconditions for some set of preconditions. This could be desugared to multiple rules (one for each postcondition) which all have the same preconditions.↩︎
Implicitly it also assigns all constants to themselves, which is what allows us to compose such functions and get a category of such morphisms.↩︎
Consider the arity 2 case: r_1,r_2: A_r\rightarrow X are jointly monic if and only if A_r is the limit of the diagram: .
Let’s make sense of this diagram as a constraint on models of the sketch by thinking of it as a limit diagram in \mathsf{Set}. The limit is set of pairs of elements of (x,y) \in M(A_r), such that M(r_1)(x) = M(r_1)(y) and M(r_2)(x)=M(r_2)(y) are simply the pairs (a, a) for all a \in M(A_r). Of course, these pairs will always be a solution to the equation described by the limit, but M(r_1) and M(r_2) being jointly monic means there will be no other solutions.↩︎