The three functions of a name: reference, identity and display
This is a brief post about naming. Naming is well known to be one of the hardest problems in computer science. But why is this the case? I claim that one reason is that naming systems are often asked to do triple duty: a single name is used to perform the three functions of reference, identity, and display. What does this mean?
Consider a group of people. Each person has a name. However, these names may not be unique. If I want to refer to a person, I might say something like “the John who most recently took a trip to Japan”. But this is not the only way to refer to this person; there are innumerable ways to pick out an individual. As another example, in an unambiguous context, I might simply say “John”. Or I might have a nickname for him, “Johnny”.
However, how do we figure out if two references point to the same person? One way would be to follow each reference to the actual person and check physically, but this doesn’t work on a computer. Another way would be to resolve each reference to a social security number (assuming the people are US citizens) and compare the numbers. Social security numbers identify people in the sense that there is an injection from people to social security numbers: distinct people have distinct numbers. This is the function of identity.
Finally, if I have a list of social security numbers and I want to make them mnemonic in some way, I might put each person’s full name next to the social security number. Or I might put each person’s full name and a little bio. This is the function of display. Just like references, displays are context-sensitive; I might want to show different information in different contexts. One way of making a display is to invert a certain class of references. For instance, you could display for each person all of their nicknames. But this is not the only purpose of a display; a display often functions to give some useful information about a person. In the olden days, a person’s surname might advertise their profession, like “smith”.
In practice, we might have displays which one would not typically think of in terms of “naming,” like showing documentation for a function in a programming language. But names are often used for the purpose of display, so display is worth thinking about in the context of naming.
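To make the three functions concrete, here is a minimal Python sketch of the people example. Everything in it is made up for illustration: the `Person` record, the placeholder social security numbers, and the `resolve`, `same_person`, `display`, and `nicknames` helpers are all hypothetical names, not part of any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    ssn: str        # identity: unique per person (placeholder values below)
    full_name: str  # used only for display
    bio: str        # extra display information


# Hypothetical directory, keyed by identity.
PEOPLE = {
    "000-00-0001": Person("000-00-0001", "John Smith", "recently visited Japan"),
    "000-00-0002": Person("000-00-0002", "John Doe", "has never left Ohio"),
}

# Reference: many context-dependent ways of picking out one identity.
REFERENCES = {
    "Johnny": "000-00-0001",
    "the John who most recently took a trip to Japan": "000-00-0001",
    "the other John": "000-00-0002",
}


def resolve(reference: str) -> str:
    """Follow a reference to the identity it points at."""
    return REFERENCES[reference]


def same_person(ref_a: str, ref_b: str) -> bool:
    """Two references agree exactly when they resolve to the same identity."""
    return resolve(ref_a) == resolve(ref_b)


def display(ssn: str, verbose: bool = False) -> str:
    """Render an identity for humans; the context picks the level of detail."""
    person = PEOPLE[ssn]
    if verbose:
        return f"{person.full_name} ({person.bio})"
    return person.full_name


def nicknames(ssn: str) -> list:
    """A display built by inverting references: everything that points here."""
    return sorted(ref for ref, target in REFERENCES.items() if target == ssn)
```

Note that `display` never takes part in `same_person`: both Johns could share a display name without the two functions interfering.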
We now give examples of various computer systems and how they distinguish between these functions.
Example: Lexical scoping
In a programming language with lexical scoping, context information is used to disambiguate names, so that `a` refers to the most recently bound variable in scope with name `a`. Names are not sufficient to resolve identity though. Consider the following dependent function:
f : (a : Int) -> (let b = a in Eq Int a b)
f a = Refl
This will typecheck, because `a` and `b` are definitionally equal, even though they are referred to by different names. Note that this is different from `a` and `b` coincidentally having the same value; they are actually the same variable.
The question of display for variables is a tricky one. Consider the program
f : {a : Type} -> a -> {a : Type} -> a -> a
f x y = x
(As is the convention in Agda, we use `{...}` for implicit arguments.) This does not type check, because `x` has type the first `a` and `y` has type the second `a`. A naive implementation of error messages would print out `error: x has type a when type a was expected`, which is of course horribly useless. A better error message would be something like `error: x has type a#1 when type a#0 was expected`, where `a#0` refers to the most recently bound `a` and `a#1` refers to the second most recently bound `a`.
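The indexed display scheme can be sketched in a few lines of Python. This is only an illustration, assuming a scope is modeled as a list of binder names from outermost to innermost; `resolve_name` and `display_binder` are hypothetical helpers, not taken from any real type checker.

```python
def resolve_name(scope, name, index=0):
    """Reference: find the position of the index-th most recent binding of `name`.

    `scope` lists binder names from outermost to innermost, so index 0 is the
    innermost (most recently bound) match.
    """
    seen = 0
    for pos in range(len(scope) - 1, -1, -1):
        if scope[pos] == name:
            if seen == index:
                return pos
            seen += 1
    raise NameError(f"{name}#{index} is not in scope")


def display_binder(scope, pos):
    """Display: render the binder at `pos` as name#k, where k counts how many
    later binders with the same name shadow it."""
    name = scope[pos]
    shadowed_by = sum(1 for later in scope[pos + 1:] if later == name)
    return f"{name}#{shadowed_by}"


# Binders of f in order: first {a}, x, second {a}, y.
scope = ["a", "x", "a", "y"]
x_type = 0                            # x has the type bound at position 0
expected = resolve_name(scope, "a")   # the return type is the innermost a

message = (
    f"error: x has type {display_binder(scope, x_type)} "
    f"when type {display_binder(scope, expected)} was expected"
)
print(message)
```

Here the position in `scope` plays the role of identity, the bare name `a` is a reference, and `a#1` is a display that also happens to be an unambiguous reference.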
Example: Julia packages
In the Julia package manager, packages can be referred to by:
- Their name as registered in the Julia package registry
- Their GitHub repo
- Their path on a local computer
However, each package is identified by a UUID. And packages are always displayed by their name.
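As a rough model of this arrangement (not the actual implementation of the Julia package manager), the Python sketch below resolves three kinds of reference to one UUID and displays by name. The package name `Widgets`, its URL, its local path, and its UUID are all hypothetical.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    ident: uuid.UUID  # identity
    name: str         # display


# A made-up package; the UUID is derived from a made-up name for illustration.
WIDGETS = Package(uuid.uuid5(uuid.NAMESPACE_DNS, "widgets.example"), "Widgets")

PACKAGES = {WIDGETS.ident: WIDGETS}

# Three reference modes, each resolving to the same identity.
BY_NAME = {"Widgets": WIDGETS.ident}
BY_URL = {"https://github.com/example-org/Widgets.jl": WIDGETS.ident}
BY_PATH = {"/home/me/dev/Widgets": WIDGETS.ident}


def resolve(reference: str) -> uuid.UUID:
    """Try each reference mode in turn and return the UUID it points at."""
    for table in (BY_NAME, BY_URL, BY_PATH):
        if reference in table:
            return table[reference]
    raise KeyError(reference)


def display(ident: uuid.UUID) -> str:
    """Packages are always shown by name, whatever reference was used."""
    return PACKAGES[ident].name
```

Because comparison happens on the UUID, renaming the package or moving the repository changes how it is referenced and displayed without changing which package it is.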
Example: Rust package manager
The Rust package manager, Cargo, like the Julia package manager, can refer to other packages by their name in the crates.io registry, by GitHub repo, by local path, etc.
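However, once dependencies are locked, each package fetched from a registry is pinned by its SHA-256 checksum, which is recorded in the lockfile. In that sense, a package is identified by precisely its content.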
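Identity by content can be sketched as follows. This is a minimal Python illustration of the idea, not how Cargo itself is implemented; the reference strings and the byte contents standing in for package archives are made up.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Identity: the SHA-256 digest of the package bytes, as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Stand-ins for package archives; real archives would be read from disk.
ARCHIVE_V1 = b"fn main() {}\n"
ARCHIVE_V2 = b"fn main() { println!(\"hi\"); }\n"

# Hypothetical references of different kinds, mapped to the bytes they fetch.
SOURCES = {
    "registry:widgets@1.0.0": ARCHIVE_V1,
    "path:../widgets": ARCHIVE_V1,
    "registry:widgets@1.0.1": ARCHIVE_V2,
}


def identity_of(reference: str) -> str:
    """Resolve a reference to its content, then hash the content."""
    return content_id(SOURCES[reference])
```

Two references count as the same package exactly when they fetch identical bytes, so `identity_of("registry:widgets@1.0.0")` equals `identity_of("path:../widgets")`, while the `1.0.1` archive gets a different identity.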
There is no single way to handle the functions of reference, identity, and display. But most systems for naming things perform all three functions, and realizing that these functions can be handled by distinct means opens up the design space to better designs, including designs in which there are multiple display modes or multiple reference modes.
For a related but distinct trilemma involving names, see Zooko’s triangle. See also petnames, a solution to the triangle which exploits the fact that one can use a different system for reference than for identity.