Programs as Singularities: differentiable computation and the geometry of Bayesian model selection
Abstract
I will describe a setting in which the internal structure of an algorithm becomes visible to Bayesian statistics. Using differentiable linear logic and its vector-space semantics, ordinary Turing-machine codes can be embedded into a smooth space of noisy codes, where perturbations correspond to read errors on the description tape of a universal Turing machine. This turns a program into a point in the parameter space of a statistical model, whose local geometry records how the program's output probabilities change under perturbation. I will explain how singular learning theory enters through local Bayesian evidence: when comparing neighbourhoods of candidate implementations, the asymptotics are controlled by the local learning coefficient. The main theorem identifies the Taylor coefficients of the induced error/loss germ with weighted counts of error syndromes in the underlying computation. As a consequence, runtime error correction forces high-order flatness of the associated statistical singularity and yields bounds on the local learning coefficient. I will close by discussing this as an example of structural Bayesianism: the idea that Bayesian inference can be sensitive not only to what a model predicts, but also to how those predictions are algorithmically implemented.

Note: this talk will take place at 5pm, not the usual 12pm.
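
Background (not part of the abstract itself; a standard statement from singular learning theory, with notation $L_n$, $\lambda(U)$, $\varphi$ introduced here for illustration): the local evidence asymptotics referred to above take, for a neighbourhood $U$ of a parameter $w_0$ and a prior $\varphi$ positive on $U$, the schematic form

  F_n(U) = -\log \int_U e^{-n L_n(w)} \varphi(w)\, dw = n L_n(w_0) + \lambda(U) \log n + O_p(\log \log n),

where $L_n$ is the empirical negative log-likelihood and $\lambda(U)$ is the local learning coefficient. At comparable fit, neighbourhoods with smaller $\lambda(U)$ receive larger local evidence, which is how the high-order flatness forced by error correction becomes a Bayesian preference among implementations.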