Introduction

I am a research data scientist in the Advanced Research Computing Centre at University College London. Prior to my current role, I did postdocs with Chris Oates at the School of Mathematics, Statistics and Physics at Newcastle University and Alex Thiery at the Department of Statistics and Applied Probability in the National University of Singapore. I completed my PhD at the School of Informatics in the University of Edinburgh, where I was supervised by Amos Storkey.

Research interests: Markov chain Monte Carlo methods, Hamiltonian Monte Carlo, data assimilation, inverse problems, probabilistic programming.

Contact

Address
UCL Advanced Research Computing Centre, Bidborough House, London, WC1H 9BT.

Publications

Pre-prints

  • 2023/07 Parameter inference for degenerate diffusion processes

    Yuga Iguchi, Alexandros Beskos and Matthew M. Graham

    We study parametric inference for ergodic diffusion processes with a degenerate diffusion matrix. Existing research focuses on a particular class of hypo-elliptic SDEs, with components split into ‘rough’/‘smooth’ and noise from rough components propagating directly onto smooth ones, but some critical model classes arising in applications have yet to be explored. We aim to cover this gap, thus analyse the highly degenerate class of SDEs, where components split into further sub-groups. Such models include e.g. the notable case of generalised Langevin equations. We propose a tailored time-discretisation scheme and provide asymptotic results supporting our scheme in the context of high-frequency, full observations. The proposed discretisation scheme is applicable in much more general data regimes and is shown to overcome biases via simulation studies also in the practical case when only a smooth component is observed. Joint consideration of our study for highly degenerate SDEs and existing research provides a general ‘recipe’ for the development of time-discretisation schemes to be used within statistical methods for general classes of hypo-elliptic SDEs.
  • 2019/06 A scalable optimal-transport based local particle filter

    Matthew M. Graham and Alexandre H. Thiery

    Filtering in spatially-extended dynamical systems is a challenging problem with significant practical applications such as numerical weather prediction. Particle filters allow asymptotically consistent inference but require infeasibly large ensemble sizes for accurate estimates in complex spatial models. Localisation approaches, which perform local state updates by exploiting low dependence between variables at distant points, have been suggested as a potential resolution to this issue. Naively applying the resampling step of the particle filter locally however produces implausible spatially discontinuous states. The ensemble transform particle filter replaces resampling with an optimal-transport map and can be localised by computing maps for every spatial mesh node. The resulting local ensemble transport particle filter is however computationally intensive for dense meshes. We propose a new optimal-transport based local particle filter which computes a fixed number of maps independent of the mesh resolution and interpolates these maps across space, reducing the computation required and allowing it to be ensured particles remain spatially smooth. We numerically illustrate that, at a reduced computational cost, we are able to achieve the same accuracy as the local ensemble transport particle filter, and retain its improved robustness to non-Gaussianity and ability to quantify uncertainty when compared to local ensemble Kalman filters.

Journal articles

  • (in-press) Parameter estimation with increased precision for elliptic and hypo-elliptic diffusions

    Yuga Iguchi, Alexandros Beskos and Matthew M. Graham

    Bernoulli

    This work aims at making a comprehensive contribution in the general area of parametric inference for discretely observed diffusion processes. Established approaches for likelihood-based estimation invoke a time-discretisation scheme for the approximation of the intractable transition dynamics of the Stochastic Differential Equation (SDE) model over finite time periods. The scheme is applied for a step-size that is either user-selected or determined by the data. Recent research has highlighted the critical effect of the choice of numerical scheme on the behaviour of derived parameter estimates in the setting of hypo-elliptic SDEs. In brief, in our work, first, we develop two weak second order sampling schemes (to cover both hypo-elliptic and elliptic SDEs) and produce a small time expansion for the density of the schemes to form a proxy for the true intractable SDE transition density. Then, we establish a collection of analytic results for likelihood-based parameter estimates obtained via the formed proxies, thus providing a theoretical framework that showcases advantages from the use of the developed methodology for SDE calibration. We present numerical results from carrying out classical or Bayesian inference, for both elliptic and hypo-elliptic SDEs.
  • 2024/04 ParticleDA.jl v.1.0: a distributed particle-filtering data assimilation package

    Daniel Giles, Matthew M. Graham, Mosè Giordano, Tuomas Koskela, Alexandros Beskos and Serge Guillas

    Geoscientific Model Development

    Digital twins of physical and human systems informed by real-time data are becoming ubiquitous across weather forecasting, disaster preparedness, and urban planning, but researchers lack the tools to run these models effectively and efficiently, limiting progress. One of the current challenges is to assimilate observations in highly non-linear dynamical systems, as the practical need is often to detect abrupt changes. We have developed a software platform to improve the use of real-time data in non-linear system representations where non-Gaussianity limits the applicability of data assimilation algorithms such as the ensemble Kalman filter and variational methods. Particle-filter-based data assimilation algorithms have been implemented within a user-friendly open-source software platform in Julia – ParticleDA.jl. To ensure the applicability of the developed platform in realistic scenarios, emphasis has been placed on numerical efficiency and scalability on high-performance computing systems. Furthermore, the platform has been developed to be forward-model agnostic, ensuring that it is applicable to a wide range of modelling settings, for instance unstructured and non-uniform meshes in the spatial domain or even state spaces that are not spatially organized. Applications to tsunami and numerical weather prediction demonstrate the computational benefits and ease of using the high-level Julia interface with the package to perform filtering in a variety of complex models.
  • 2023/04 Manifold lifting: scaling Markov chain Monte Carlo to the vanishing noise regime

    Khai Xiang Au, Matthew M. Graham and Alexandre H. Thiery

    Journal of the Royal Statistical Society: Series B (Statistical Methodology)

    Standard Markov chain Monte Carlo methods struggle to explore distributions that concentrate in the neighbourhood of low-dimensional submanifolds. This pathology naturally occurs in Bayesian inference settings when there is a high signal-to-noise ratio in the observational data but the model is inherently over-parametrised or nonidentifiable. In this paper, we propose a strategy that transforms the original sampling problem into the task of exploring a distribution supported on a manifold embedded in a higher-dimensional space; in contrast to the original posterior this lifted distribution remains diffuse in the limit of vanishing observation noise. We employ a constrained Hamiltonian Monte Carlo method, which exploits the geometry of this lifted distribution, to perform efficient approximate inference. We demonstrate in numerical experiments that, contrary to competing approaches, the sampling efficiency of our proposed methodology does not degenerate as the target distribution to be explored concentrates near low-dimensional submanifolds.
  • 2022/08 Testing whether a learning procedure is calibrated

    Jon Cockayne, Matthew M. Graham, Chris Oates, Tim J. Sullivan and Onur Teymur

    Journal of Machine Learning Research

    A learning procedure takes as input a dataset and performs inference for the parameters θ of a model that is assumed to have given rise to the dataset. Here we consider learning procedures whose output is a probability distribution, representing uncertainty about θ after seeing the dataset. Bayesian inference is a prime example of such a procedure, but one can also construct other learning procedures that return distributional output. This paper studies conditions for a learning procedure to be considered calibrated, in the sense that the true data-generating parameters are plausible as samples from its distributional output. A learning procedure whose inferences and predictions are systematically over- or under-confident will fail to be calibrated. On the other hand, a learning procedure that is calibrated need not be statistically efficient. A hypothesis-testing framework is developed in order to assess, using simulation, whether a learning procedure is calibrated. Several vignettes are presented to illustrate different aspects of the framework.
  • 2022/04 Manifold Markov chain Monte Carlo methods for Bayesian inference in diffusion models

    Matthew M. Graham, Alexandre H. Thiery and Alexandros Beskos

    Journal of the Royal Statistical Society: Series B (Statistical Methodology)

    Bayesian inference for nonlinear diffusions, observed at discrete times, is a challenging task that has prompted the development of a number of algorithms, mainly within the computational statistics community. We propose a new direction, and accompanying methodology—borrowing ideas from statistical physics and computational chemistry—for inferring the posterior distribution of latent diffusion paths and model parameters, given observations of the process. Joint configurations of the underlying process noise and of parameters, mapping onto diffusion paths consistent with observations, form an implicitly defined manifold. Then, by making use of a constrained Hamiltonian Monte Carlo algorithm on the embedded manifold, we are able to perform computationally efficient inference for a class of discretely observed diffusion models. Critically, in contrast with other approaches proposed in the literature, our methodology is highly automated, requiring minimal user intervention and applying alike in a range of settings, including: elliptic or hypo-elliptic systems; observations with or without noise; linear or non-linear observation operators. Exploiting Markovianity, we propose a variant of the method with complexity that scales linearly in the resolution of path discretisation and the number of observation times. Python code reproducing the results is available at http://doi.org/10.5281/zenodo.5796148.
  • 2017/12 Asymptotically exact inference in differentiable generative models

    Matthew M. Graham and Amos J. Storkey

    Electronic Journal of Statistics

    Many generative models can be expressed as a differentiable function applied to input variables sampled from a known probability distribution. This framework includes both the generative component of learned parametric models such as variational autoencoders and generative adversarial networks, and also procedurally defined simulator models which involve only differentiable operations. Though the distribution on the input variables to such models is known, often the distribution on the output variables is only implicitly defined. We present a method for performing efficient Markov chain Monte Carlo inference in such models when conditioning on observations of the model output. For some models this offers an asymptotically exact inference method where approximate Bayesian computation might otherwise be employed. We use the intuition that computing conditional expectations is equivalent to integrating over a density defined on the manifold corresponding to the set of inputs consistent with the observed outputs. This motivates the use of a constrained variant of Hamiltonian Monte Carlo which leverages the smooth geometry of the manifold to move between inputs exactly consistent with observations. We validate the method by performing inference experiments in a diverse set of models.

Conference proceedings

  • 2021/04 Measure transport with kernel Stein discrepancy

    Matthew A. Fisher, Tui Nolan, Matthew M. Graham, Dennis Prangle and Chris Oates

    Proceedings of the 24th International Conference on Artificial Intelligence and Statistics

    Measure transport underpins several recent algorithms for posterior approximation in the Bayesian context, wherein a transport map is sought to minimise the Kullback-Leibler divergence (KLD) from the posterior to the approximation. The KLD is a strong mode of convergence, requiring absolute continuity of measures and placing restrictions on which transport maps can be permitted. Here we propose to minimise a kernel Stein discrepancy (KSD) instead, requiring only that the set of transport maps is dense in an L2 sense and demonstrating how this condition can be validated. The consistency of the associated posterior approximation is established and empirical results suggest that KSD is a competitive and more flexible alternative to KLD for measure transport.
  • 2017/08 Continuously tempered Hamiltonian Monte Carlo

    Matthew M. Graham and Amos J. Storkey

    Proceedings of the 33rd Conference on Uncertainty in Artificial Intelligence

    Hamiltonian Monte Carlo (HMC) is a powerful Markov chain Monte Carlo (MCMC) method for performing approximate inference in complex probabilistic models of continuous variables. In common with many MCMC methods, however, the standard HMC approach performs poorly in distributions with multiple isolated modes. We present a method for augmenting the Hamiltonian system with an extra continuous temperature control variable which allows the dynamic to bridge between sampling a complex target distribution and a simpler unimodal base distribution. This augmentation both helps improve mixing in multimodal targets and allows the normalisation constant of the target distribution to be estimated. The method is simple to implement within existing HMC code, requiring only a standard leapfrog integrator. We demonstrate experimentally that the method is competitive with annealed importance sampling and simulated tempering methods at sampling from challenging multimodal distributions and estimating their normalising constants.
  • 2017/04 Asymptotically exact inference in differentiable generative models

    Matthew M. Graham and Amos J. Storkey

    Proceedings of the 20th International Conference on Artificial Intelligence and Statistics

    Many generative models can be expressed as a differentiable function of random inputs drawn from some simple probability density. This framework includes both deep generative architectures such as Variational Autoencoders and a large class of procedurally defined simulator models. We present a method for performing efficient MCMC inference in such models when conditioning on observations of the model output. For some models this offers an asymptotically exact inference method where Approximate Bayesian Computation might otherwise be employed. We use the intuition that inference corresponds to integrating a density across the manifold corresponding to the set of inputs consistent with the observed outputs. This motivates the use of a constrained variant of Hamiltonian Monte Carlo which leverages the smooth geometry of the manifold to coherently move between inputs exactly consistent with observations. We validate the method by performing inference tasks in a diverse set of models.
  • 2016/05 Pseudo-Marginal Slice Sampling

    Iain Murray and Matthew M. Graham

    Proceedings of the 19th International Conference on Artificial Intelligence and Statistics

    Markov chain Monte Carlo (MCMC) methods asymptotically sample from complex probability distributions. The pseudo-marginal MCMC framework only requires an unbiased estimator of the unnormalized probability distribution function to construct a Markov chain. However, the resulting chains are harder to tune to a target distribution than conventional MCMC, and the types of updates available are limited. We describe a general way to clamp and update the random numbers used in a pseudo-marginal method's unbiased estimator. In this framework we can use slice sampling and other adaptive methods. We obtain more robust Markov chains, which often mix more quickly.

Workshop papers

  • 2017/08 Inference in differentiable generative models

    Matthew M. Graham and Amos J. Storkey

    ICML 2017 workshop: Implicit generative models

    Many generative models can be expressed as a differentiable function of random inputs drawn from a known probability distribution. This framework includes both learnt parametric generative models and a large class of procedurally defined simulator models. We present a method for performing efficient Markov chain Monte Carlo (MCMC) inference in such models when conditioning on observations of the model output. For some models this offers an asymptotically exact inference method where Approximate Bayesian Computation might otherwise be employed. We use the intuition that inference corresponds to integrating a density across the manifold corresponding to the set of inputs consistent with the observed outputs. This motivates the use of a constrained variant of Hamiltonian Monte Carlo which leverages the smooth geometry of the manifold to move between inputs exactly consistent with observations.
  • 2016/12 Continuously tempered Hamiltonian Monte Carlo

    Matthew M. Graham and Amos J. Storkey

    NIPS 2016 workshop: Advances in Approximate Bayesian Inference

    Hamiltonian Monte Carlo (HMC) is a powerful Markov chain Monte Carlo (MCMC) method for performing approximate inference in complex probabilistic models of continuous variables. In common with many MCMC methods, however, the standard HMC approach performs poorly in distributions with multiple isolated modes. Based on an approach proposed in the statistical physics literature, we present a method for augmenting the Hamiltonian system with an extra continuous temperature control variable which allows the dynamic to bridge between sampling a complex target distribution and a simpler uni-modal base distribution. This augmentation both helps increase mode-hopping in multi-modal targets and allows the normalisation constant of the target distribution to be estimated. The method is simple to implement within existing HMC code, requiring only a standard leapfrog integrator. It produces MCMC samples from the target distribution which can be used to directly estimate expectations without any importance re-weighting.

Theses and dissertations

  • 2018/07 Auxiliary variable Markov chain Monte Carlo methods

    Matthew M. Graham

    PhD thesis, University of Edinburgh

    Markov chain Monte Carlo (MCMC) methods are a widely applicable class of algorithms for estimating integrals in statistical inference problems. A common approach in MCMC methods is to introduce additional auxiliary variables into the Markov chain state and perform transitions in the joint space of target and auxiliary variables. In this thesis we consider novel methods for using auxiliary variables within MCMC methods to allow approximate inference in otherwise intractable models and to improve sampling performance in models exhibiting challenging properties such as multimodality. We first consider the pseudo-marginal framework. This extends the Metropolis–Hastings algorithm to cases where we only have access to an unbiased estimator of the density of the target distribution. The resulting chains can sometimes show ‘sticking’ behaviour where long series of proposed updates are rejected. Further the algorithms can be difficult to tune and it is not immediately clear how to generalise the approach to alternative transition operators. We show that if the auxiliary variables used in the density estimator are included in the chain state it is possible to use new transition operators such as those based on slice-sampling algorithms within a pseudo-marginal setting. This auxiliary pseudo-marginal approach leads to easier to tune methods and is often able to improve sampling efficiency over existing approaches. As a second contribution we consider inference in probabilistic models defined via a generative process with the probability density of the outputs of this process only implicitly defined. The approximate Bayesian computation (ABC) framework allows inference in such models when conditioning on the values of observed model variables by making the approximation that generated observed variables are ‘close’ rather than exactly equal to observed data. Although making the inference problem more tractable, the approximation error introduced in ABC methods can be difficult to quantify and standard algorithms tend to perform poorly when conditioning on high dimensional observations. This often requires further approximation by reducing the observations to lower dimensional summary statistics. We show how including all of the random variables used in generating model outputs as auxiliary variables in a Markov chain state can allow the use of more efficient and robust MCMC methods such as slice sampling and Hamiltonian Monte Carlo (HMC) within an ABC framework. In some cases this can allow inference when conditioning on the full set of observed values when standard ABC methods require reduction to lower dimensional summaries for tractability. Further we introduce a novel constrained HMC method for performing inference in a restricted class of differentiable generative models which allows conditioning the generated observed variables to be arbitrarily close to observed data while maintaining computational tractability. As a final topic we consider the use of an auxiliary temperature variable in MCMC methods to improve exploration of multimodal target densities and allow estimation of normalising constants. Existing approaches such as simulated tempering and annealed importance sampling use temperature variables which take on only a discrete set of values. The performance of these methods can be sensitive to the number and spacing of the temperature values used, and the discrete nature of the temperature variable prevents the use of gradient-based methods such as HMC to update the temperature alongside the target variables. We introduce new MCMC methods which instead use a continuous temperature variable. This both removes the need to tune the choice of discrete temperature values and allows the temperature variable to be updated jointly with the target variables within a HMC method.
  • 2013/07 Insect olfactory landmark navigation

    Matthew M. Graham

    MSc by Research dissertation, University of Edinburgh

    The natural world is full of chemical signals - organisms of all scales and taxonomic classifications transmit and receive chemical signals to guide the full gamut of life’s processes: from helping form mother-infant bonds, to identifying potential mates and even signalling their own deaths. Insects are particularly reliant on chemical cues to guide their behaviour, and understanding how insects respond to and use chemical cues in their environment is a highly active research area. In a series of recent studies Steck et al. produced evidence that foragers of the Saharan desert ant species Cataglyphis fortis are able to learn an association between an array of odour sources arranged around the entrance to their nest and the relative location of the nest entrance, and later use the information they receive from the odour sources to help them navigate to the visually inconspicuous nest entrance. This ability to use odour sources as olfactory landmarks had not been previously seen experimentally in insects, and is a remarkable behaviour given the extremely complex and highly dynamic nature of the olfactory signals received by the ants from the turbulent odour plumes the chemicals travel in from the sources. After an introductory chapter covering some relevant background theory to the work in this project, the second chapter of this dissertation will detail a field study conducted with the European desert ant species Cataglyphis velox. As the ants in the studies of Steck et al. were constrained to moving in a linear channel, and the navigation task was therefore limited to being one-dimensional, the aim of this study was to see if there was any evidence supporting the hypothesis that Cataglyphis velox ants are able to use olfactory landmarks to navigate in a more realistic open environment. The results of the study were inconclusive, due to the low sample sizes that were collected and small effect size in the study design used; however, it is proposed that the study could usefully be considered as a pilot for a full study at a later date, and an adjusted study design is proposed that might overcome many of the issues encountered in the current study. In the third and final chapter of this dissertation, a modelling study of what information is available in the olfactory signal received from a turbulent odour plume about the location of the source of that plume is presented, with this work aiming to explore the information which may be used by Cataglyphis desert ants when using olfactory landmarks to navigate. The details of the plume and olfactory sensor models used are described and the results of an analysis of the estimated mutual information between the modelled olfactory signals and the location of the odour source are presented. It is found that the locational information content of individual signal segment statistics seems to be low, though combining multiple statistics does potentially allow more useful reductions in uncertainty.
  • 2012/06 Measuring tissue stiffness with ultrasound

    Matthew M. Graham

    MEng project report, University of Cambridge

Talks