Identifying the latent space geometry of network models through analysis of curvature
Description
Modeling statistical and economic networks is fundamentally challenging because of (often high-order) dependence between connections. A common approach assigns each person in the graph to a position on a low-dimensional manifold. Distance between individuals in this (latent) space is inversely related to the likelihood of forming a connection. The choice of the latent geometry (the manifold class, dimension, and curvature) has consequential impacts on the substantive conclusions of the model. More curvature in the manifold, for example, encourages more and tighter communities. Currently, however, the choice of the latent geometry is an a priori modeling assumption, and there is limited guidance about how to make this choice in a data-driven way. In this work, we present a method to consistently estimate the manifold type, dimension, and curvature from an empirically relevant class of latent spaces: simply connected, complete Riemannian manifolds. Our core insight is to represent the graph as a noisy distance matrix based on the ties between cliques. Leveraging results from statistical geometry, we develop hypothesis tests to determine whether the observed distances could plausibly be embedded isometrically in each of the candidate geometries. We explore the accuracy of our approach with simulations and then apply it to datasets from economics, sociology, and neuroscience. This is joint work with Shane Lubold (UW) and Arun Chandrasekhar (Stanford).
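To make the idea of isometric embeddability concrete, here is a minimal, noise-free sketch in Python. It is not the procedure presented in the talk: it applies two classical eigenvalue criteria to an exact distance matrix, whereas the method described above works with noisy distances estimated from cliques and uses formal hypothesis tests. The function names (`min_eig_euclidean`, `min_eig_spherical`) and the example points are hypothetical, chosen only for illustration, and the hyperbolic case is omitted.

```python
import numpy as np


def min_eig_euclidean(D):
    """Smallest eigenvalue of the double-centered matrix -1/2 * J D^2 J.

    A distance matrix D embeds isometrically in some Euclidean space
    exactly when this matrix is positive semidefinite.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    gram = -0.5 * J @ (D ** 2) @ J
    return float(np.linalg.eigvalsh(gram).min())


def min_eig_spherical(D, kappa=1.0):
    """Smallest eigenvalue of cos(sqrt(kappa) * D).

    For geodesic distances on a sphere of curvature kappa, this matrix
    is the Gram matrix of the rescaled points, so it must be positive
    semidefinite if D is embeddable on that sphere.
    """
    return float(np.linalg.eigvalsh(np.cos(np.sqrt(kappa) * D)).min())


# Flat example: pairwise Euclidean distances between random planar points.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
D_flat = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

# Curved example: four equally spaced points on a great circle of the
# unit sphere, with geodesic (arc-length) distances between them.
U = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [-1.0, 0.0, 0.0],
              [0.0, -1.0, 0.0]])
D_sphere = np.arccos(np.clip(U @ U.T, -1.0, 1.0))

print("flat distances, Euclidean criterion:  ", min_eig_euclidean(D_flat))
print("sphere distances, Euclidean criterion:", min_eig_euclidean(D_sphere))
print("sphere distances, spherical criterion:", min_eig_spherical(D_sphere))
```

The flat distances pass the Euclidean criterion up to floating-point error, while the great-circle distances produce a clearly negative eigenvalue under the Euclidean criterion and pass the spherical one. With noisy, estimated distances, a threshold on these eigenvalues would have to account for sampling variability, which is the role of the hypothesis tests described in the abstract.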
About Tyler H. McCormick
Professor McCormick is an Assistant Professor of Statistics and Sociology at the University of Washington, where he is also a core faculty member in the Center for Statistics and the Social Sciences and a Senior Data Science Fellow of the eScience Institute. His research develops statistical models that leverage social network structure to understand the nuances of human behaviour. He also does methodological work in Bayesian statistics and is particularly interested in models that provide interpretable representations of data with complex dependence structure.