Chunming Zhang is a Professor in the Department of Statistics at the University of Wisconsin-Madison. She earned her Ph.D. in Statistics from the University of North Carolina at Chapel Hill in 2000. Her research interests encompass statistical methods in computational neuroscience, biostatistics, and financial econometrics, along with the analysis of neuroimaging, spatial, and temporal data. Additionally, her work delves into multiple hypothesis testing, large-scale simultaneous inference, dimension reduction, high-dimensional inference, non-parametric and semi-parametric modeling and inference, functional and longitudinal data analysis, and robust statistics. She is an elected Fellow of the Institute of Mathematical Statistics and the American Statistical Association. Dr. Zhang serves on the editorial boards of Annals of Statistics and the Journal of the American Statistical Association. In 2016, she delivered the Keynote Address at the Third Center for Information and Neural Networks Conference on Neural Mechanisms of Decision Making in Japan. This Medallion Lecture will be presented at the 11th World Congress in Probability and Statistics in Bochum, Germany, August 12-16, 2024.
Learning Network-Structured Dependence from Multidimensional Temporal Point Processes
In this lecture, I will present recent work on developing interpretable generative models tailored for temporally non-stationary and spatially dependent sequences of ‘event data’ across various domains. Real-world examples include (a) recordings of spike firing from multiple neurons in the brain, and (b) incidents of emerging infectious diseases such as COVID-19, with cases reported across different countries and regions over time. The primary objective of the study is to develop short-term or long-term predictive models that capture dynamic patterns of event occurrence likelihood in real time and uncover ‘network-structured’ dependencies across geographic locations, in order to identify potential security risks, quantify uncertainties, and guide regional resource allocation.
More broadly, a probabilistic approach for capturing event occurrence likelihood over time and space can leverage the ‘multi-dimensional temporal point process’. This process describes random occurrences of a specific type of event (e.g., contagious disease incidents) over time, represented as sequences of time points recorded at multiple nodes (locations). While the linear self-exciting Hawkes process remains prevalent for modeling the ‘conditional intensity function,’ its reliance on a non-negative triggering function restricts it to capturing only excitatory effects among nodes. Additionally, current methodologies face challenges stemming from computational constraints and a lack of probabilistic insights. These limitations include the inability to integrate structural constraints or network features, such as the crucial acyclicity constraint for recovering the acyclic causal structure, as well as the omission of external covariates.
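To make the limitation of the linear Hawkes process concrete, here is a minimal sketch of its conditional intensity at a set of nodes. It assumes an exponential triggering kernel, a common textbook choice that the abstract does not specify. The function name `hawkes_intensity`, the parameters `mu`, `alpha`, and `beta`, and the toy event times are all illustrative assumptions and are not part of the model proposed in this lecture.

```python
import math

def hawkes_intensity(t, events, mu, alpha, beta):
    """Conditional intensity of a linear multivariate Hawkes process at time t.

    events[j]   : past event times recorded at node j
    mu[i]       : baseline rate of node i (assumed constant here)
    alpha[i][j] : non-negative weight of node j's influence on node i
    beta        : decay rate of the assumed exponential kernel
    """
    n = len(mu)
    lam = []
    for i in range(n):
        excitation = 0.0
        for j in range(n):
            for s in events[j]:
                if s < t:  # only events strictly before t contribute
                    excitation += alpha[i][j] * beta * math.exp(-beta * (t - s))
        lam.append(mu[i] + excitation)
    return lam

# Toy example with two nodes (made-up event times and parameters).
events = [[0.5, 1.2], [0.9]]
mu = [0.2, 0.1]
alpha = [[0.3, 0.0], [0.5, 0.1]]
lam = hawkes_intensity(2.0, events, mu, alpha, beta=1.0)
```

Because every entry of `alpha` is non-negative, each past event can only raise the intensity above the baseline `mu[i]`. An inhibitory effect, in which an event at one node lowers the event rate at another, cannot be expressed in this form.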
We develop new continuous-time stochastic models of conditional intensity functions, dependent on the event history of parent nodes, to uncover the network structure within an array of non-stationary multivariate counting processes. The stochastic mechanism is crucial for statistical inference of graph parameters relevant to structure recovery but does not satisfy the key assumptions of commonly used processes like the Poisson process, Cox process, Hawkes process, queuing model, and piecewise deterministic Markov process. We introduce a new marked point process for intensity discontinuities, derive compact representations of their conditional distributions, and demonstrate the cyclicity property of the multivariate counting process, driven by recurrence time points. These new theoretical properties enable us to establish statistical consistency and convergence properties of the proposed estimators for graph parameters under mild regularity conditions. Simulation evaluations demonstrate computational simplicity and increased estimation accuracy compared to existing methods. Real spike train recordings from multiple neurons are analyzed to infer connectivity in neuronal networks.