Institute of Mathematical Statistics | IMS members’ work on COVID-19

May 17, 2020

You may have seen our call-out to IMS members, in the e-Bulletin email or on our Facebook or Twitter, asking whether you have been working on COVID-19. Below, in alphabetical order, you can read about the members who wrote to share what their research has focused on. If you’d like a mention in the next issue, please contact ims@imstat.org (ideally, send a paragraph about your work and a link to the paper or another location where interested readers can find out more). Our next deadline is July 1, 2020.

Xiaohui Chen, Associate Professor of Statistics, University of Illinois at Urbana-Champaign

Xiaohui writes, “I recently published an article on the effect of government policy interventions on COVID-19 transmission. This paper introduces a dynamic panel SIR model to investigate the impact of non-pharmaceutical interventions (NPIs) on COVID-19 transmission dynamics with panel data from nine countries across the globe. By constructing scenarios with different combinations of NPIs, our empirical findings suggest that countries may avoid the lockdown policy by imposing school closure, mask wearing and centralized quarantine to reach similar outcomes on controlling COVID-19 infection. Our results also suggest that, as of April 4th, 2020, certain countries such as the U.S. and Singapore may require additional NPIs in order to control disease transmission more effectively, while other countries may cautiously consider gradually lifting some NPIs to mitigate the costs to the overall economy.” See https://arxiv.org/abs/2004.04529 and https://cepr.org/sites/default/files/news/CovidEconomics7.pdf .
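For readers unfamiliar with the SIR framework mentioned above, here is a minimal sketch of a basic discrete-time SIR simulation. It is not the dynamic panel model from the paper: the function names `simulate_sir` and `peak_infected`, the parameter values, and the use of a halved transmission rate as a stand-in for an intervention are all illustrative assumptions.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days):
    """Basic SIR model with daily steps, on population fractions.

    beta  : transmission rate per day (assumed value, not estimated from data)
    gamma : recovery rate per day (assumed value)
    """
    s, i, r = s0, i0, r0
    path = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i   # susceptible -> infected
        new_recoveries = gamma * i      # infected -> removed
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        path.append((s, i, r))
    return path


def peak_infected(path):
    """Largest infected fraction reached along a simulated path."""
    return max(i for _, i, _ in path)


# Hypothetical scenarios: the second halves beta to mimic an intervention.
baseline = simulate_sir(beta=0.4, gamma=0.1, s0=0.999, i0=0.001, r0=0.0, days=300)
with_npi = simulate_sir(beta=0.2, gamma=0.1, s0=0.999, i0=0.001, r0=0.0, days=300)

print("peak infected fraction, baseline:", round(peak_infected(baseline), 3))
print("peak infected fraction, with NPI:", round(peak_infected(with_npi), 3))
```

In this toy setting, lowering the transmission rate lowers the peak infected fraction. The paper's panel approach goes further by estimating how specific NPIs change transmission across countries.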

Ezra Gayawan, Head of the Biostatistics and Spatial Statistics Research Group at the Federal University of Technology in Akure, Nigeria

Ezra says, “I have led some other colleagues to work on the spatio-temporal dynamics of COVID-19 spread in Africa in the first 62 days of its arrival on the continent. We used a hurdle Poisson model within the framework of distributional regression to examine the frequencies and non-occurrence of the virus in Africa. We also look at the relationship with available physicians and the number of bed spaces in each country on the continent.”

The preprint version of the article can be found at: https://www.medrxiv.org/content/10.1101/2020.04.21.20074435v1
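As a rough illustration of the hurdle Poisson idea mentioned above, the sketch below computes the probability mass function of a plain hurdle Poisson distribution: zero counts get their own probability, and positive counts follow a zero-truncated Poisson. This is a textbook form only, not the spatio-temporal distributional regression fitted in the preprint. The function name `hurdle_poisson_pmf` and the parameter values are assumptions chosen for illustration.

```python
import math


def hurdle_poisson_pmf(k, p_zero, lam):
    """P(Y = k) under a hurdle Poisson distribution.

    p_zero : probability of observing no cases at all (the "hurdle")
    lam    : rate of the Poisson that is truncated at zero for positive counts
    """
    if k < 0:
        return 0.0
    if k == 0:
        return p_zero
    # Poisson probability of k, computed on the log scale for stability.
    log_pois = -lam + k * math.log(lam) - math.lgamma(k + 1)
    # Rescale so that the positive counts share total mass (1 - p_zero).
    return (1.0 - p_zero) * math.exp(log_pois) / (1.0 - math.exp(-lam))


# Hypothetical values: 30% chance of no reported cases, rate 3 otherwise.
probs = [hurdle_poisson_pmf(k, p_zero=0.3, lam=3.0) for k in range(101)]
print("P(Y = 0):", probs[0])
print("total probability over 0..100:", round(sum(probs), 6))
```

Separating the zero probability from the positive counts lets a model describe whether the virus has appeared in a region at all, and how many cases there are once it has, with different sets of covariates.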

Nicholas P. Jewell, London School of Hygiene and Tropical Medicine and the University of California, Berkeley

Nick Jewell and Britta Jewell (of Imperial College London) wrote an article in the New York Times, “The Huge Cost of Waiting to Contain the Pandemic,” published on April 14, 2020: https://www.nytimes.com/2020/04/14/opinion/covid-social-distancing.html

Xihong Lin, Professor of Statistics at Harvard University and of Biostatistics at Harvard SPH

In a March 20, 2020 YouTube presentation, “Learning from 26,000 cases of COVID-19 in Wuhan,” given as part of the Broad Institute’s Infectious Disease & Microbiome Program Meeting, Xihong Lin presents her recent research analyzing the lab-confirmed COVID-19 cases in Wuhan (up to February 18). The findings could provide timely information for developing strategies to control the outbreak in the US and other countries.

Video: https://www.youtube.com/watch?v=aQ9KIO1eXTA

Bani K. Mallick, Department of Statistics, Texas A&M University

Bani and his co-authors have written a research paper on COVID-19: a complete statistical model-based approach to predict the COVID infection curve with uncertainties, and to predict the time of flattening of the curve. He says, “In this paper, we propose a Bayesian hierarchical model that integrates global data to estimate COVID-19 infection trajectories. Due to information borrowing across multiple countries, the proposed growth curve models provide a powerful predictive tool endowed with uncertainty quantification. They outperform the existing individual country-based models. Additionally, we use countrywide covariates to adjust infection trajectories. A joint variable selection technique has been integrated into the proposed modeling scheme, which aimed to identify the possible country-level risk factors for severe disease due to COVID-19.”

Preprint: https://www.medrxiv.org/content/10.1101/2020.04.23.20077065v1.full.pdf

David S. Matteson, Professor of Statistics and Data Science, Cornell University

David and his co-authors have written a paper, “Social Distancing Has Merely Stabilized COVID-19 in the US”. David says, “There is a dramatic (albeit delayed) change in the COVID-19 infection data associated with social distancing, but not as dramatic as everyone had hoped for.” Read the preprint at https://www.medrxiv.org/content/10.1101/2020.04.27.20081836v1

Karl Pazdernik, Senior Data Scientist in Applied Statistics & Computational Modeling at Pacific Northwest National Laboratory, and Research Assistant Professor at North Carolina State University

Dr. Lauren Charles (PI), IMS member Dr. Karl Pazdernik, and other data scientists and engineers at Pacific Northwest National Laboratory (A. Tuor, S. Dixon, A. Barker, D. Farber, E. Saxon, D. Stevens) continue to push the boundaries of research with their text analytics pipeline for Biofeeds, a tool used by the Department of Homeland Security to identify and track COVID-19. Biofeeds enables machine-assisted biological and chemical surveillance of any potential threat to humans, animals, or the environment. The tool automatically collects open source text data from more than 800,000 sources, published in over 90 languages around the globe. Powered by natural language processing, named entity recognition, machine learning, and human-in-the-loop training, Biofeeds can filter out irrelevant articles, provide a relevance ranking, machine-tag location, disease, chemical, control measures, impacts, and red flags, identify active events, provide access to similar articles, and calculate an overall significance for each article. A NiFi processing pipeline contains a variety of natural language processing methods, including, but not limited to, time-weighted penalized logistic regression models, recursive regex, binary bag-of-words models, and recurrent neural network models. The analytic development continues as PNNL extends its methods to transformer deep learning classifiers and expands its capabilities to identify anomalous new events and abnormal characteristics of ongoing events. See more about Biofeeds at https://www.nextgov.com/emerging-tech/2020/04/how-homeland-securitys-biosurveillance-arm-uses-tech-track-pandemic/164585/

Śaunak Sen, Professor and Chief of Biostatistics, The University of Tennessee Health Science Center, Memphis

Śaunak says, “We have developed penalized regression models for incorporating longitudinal social distancing measure information into SEIR models. Our work was sparked by a request from the City of Memphis Fire Department emergency response group. They wanted a subcommittee to evaluate the fast-evolving situation and make recommendations to city and county decision-makers. The group we are working with includes a faculty member from my university (UTHSC), a faculty member from the University of Memphis, a freelance epidemiologist and an official from the Fire Department. None of us have prior experience working with infectious diseases and this work needed to be done on an extremely tight deadline. I am proud that we were able to put our collective experience in the service of our community.” This GitHub repository (with code in Julia) has an outline of their work: https://github.com/senresearch/DiseaseOutbreak.jl

Nozer D. Singpurwalla, Department of Statistics, The George Washington University

Nozer wrote a paper, “The Di-negentropy of Diagnostic and Detection Tests,” that was published in 2018 in a publication of Kyoto University’s Research Institute of Mathematical Sciences at http://hdl.handle.net/2433/242115. A revised and more complete work on this topic is currently in progress. He says, “Whereas the matter of testing large numbers of the population for COVID-19 has become one of predominant discussion, the question of test reliability has been appearing in the major national newspapers on an almost daily basis. Of concern are false negatives and their cascading consequences of poor preparation. There are reports that test inaccuracy of certain devices can be as high as 30%. The matter is certainly one of probability and statistics. Comparing the efficacy of two diagnostic tests when their ROCs (Receiver Operating Characteristic curves) cross has been an open statistical problem for quite some time. Based on a previous effort supported by the US NSF on Threat Detection, a procedure based on the information theoretic notion of ‘di-negentropy’ has been developed, and its effectiveness has been empirically demonstrated.”

Anuj Srivastava, Distinguished Research Professor, Florida State University

Anuj’s paper, “Agent-Level Pandemic Simulation (ALPS) for Analyzing Effects of Lockdown Measures,” models the spread of the pandemic using an agent-level model. The main goal of the ALPS simulation is to analyze the effects of preventive measures (the imposition and lifting of lockdown norms) on the rates of infections, fatalities and recoveries.

https://arxiv.org/abs/2004.12250

Harvey Stein, Head of Quantitative Risk Analytics at Bloomberg, and Adjunct Professor of Mathematics at Columbia University

Harvey has been working on estimating COVID-19 infection rates in New York City. He details this in his blog post on the subject: https://hjstein.blogspot.com/2020/04/covid-19-nyc-stats-not-what-they-seem.html. The code for the analysis (updated with the latest data) is available in Harvey’s fork of the NYC Coronavirus data GitHub repository: https://github.com/hjstein/coronavirus-data

Stanislav Volkov, Professor of Mathematical Statistics and Deputy Head of Division, Centre for Mathematical Sciences, Lund University

Stanislav (Stas) Volkov experimented with modeling the total death count from COVID-19 in a few countries, not by directly implementing an SIR or SEIR model, but by searching for an ad hoc model of how total mortality develops over time. He says, “It seems to perform fairly well for a number of countries. It may be used to predict when the peak of the disease is over, and perhaps how much medical capacity one would need in the future.”

You can view the paper at https://imstat.org/wp-content/uploads/2020/05/Volkov_COVID_paper.pdf

Lily Wang, Associate Professor in Statistics, Iowa State University

Lily and her co-authors have developed a novel spatiotemporal epidemic model (STEM; Wang et al., 2020) for infection and death counts to study the spatial-temporal pattern in the spread of COVID-19 at the county level. The proposed methodology can be used to dissect the spatial structure and dynamics of the spread, as well as to assess how the outbreak may unfold through time and space in the future. Based on their research findings, a dashboard (https://covid19.stat.iastate.edu/) was established on March 27, 2020, with multiple R Shiny apps embedded to provide real-time 7-day forecasts of COVID-19 infection and death counts at the county level with risk analysis, as well as a long-term projection for the next four months. Details are in the paper at: http://arxiv.org/abs/2004.14103

Qingyuan Zhao, University Lecturer of Statistics at University of Cambridge

Qingyuan says, “I would like to share some work we have done on the selection bias of some early statistical analyses of COVID-19. We developed a generative model we call BETS for four key epidemiological events—Beginning of exposure, End of exposure, time of Transmission, and time of Symptom onset (BETS)—and derived explicit formulas to correct for the sample selection. We gave a detailed illustration of why some early and highly influential analyses of the COVID-19 pandemic were severely biased. All our analyses, regardless of which subsample and model were being used, point to an epidemic doubling time of 2 to 2.5 days during the early outbreak in Wuhan. A Bayesian nonparametric analysis further suggests that about 5% of the symptomatic cases may not develop symptoms within 14 days of infection and that men may be much more likely than women to develop symptoms within 2 days of infection.” https://arxiv.org/abs/2004.07743
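To put the reported doubling time in context, the short sketch below converts a doubling time into a daily exponential growth rate and a fold increase over a given period. It assumes pure exponential growth, which is a simplification made here rather than the BETS model itself. The function names `growth_rate` and `fold_increase` are illustrative.

```python
import math


def growth_rate(doubling_time_days):
    """Daily exponential growth rate r, from doubling time T = ln(2) / r."""
    return math.log(2) / doubling_time_days


def fold_increase(doubling_time_days, days):
    """Factor by which cases multiply over `days` of unchecked growth."""
    return 2 ** (days / doubling_time_days)


# The two ends of the 2 to 2.5 day range quoted above.
for t_double in (2.0, 2.5):
    r = growth_rate(t_double)
    weekly = fold_increase(t_double, 7)
    print(f"doubling time {t_double} days: r = {r:.3f} per day, "
          f"{weekly:.1f}-fold increase in one week")
```

Under this simplification, even the small gap between a 2-day and a 2.5-day doubling time compounds into a noticeably different case count after a week.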

Bin Yu, University of California, Berkeley

Bin Yu and her team at Berkeley were featured in an April 2, 2020 article in Berkeley Engineering, “Getting the right equipment to the right people”: https://engineering.berkeley.edu/news/2020/04/getting-the-right-equipment-to-the-right-people/