Chan Park is a fifth-year PhD candidate at the University of Wisconsin–Madison, advised by Dr. Hyunseung Kang. Before joining UW–Madison, he received a BS in Statistics in 2015 from Seoul National University, Korea. His research interests include developing flexible, nonparametric methods to infer causal effects in dependent and/or clustered data and showing the optimality of these methods.

Chan Park was one of three winners of the Lawrence D. Brown PhD Student Awards, and will be presenting this lecture in a special session at the 2022 IMS Annual Meeting in London, UK, June 27–30, 2022. 

See https://www.imsannualmeeting-london2022.com/special-sessions for more information about the award session.

 

Assumption-Lean Analysis of Cluster Randomized Trials in Infectious Diseases for Intent-to-Treat Effects and Network Effects

In infectious diseases, cluster randomized trials (CRTs) are a popular experimental design to study the effect of interventions, where an entire cluster of individuals, usually a household or village, is randomized to treatment or control. When analyzing data from CRTs in infectious disease settings, investigators primarily use parametric methods, usually a mixed-effect model, to adjust for pre-treatment covariates and intra-cluster correlation, and focus on the overall intent-to-treat (ITT) effect, i.e. the population average effect of the cluster-level intervention on the outcome. While these methods are simple, their results may be misleading if the parametric models are mis-specified.

Also, individuals may not comply with the cluster-level intervention, potentially inducing meaningful spillover effects. For example, in CRTs of vaccine studies, some individuals may not actually get vaccinated for various reasons (e.g. immunocompromised status, severe side effects). However, their vaccinated peers may still protect these unvaccinated individuals through herd immunity. In causal inference, this protection is a type of spillover effect.

The main theme of our work is to propose “assumption-lean” methods to analyze these two types of effects: the ITT effects and the network effects induced by noncompliance.

To study the ITT effects in an assumption-lean manner, we propose a modest extension of a nonparametric, “regression-esque” method that (i) is invariant to affine transformations of the outcome and (ii) has desirable asymptotic properties even when both the cluster size and the number of clusters are growing.

For the network effects induced by noncompliance, Kang and Keele (2018) showed that point-identification of these effects is generally infeasible in a CRT without strong assumptions. Instead, we follow an assumption-lean approach and propose a new method to obtain sharp bounds on these effects. At a high level, our new method combines linear programming (LP) and risk minimization from supervised machine learning (ML), where a classifier trained by risk minimization shrinks the LP bounds for the network effects. Compared to existing approaches to bounds, our bounds (i) use flexible ML classifiers to potentially make the bounds narrower and (ii) always cover the desired effect irrespective of the classifiers’ quality, with a good classifier leading to shorter bounds. Practically, this means that investigators can potentially get shorter bounds not only by getting good data from a CRT, but also by choosing better classification algorithms from ML.
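
As a rough intuition for how a classifier can shorten bounds without ever invalidating them, the toy sketch below may help. It is not the method presented in the talk: it swaps in the textbook Fréchet-Hoeffding bounds, which are the closed-form solution of a very small LP (minimize or maximize one cell of a 2x2 probability table subject to its two margins). The function names, the latent group G, and all numbers are hypothetical and chosen only for illustration.

```python
# Toy illustration only (NOT the estimator from the talk).
# Setting, assumed for illustration: Y is a binary outcome, G is a latent
# binary group (e.g. "would not use the intervention"). We know only the
# margins P(G=1) = p and P(Y=1) = q and want bounds on P(Y=1 | G=1).

def frechet_bounds(p, q):
    """Bounds on the joint probability P(Y=1, G=1) given the two margins.

    This is the closed-form solution of the LP that minimizes/maximizes
    one cell of a 2x2 table subject to its row and column sums.
    """
    return max(0.0, p + q - 1.0), min(p, q)


def subgroup_mean_bounds(strata):
    """Bounds on P(Y=1 | G=1) from per-stratum margins.

    strata: list of (weight, p_s, q_s), where the weights sum to one and
    p_s, q_s are the margins of G and Y inside stratum s. A stratum here
    stands in for a predicted class from some (hypothetical) classifier.
    """
    p_total = sum(w * p for w, p, _ in strata)
    lower = sum(w * frechet_bounds(p, q)[0] for w, p, q in strata)
    upper = sum(w * frechet_bounds(p, q)[1] for w, p, q in strata)
    return lower / p_total, upper / p_total


# Made-up margins. Both descriptions imply the same overall P(G=1) = 0.5
# and P(Y=1) = 0.3; the second one splits the sample into two strata.
pooled = [(1.0, 0.5, 0.3)]
split = [(0.5, 0.8, 0.1), (0.5, 0.2, 0.5)]

print("no classifier:  ", subgroup_mean_bounds(pooled))
print("with classifier:", subgroup_mean_bounds(split))
```

In this toy setting, splitting by any stratum variable can only keep the interval the same or shrink it, because the lower bound is a sum of convex terms and the upper bound a sum of concave terms. An uninformative split gives back the pooled interval, while an informative one gives a shorter interval that still contains the target.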

We conclude by reanalyzing a CRT studying the effect of face masks and hand sanitizers on transmission of 2008 inter-pandemic influenza in Hong Kong. We find that the effect of giving free face masks and hand sanitizers was heterogeneous in that the intervention was more effective among individuals living in dense households. Moreover, the estimated bounds on the reduction in infection rates that individuals who did not use face masks and hand sanitizers received from their peers were [5.4%p, 17.3%p], indicating that the protective effect was present in the Hong Kong study.

This talk is based on joint work with Dr. Hyunseung Kang.