Dylan Small received his PhD in Statistics in 2002 from Stanford University with Tze Leung Lai as his thesis advisor. Dylan is the Universal Furniture Professor of Statistics and Data Science in the Wharton School of the University of Pennsylvania and is currently the chair of the Department of Statistics and Data Science. His research focuses on causal inference and applications of statistics to public health and public policy. He was the founding editor of the journal Observational Studies. Dylan has advised 28 PhD students on their dissertations, and has mentored several undergraduates and postdoctoral fellows on research. 

Dylan Small’s Medallion Lecture will be given at JSM 2022, August 6–11, 2022, in Washington, DC.


Protocols for Observational Studies: Methods and Open Problems

For learning about the causal effect of a treatment, a randomized controlled trial (RCT) is considered the gold standard. However, randomizing treatment is sometimes unethical or infeasible, and an observational study may be conducted instead. Three sources of the strength of a properly designed RCT [1] are: 

(1) randomization – by randomly assigning people to treatment or control, an RCT creates a fair comparison between the two groups; 

(2) identical processes – a properly designed RCT takes pains to apply all other processes, such as adjuvant therapies, follow-up and outcome assessment, in the same way to the treatment and control groups; and 

(3) protocol – a properly designed RCT is driven by a strong protocol with pre-specified hypotheses about pre-specified outcomes and a pre-specified analysis. 

The first source of strength, randomization, is unique to an RCT. But the second and third sources of strength have nothing to do with randomization and can be made part of a well-designed observational study. This talk is about making the third source, a protocol, a part of an observational study. We will illustrate the value of protocols for observational studies in three applications – the effect of playing high school football on later life mental functioning, the effect of police seizing a gun when arresting a domestic violence suspect on future domestic violence and the effect of mountaintop mining on health – and discuss methodologies for observational study protocols and open problems. 

Investigators may be concerned that writing down too strict a protocol will limit their ability to learn from the data through exploratory data analysis and to make study design choices that maximize power. The power of a study depends on features of the study design, including how much a test statistic emphasizes tail observations, what to do about outliers, how different subgroups are incorporated into the analysis and how different outcomes are incorporated into the analysis. The best choices for these features often depend on aspects of the population about which one is uncertain before obtaining data. Ideally one would like to adapt the study design to the data, but if there is no protocol that pre-specifies the analysis, the researcher’s biases (conscious or subconscious) can influence the results. As the psychologist Fred Emery said, “Instead of constantly adapting to change, why not change to be adaptive?” An adaptive protocol seeks to allow for adapting to the data while protecting against researcher bias. We will discuss three strategies for designing adaptive protocols: 

(i) specifying up front the multiple test statistics, subgroups, outcomes etc. that will be considered and using a multiple testing procedure that controls an appropriate error rate (e.g., family-wise Type I error rate or false discovery rate); 

(ii) splitting the sample, using part of the sample to make design choices and the other part of the sample for analysis using these design choices; and 

(iii) looking at secondary aspects of the data to make design choices. 
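
As a rough illustration of strategies (i) and (ii), the sketch below shows one way such a protocol could be written down in code. It is an assumption-laden example, not the method presented in the lecture: Holm's step-down procedure stands in for "a multiple testing procedure" that controls the family-wise Type I error rate, and the function names `holm` and `split_sample`, the 20% planning fraction and the example p-values are all hypothetical.

```python
import random


def holm(p_values, alpha=0.05):
    """Holm step-down procedure (illustrative choice for strategy (i)).

    Takes p-values for a family of hypotheses fixed in advance and
    returns, in input order, whether each hypothesis is rejected while
    controlling the family-wise Type I error rate at level alpha.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # Compare the (rank+1)-th smallest p-value with alpha / (m - rank).
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # step-down: stop at the first non-rejection
    return reject


def split_sample(units, planning_fraction=0.2, seed=0):
    """Random split into planning and analysis samples (strategy (ii)).

    The planning sample may be explored freely to make design choices;
    the analysis sample is kept untouched until those choices are fixed.
    The seed would be recorded in the protocol so the split can be
    reproduced.
    """
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    k = int(len(shuffled) * planning_fraction)
    return shuffled[:k], shuffled[k:]


# Hypothetical p-values for four pre-specified outcomes.
decisions = holm([0.001, 0.04, 0.03, 0.20], alpha=0.05)

# Hypothetical study with 100 units (for example, matched pairs).
planning, analysis = split_sample(range(100), planning_fraction=0.2, seed=1)
```

With these example p-values only the first hypothesis is rejected, because the second-smallest p-value (0.03) exceeds 0.05/3. The split assigns 20 units to planning and 80 to analysis; how large the planning sample should be is itself a design choice that the sketch does not address.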

While many of the considerations for designing a protocol for an RCT and an observational study are the same, there are differences. We will discuss some of these differences, such as how power comparisons between various multiple testing procedures differ between RCTs and observational studies.


[1] Moses, L. E. (1995). Measuring effects without randomized trials? Options, problems, challenges. Medical Care, AS8–AS14.