Preview of a Special IMS Lecture
Ashwin Pananjady is a final year PhD student in the Department of Electrical Engineering and Computer Sciences at the University of California, Berkeley, advised by Martin Wainwright and Thomas Courtade. His research interests are broadly in statistics, optimization and information theory. Specific topics of interest include ranking and permutation estimation, high-dimensional and non-parametric statistics, high-dimensional probability, and reinforcement learning. His honors include the Governor’s Gold Medal from the Indian Institute of Technology Madras in 2014, and an Outstanding Graduate Student Instructor award from UC Berkeley in 2018. Ashwin will give this talk at the Bernoulli–IMS World Congress in Seoul in August.
Statistics meets computation in non-parametric, permutation-based models
In a variety of applications including ranking from pairwise comparisons, crowd labeling, and assortment optimization, matrix data is represented using transformations of a small number of free parameters. For instance, consider modeling the n × n matrix of pairwise comparison probabilities between n chess players: entry (i, j) of the matrix contains the probability with which player i wins a game against player j. The ELO rating system employed in chess roughly* posits that this probability is given by the Gaussian CDF evaluated at the difference between the ELO ratings of player i and player j. In effect, it posits that the entries of the comparison matrix are completely determined by scalar parameters assigned to each of the n players, i.e., their ELO ratings.
Such a parametric model is simple, interpretable, and amenable to computationally efficient maximum likelihood estimation of the comparison matrix from win/loss data. On the other hand, for many modern applications of ranking from pairwise comparisons where we are interested in modeling human preferences, parametric models suffer from significant mis-specification. Consequently, the classical sociology literature [1] proposed a class of non-parametric, “stochastic transitivity” models for better representing such data, which replaced parametric ratings with shape constraints and permutations.
This gave rise to the class of bivariate isotonic matrices with unknown permutations: matrices in which the rows and columns can be suitably permuted such that the entries of the resulting matrix increase along each row as well as along each column. The comparison probability matrix generated by the ELO rating system clearly takes this form, but the non-parametric, “permutation-based” model class is significantly larger and therefore more robust to mis-specification than parametric models. In addition, and perhaps surprisingly, the minimax rate of matrix estimation over the class of permutation-based models is essentially the same as that over the smaller parametric class; this rate is achieved by the maximum likelihood estimator. Thus, these permutation based models occupy a nice “sweet-spot” on the bias–variance tradeoff.
However, the maximum likelihood estimator over the class of permutation-based models is intractable to compute, and existing computationally efficient estimators produce sub-optimal rates. Consequently, a “statistical–computational” gap was conjectured in a series of papers on this topic.
In this talk, I will introduce an efficient algorithm that uniformly outperforms existing, tractable procedures. It obtains minimax-optimal estimation rates in a certain regime of the problem, and in other regimes, it narrows the statistical–computational gap [2]. Time permitting, I will also touch upon how these estimators can be modified in the presence of structured missing data [3].
—
* The actual system is slightly more complicated, and accounts for drawn games.
References
[1] P. C. Fishburn. Binary choice probabilities: on the varieties of stochastic transitivity. Journal of Mathematical Psychology, 10(4): 327–352, 1973.
[2] C. Mao, A. Pananjady, and M.J. Wainwright. Towards optimal estimation of bivariate isotonic matrices with unknown permutations. Annals of Statistics, to appear, 2019+.
[3] A. Pananjady, C. Mao, V. Muthukumar, M.J. Wainwright, and T.A. Courtade. Worst-case versus average-case design for estimation from partial pairwise comparisons. Annals of Statistics, to appear, 2019+.