Xiao-Li Meng writes:
Boston’s reputation as a hub of universities was elevated recently by the inaugural HUBweek (Hospital, University and Business), which kicked off with a forum led by Michael Sandel, the “rockstar moralist.” Amid an array of thought-provoking questions, Sandel asked whether the audience would feel comfortable letting a smart machine, i.e., “a very, very good app” trained on a large corpus of student work, grade essays. To panelist Yo-Yo Ma, this idea is as uncomfortable as relying on “an app for parenting.” Using music teaching as an analogy, Ma explained, “The path from one note to the next is going to be different for every single human being on this planet,” because the way the second note joins the first depends on the player’s physical mechanism, neuromuscular structure, etc. Displaying his trademark passion (but without the cello this time), Ma continued: “If you have an app, I don’t care how big the data is and how great your algorithms are, it’s finite. The idea of the human spirit actually getting to something that is beyond the finite is a part of every human being, and we want to look for that in every student…” (The original remark is at about 1:30:00 in https://www.youtube.com/watch?v=urcSDiQwaNQ; and check out Conan O’Brien’s hilarious answers at 1:24:30!)
Ma’s remark touched upon two fundamental questions of possibilities, or perhaps impossibilities. The obvious one is whether a machine can ever make judgments, or more generally think, like a human. Evidently Ma’s answer would be a “no,” because human judgments and emotions are too rich to be replicated fully by any “finite” machine. Indeed, machines are generally perceived as being mechanical, useful for repetitive tasks but not for adaptive ones. The term “machine learning” (ML) therefore is unfortunate, because much of its promise builds upon the computer’s ability to process and abstract information collected from a vast number of individuals and sources. Thus a smart machine like the grading app is meant to serve as a “mass brain” or “meta brain.” In that sense, it would be more apt to take ML as abbreviating “Massive Learning” or “Meta Learning.”
This brings up the second, subtler question: Can we fully learn about an individual from studying many others? Personalized treatment sounds heavenly, but where on earth can anyone find enough (any?) guinea pigs that are exactly like me to make the promise evidence-based? Similar questions about “transition to similar” have been pondered by philosophers from Galen to Hume. But their contemporary realization injects a healthy dose of skepticism into the modern-day pursuit of fully individualized prediction and inference. Nevertheless, the availability of Big Data, aided by ever-growing computing power, is moving us ever closer to that ideal, albeit never attainable, goal (as Ma correctly emphasized).
The Holy Grail of this individualized learning is, of course, a balancing act: matching on more individualized attributes in constructing a proxy learning population for me increases relevance (lower bias) but decreases robustness (higher variance) because of the smaller matched sample, whereas matching on fewer attributes trades lower variance for higher bias. However, such dilemmas provide excellent foundational research opportunities, especially for young talents, as detailed in “A Trio of Inference Problems That Could Win You a Nobel Prize in Statistics (If You Help Fund It)” (Meng 2014, http://www.stat.harvard.edu/Faculty_Content/meng/COPSS_50.pdf) and “There is Individualized Treatment. Why Not Individualized Inference?” (Liu & Meng, 2015, http://arxiv.org/abs/1510.08539).
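The dilemma can be made concrete with a toy simulation. The sketch below is purely illustrative and is not drawn from the cited papers: the outcome model, the two binary attributes, and the sample sizes are all invented for the example. “My” outcome depends on two attributes, and we estimate it by averaging over a proxy population matched to me on zero, one, or both of them. Matching on more attributes shrinks the bias but, by shrinking the matched sample, inflates the variance.

```python
import random
import statistics

random.seed(0)

def simulate(n_matched_attrs, n_pop=400, n_reps=300):
    """Toy illustration: estimate 'my' outcome from a proxy population
    matched to me on the first `n_matched_attrs` of two binary
    attributes. Returns (bias, variance) of the matched-subgroup mean
    across `n_reps` simulated populations of size `n_pop`."""
    # Hypothetical outcome model: y = a1 + 2*a2 + noise, and "I" have
    # a1 = a2 = 1, so my true expected outcome is 3.
    true_me = 1 + 2
    estimates = []
    for _ in range(n_reps):
        pop = [(random.randint(0, 1), random.randint(0, 1))
               for _ in range(n_pop)]
        # Keep only the individuals who match me on the chosen attributes.
        matched = [a1 + 2 * a2 + random.gauss(0, 2)
                   for a1, a2 in pop
                   if (n_matched_attrs < 1 or a1 == 1)
                   and (n_matched_attrs < 2 or a2 == 1)]
        estimates.append(statistics.mean(matched))
    bias = statistics.mean(estimates) - true_me
    variance = statistics.variance(estimates)
    return bias, variance

for k in range(3):
    bias, var = simulate(k)
    print(f"matched on {k} attribute(s): "
          f"bias = {bias:+.3f}, variance = {var:.4f}")
```

Running it shows the trade-off in miniature: as the number of matched attributes goes from 0 to 2, the bias of the estimate shrinks toward zero while its variance grows, because each extra matching requirement halves (in expectation) the proxy population that remains.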
The self-reference might make you think that I take myself too seriously. So let me lighten the mood by describing an amazing coincidence. While working on this XL-File on a flight, I noticed that a couple of flight attendants were very excited at spotting a passenger. The photo below should help you to conduct an individualized inference about the coincidence, or rather to infer who the individual was…