Contributing Editor David J. Hand, Imperial College, London, continues his Hand Writing column: 

The phrase Artificial Intelligence, AI for short, is becoming ubiquitous. AI is said to hold the promise of revolutionising the human condition in innumerable ways, from taking over boring and dangerous tasks, through improving medical treatments and cures, to leading to scientific breakthroughs. Indeed, AI is expected to change the world for our benefit in ways we might not even be able to imagine—look at science fiction writing. While driverless cars, medical scanners, and speech recognition were predicted, the internet, world wide web, and social media were generally not. 

However, this potential will be realised only if people trust the AI systems to perform as intended. That raises the question of how we can be confident that a system will perform as desired—how can we validate and verify a system’s performance?

This high-level question has several aspects.

The starting point might be phrased as, “Do you know what you are doing?” More formally, have you formulated clearly enough what you are trying to find out or what you are trying to do? Ambiguous questions are asking for trouble. Unfortunately, adequate question formulation is not always straightforward. A mapping from the booming, buzzing confusion of the real world to a nicely defined description in terms of mathematics or a programming language can lead to simplifications which misrepresent the real objective. Recall the game-playing system which, given the objective of avoiding losing, killed itself at level 1, having learned that this way it could never lose in level 2.
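To make the point concrete, here is a minimal sketch in Python, with invented policies and statistics purely for illustration, of how an objective that only penalises losing can make a degenerate strategy look optimal:

```python
# Toy illustration (invented example): a mis-specified objective rewards a
# degenerate policy. The agent is told to "avoid losing", not to "win".
policies = {
    "quit_at_level_1": {"games_won": 0, "games_lost": 0},   # never reaches level 2
    "play_cautiously": {"games_won": 3, "games_lost": 2},
    "play_to_win":     {"games_won": 7, "games_lost": 5},
}

def stated_objective(stats):
    # What we asked for: as few losses as possible.
    return -stats["games_lost"]

def intended_objective(stats):
    # What we actually wanted: win as often as possible, with losses as a penalty.
    return stats["games_won"] - stats["games_lost"]

best_stated = max(policies, key=lambda p: stated_objective(policies[p]))
best_intended = max(policies, key=lambda p: intended_objective(policies[p]))

print("optimal under the stated objective:  ", best_stated)     # quit_at_level_1
print("optimal under the intended objective:", best_intended)   # play_to_win
```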

This need to specify the objective function so that it adequately matches what you really want can be non-trivial. It is something I have explored in depth in the context of supervised classification systems, where one might try to optimise error rate, area under the ROC curve, the F-measure, the KS statistic, the H-measure, or a host of other criteria—all of which may have different optimal solutions. Indeed, I have examples where different choices can lead to orthogonal decision surfaces. Choose the wrong criterion and you will be about as wrong as it’s possible to be.
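As a toy illustration of criteria disagreeing, the following sketch (synthetic data and hypothetical classifier scores, not from any of the examples mentioned above) sweeps a decision threshold and reports the optimum under error rate and under the F-measure; on imbalanced data like this the two optima typically differ:

```python
# Sketch (synthetic data): the threshold that minimises error rate need not
# be the threshold that maximises F1 -- the choice of criterion matters.
import numpy as np

rng = np.random.default_rng(0)

# Imbalanced problem: roughly 5% positives, with weak, noisy scores.
n = 10_000
y = (rng.random(n) < 0.05).astype(int)
scores = 0.3 * y + rng.normal(0, 0.25, n)

def error_rate(y_true, y_pred):
    return np.mean(y_true != y_pred)

def f1(y_true, y_pred):
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    return 0.0 if tp == 0 else 2 * tp / (2 * tp + fp + fn)

thresholds = np.linspace(scores.min(), scores.max(), 200)
err = [error_rate(y, scores >= t) for t in thresholds]
f1s = [f1(y, scores >= t) for t in thresholds]

print("threshold minimising error rate:", thresholds[int(np.argmin(err))])
print("threshold maximising F1:        ", thresholds[int(np.argmax(f1s))])
# With heavy class imbalance, minimising error rate pushes the threshold up,
# towards predicting almost everything negative; F1 pulls it back down.
```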

Once you are confident that you know what question you really want to answer, or what operation you want the system to carry out, you then need to ask what information or data the system will have in order to answer the question or carry out the operation. This raises all sorts of questions about data adequacy and data quality—questions about timeliness, accuracy, relevance, coherence, completeness, and so on.
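As an indication of how some of these checks might be automated, here is a minimal sketch; the dataframe and its column names are hypothetical, and the checks cover only completeness, timeliness, coherence, and a crude notion of validity:

```python
# A minimal sketch of automated data-adequacy checks. The dataframe `records`
# and its columns ("patient_id", "event_time", "value") are hypothetical.
import pandas as pd

def data_quality_report(records: pd.DataFrame, max_age_days: int = 30) -> dict:
    now = pd.Timestamp.now()  # assumes event_time is a timezone-naive datetime column
    return {
        # Completeness: what fraction of each field is missing?
        "missing_fraction": records.isna().mean().to_dict(),
        # Timeliness: is the newest record recent enough to be usable?
        "stale": (now - records["event_time"].max()).days > max_age_days,
        # Coherence: duplicated identifiers often signal upstream problems.
        "duplicate_ids": int(records["patient_id"].duplicated().sum()),
        # Validity: values outside a plausible range (range chosen arbitrarily here).
        "out_of_range": int(((records["value"] < 0) | (records["value"] > 1e6)).sum()),
    }
```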

For dynamic real-time systems one has to ask how they will respond to changing and unexpected circumstances. The Y2K concern was an illustration of this, but at least we knew Y2K was coming. People did not foresee the 2008 financial crisis or the 2020 pandemic. (Okay, some people did foresee these and tried to warn governments, but you know how much notice was taken of those people.) So, what is going to be the next big crisis? How well will a system—perhaps constructed in relatively benign economic conditions, or using relatively homogeneous data—behave when the world changes? Obviously, flexibility needs to be built in—but not too much flexibility or the system will flip-flop all over the place.
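That trade-off can be illustrated with a simple exponentially weighted estimate on a synthetic data stream (all the numbers here are invented): a large learning rate adapts quickly when the regime changes but jitters on every noisy point, while a small one is stable but slow to catch up.

```python
# Sketch of the flexibility trade-off on a synthetic stream with one regime change.
import numpy as np

rng = np.random.default_rng(1)
stream = np.concatenate([rng.normal(0.0, 1.0, 500),    # benign regime
                         rng.normal(3.0, 1.0, 500)])   # the world changes

def ewma(xs, rate):
    est, out = 0.0, []
    for x in xs:
        est = (1 - rate) * est + rate * x
        out.append(est)
    return np.array(out)

slow = ewma(stream, rate=0.01)   # stable, but lags the regime change
fast = ewma(stream, rate=0.5)    # adapts quickly, but flip-flops on noise
print("estimate shortly after the shift (slow, fast):", slow[520], fast[520])
print("variance within the stable regime (slow, fast):",
      slow[100:500].var(), fast[100:500].var())
```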

Changing circumstances mean that systems will need to be updated. In the past that has meant humans rewriting code. But it’s almost part of the definition of an AI system that it is adaptable: it can change the way it behaves in response to a changing environment and new challenges. Another way of putting that, though, is that it departs from its initial specification. That can be good, but it carries obvious risks.

AI systems will work in the context of a human environment. Indeed, in large part that might be taken as a basic property of such systems—I started by noting the promise for improving the human condition. So one has to ask how well the systems and humans will work together. We have plenty of examples of cases where the collaboration has not been smooth—think of the automated flight-control system and the human pilots wrestling for control of the Boeing 737 Max, and the two crashes that resulted. Human ignorance of the limitations of AI systems plays a big role here: the readers of this article may have the technical expertise to appreciate these limitations, but most people will not.

The human context is one thing, but increasingly AI systems will also work in an AI environment. Think of the imminent Internet of Things, where AI communicates with AI—looking at the food in your fridge, ordering replacements from a supplier, paying from your bank account, with delivery by autonomous vehicle. Correlated financial trading systems, all working within an environment of other systems like themselves, have already prompted worries about a runaway financial crash.

We also need to exercise caution about using off-the-shelf algorithms. The point is that you cannot see inside them, so you have to ask whether they have mistakes, or simply choices you would not have made, built into them. Are their default choices what you really want? Think of the supervised classification performance criteria, for example, or the recent failure to transfer records of 16,000 confirmed cases of COVID-19 because of a lack of awareness of the row limit in an older Excel file format.
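As a small illustration of not trusting defaults, a guard along the following lines checks a dataset against a known format limit before export instead of assuming the tool will warn you; the case count is hypothetical, while the row limits are those of the .xls and .xlsx formats.

```python
# Guard against silent truncation when exporting to a row-limited format.
XLS_MAX_ROWS = 65_536        # legacy .xls worksheet limit
XLSX_MAX_ROWS = 1_048_576    # modern .xlsx worksheet limit

def safe_to_export(n_rows: int, fmt: str = "xls") -> bool:
    """Return True only if every record will survive the export."""
    limit = XLS_MAX_ROWS if fmt == "xls" else XLSX_MAX_ROWS
    return n_rows <= limit

n_cases = 80_000   # hypothetical case count
if not safe_to_export(n_cases, fmt="xls"):
    raise RuntimeError(
        f"{n_cases} rows exceed the {XLS_MAX_ROWS}-row .xls limit; "
        "records would be dropped silently."
    )
```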

And, after all those higher-level questions, to have confidence in our AI systems we have to ask: is the code correct? Formal methods of demonstrating correctness may be applied in some areas, but whether they can be applied in all areas, especially where interactions with humans are concerned, is another matter.

I’ve described some of the challenges of validating AI systems, some of the issues that need to be considered when building them and using them in real applications. My description might have reminded you of something similar, from long before the computer came on the scene. Pre-computer, indeed pre-science, the way to do amazing things was by magic (recall Arthur C. Clarke: “Any sufficiently advanced technology is indistinguishable from magic”). Clearly, the issues I have described resemble the risks of the Sorcerer’s Apprentice. But I prefer to think of it another way. Economists have a concept called the Principal–Agent Problem. This is where you (the principal) employ someone (the agent) to perform certain tasks for you, but where their aims might not be precisely aligned with yours. In that case there is a risk that they might optimise for their objectives rather than yours, and that could damage you. I think, in AI systems, we are facing a new kind of principal–agent problem.