Reports  |  November 15, 2019

Explainable AI: the Basics – Policy Briefing

Report produced by The Royal Society. 32 pages.


Recent years have seen significant advances in the capabilities of Artificial Intelligence (AI) technologies. Many people now interact with AI-enabled systems on a daily basis: in image recognition systems, such as those used to tag photos on social media; in voice recognition systems, such as those used by virtual personal assistants; and in recommender systems, such as those used by online retailers.

As AI technologies become embedded in decision-making processes, there has been discussion in research and policy communities about the extent to which individuals developing AI, or subject to an AI-enabled decision, are able to understand how the resulting decision-making system works.

Some of today’s AI tools are able to produce highly accurate results, but are also highly complex. These so-called ‘black box’ models can be too complicated for even expert users to fully understand. As these systems are deployed at scale, researchers and policymakers are questioning whether accuracy at a specific task outweighs other criteria that are important in decision-making systems. Policy debates across the world increasingly see calls for some form of AI explainability, as part of efforts to embed ethical principles into the design and deployment of AI-enabled systems. This briefing therefore sets out to summarise some of the issues and considerations when developing explainable AI methods.

There are many reasons why some form of interpretability in AI systems might be desirable or necessary. These include: giving users confidence that an AI system works well; safeguarding against bias; adhering to regulatory standards or policy requirements; helping developers understand why a system works a certain way, assess its vulnerabilities, or verify its outputs; or meeting society’s expectations about how individuals are afforded agency in a decision-making process.

Different AI methods are affected by concerns about explainability in different ways. Just as a range of AI methods exists, so too does a range of approaches to explainability. These approaches serve different functions, which may be more or less helpful, depending on the application at hand. For some applications, it may be possible to use a system which is interpretable by design, without sacrificing other qualities, such as accuracy.

There are also pitfalls associated with these different methods, and those using AI systems need to consider whether the explanations they provide are reliable, whether there is a risk that explanations might deceive their users, or whether they might contribute to gaming of the system or opportunities to exploit its vulnerabilities.

Different contexts give rise to different explainability needs, and system design often needs to balance competing demands – to optimise the accuracy of a system or ensure user privacy, for example. There are examples of AI systems that can be deployed without giving rise to concerns about explainability, generally in areas where there are no significant consequences from unacceptable results or the system is well-validated. In other cases, an explanation about how an AI system works is necessary but may not be sufficient to give users confidence or support effective mechanisms for accountability.

In many human decision-making systems, complex processes have developed over time to provide safeguards, audit functions, or other forms of accountability. Transparency and explainability of AI methods may therefore be only the first step in creating trustworthy systems and, in some circumstances, creating explainable systems may require both these technical approaches and other measures, such as assurance of certain properties. Those designing and implementing AI therefore need to consider how its use fits in the wider sociotechnical context of its deployment.

Table of Contents

  • Summary
  • AI and the black box
    • AI’s explainability issue
    • The black box in policy and research debates
    • Terminology
    • The case for explainable AI
  • Explainable AI: the current state of play
  • Challenges and considerations when implementing explainable AI
    • Different users require different forms of explanation in different contexts
    • System design often needs to balance competing demands
    • Data quality and provenance is part of the explainability pipeline
    • Explainability can have downsides
    • Explainability alone cannot answer questions about accountability
  • Explaining AI: where next?
    • Stakeholder engagement is important
    • Explainability might not always be the priority
    • Complex processes often surround human decision-making
  • Annex 1: A sketch of the policy environment