Improbable Defence

Trusting a black box: explaining complex simulation outcomes using LIME

The field of Artificial Intelligence (AI) has recently been suffering from an “interpretability crisis”: black-box techniques like deep learning produce impressively accurate predictions, but fail to offer any human-intelligible explanation, making it hard to establish their safety and fitness for purpose in highly regulated or safety-critical domains.

In the adjacent field of modelling and simulation, this challenge is not new: complex simulations generate emergent outputs through the interaction of a vast number of dynamic components, which can make the system as a whole opaque to the user. Their complexity also typically renders these simulations sensitive to initial parameter settings and to details of their initial state, such as the exact placement of units on a map. This sensitivity can be broadly quantified with classical techniques such as sensitivity analysis, which, however, fails to provide a true explanation of simulation outputs in terms of understandable features of the input space.
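To make this concrete, here is a minimal, purely illustrative sketch of a one-at-a-time sensitivity sweep. The `run_simulation` function and its three scenario parameters are hypothetical placeholders, not an actual Improbable Defence model:

```python
import numpy as np

def run_simulation(params):
    """Stand-in for a complex simulation: returns a single scalar outcome.
    (Hypothetical toy model, used only to illustrate the analysis.)"""
    attack_strength, terrain_cover, reinforcement_delay = params
    return attack_strength * (1.0 - terrain_cover) - 0.5 * reinforcement_delay

baseline = np.array([1.0, 0.3, 2.0])
names = ["attack_strength", "terrain_cover", "reinforcement_delay"]

# One-at-a-time sensitivity: perturb each parameter by +/-10% around the
# baseline and record how much the output moves.
for i, name in enumerate(names):
    lo, hi = baseline.copy(), baseline.copy()
    lo[i] *= 0.9
    hi[i] *= 1.1
    delta = run_simulation(hi) - run_simulation(lo)
    print(f"{name:>20}: output change over +/-10% sweep = {delta:+.3f}")
```

A sweep like this tells us which parameters the output responds to, but not why a particular outcome emerged from a particular configuration.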

As a response to the interpretability crisis in AI, the new field of Explainable AI (XAI) has emerged in recent years. Techniques like Local Interpretable Model-agnostic Explanations (LIME) enable powerful post-hoc analysis of predictions that provides an intuition for the model’s logic. This is achieved by considering how small, local changes to the input configuration affect the response, capturing the resulting dependencies with interpretable statistical techniques, and reporting them in an intuitive graphical user interface. This closes the loop between the model and the user, allowing the user not only to build trust in the model, but also to actively improve it by identifying blind spots and misconceptions that are evident to a human expert.
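The core mechanism can be sketched in a few lines. The snippet below is not the LIME library itself, but a simplified illustration of its local-surrogate idea under stated assumptions: a made-up `black_box` model, Gaussian perturbations around the instance of interest, and a weighted linear model from scikit-learn standing in for the interpretable surrogate:

```python
import numpy as np
from sklearn.linear_model import Ridge

def black_box(x):
    """Stand-in black-box model (hypothetical): a non-linear response."""
    return np.sin(3 * x[:, 0]) + x[:, 1] ** 2 - 0.5 * x[:, 0] * x[:, 1]

rng = np.random.default_rng(0)
x0 = np.array([0.4, -0.2])          # instance we want to explain

# 1. Sample small, local perturbations around the instance of interest.
perturbations = x0 + rng.normal(scale=0.1, size=(500, 2))
responses = black_box(perturbations)

# 2. Weight each perturbed sample by its proximity to the original instance.
distances = np.linalg.norm(perturbations - x0, axis=1)
weights = np.exp(-(distances ** 2) / (2 * 0.1 ** 2))

# 3. Fit an interpretable (linear) surrogate on the weighted local samples.
surrogate = Ridge(alpha=1.0)
surrogate.fit(perturbations - x0, responses, sample_weight=weights)

# The surrogate's coefficients approximate each feature's local influence.
for name, coef in zip(["feature_0", "feature_1"], surrogate.coef_):
    print(f"{name}: local effect = {coef:+.3f}")
```

The proximity kernel concentrates the fit on samples closest to the instance being explained, which is what makes the resulting coefficients a local explanation rather than a global summary of the model.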

In this work, we investigate the application of XAI techniques to complex simulations. In a series of examples from the social sciences and operational research, we use LIME to coherently trace emergent patterns back to the input space.
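As a flavour of what this looks like in practice, the sketch below applies the open-source lime package to a stand-in simulation wrapper. The `simulate_batch` function and the feature names are hypothetical placeholders, not the models studied in this work:

```python
import numpy as np
from lime.lime_tabular import LimeTabularExplainer

def simulate_batch(X):
    """Hypothetical vectorised wrapper: maps rows of input parameters to a
    scalar simulation outcome (e.g. fraction of objectives held)."""
    return np.tanh(X[:, 0]) * (1 - X[:, 1]) + 0.2 * X[:, 2]

feature_names = ["force_ratio", "terrain_cover", "sensor_range"]

# Reference runs spanning the plausible input space; LIME uses these to
# learn feature statistics for its perturbations.
rng = np.random.default_rng(42)
background = rng.uniform(0, 1, size=(1000, 3))

explainer = LimeTabularExplainer(
    background, feature_names=feature_names, mode="regression"
)

# Explain the outcome of one specific scenario configuration.
scenario = np.array([0.7, 0.4, 0.5])
explanation = explainer.explain_instance(scenario, simulate_batch, num_features=3)
print(explanation.as_list())   # per-feature contributions to this outcome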