
From “no clear winner” to an effective Explainable Artificial Intelligence process: An empirical journey
Author(s) - Dodge Jonathan, Anderson Andrew, Khanna Roli, Irvine Jed, Dikkala Rupika, Lam KinHo, Tabatabai Delyar, Ruangrotsakun Anita, Shureih Zeyad, Kahng Minsuk, Fern Alan, Burnett Margaret
Publication year - 2021
Publication title - Applied AI Letters
Language(s) - English
Resource type - Journals
ISSN - 2689-5595
DOI - 10.1002/ail2.36
Subject(s) - artificial intelligence, computer science, process (computing), recall, action (physics), cognition, empirical research, realization (probability), psychology, machine learning, cognitive science, cognitive psychology, epistemology, mathematics, statistics, physics, quantum mechanics, neuroscience, operating system, philosophy
“In what circumstances would you want this AI to make decisions on your behalf?” We have been investigating how to enable a user of an Artificial Intelligence‐powered system to answer questions like this through a series of empirical studies, a group of which we summarize here. We began the series by (a) comparing four configurations of saliency explanations and/or reward explanations. From this study we learned that, although some configurations had significant strengths, no one configuration was a clear “winner.” This result led us to hypothesize that one reason Explainable AI (XAI) research has had low success in enabling users to create a coherent mental model is that the AI itself does not have a coherent model. This hypothesis led us to (b) build a model‐based agent so that we could compare explaining it with explaining a model‐free agent. Our results were encouraging, but we then realized that participants' cognitive energy was being sapped by having to create not only a mental model, but also a process by which to create that mental model. This realization led us to (c) create such a process for them (which we term After‐Action Review for AI, or “AAR/AI”), integrate it into the explanation environment, and compare participants' success with AAR/AI scaffolding versus without it. Our AAR/AI studies showed that AAR/AI participants were more effective at assessing the AI than non‐AAR/AI participants, achieving significantly better precision and significantly better recall at finding the AI's reasoning flaws.
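As a gloss on the closing result (assuming the standard definitions of precision and recall, which the abstract does not spell out), the two metrics applied to flaw-finding read roughly as:

\[
\text{precision} = \frac{\text{reasoning flaws correctly identified}}{\text{all items a participant flagged as flaws}},
\qquad
\text{recall} = \frac{\text{reasoning flaws correctly identified}}{\text{all actual reasoning flaws}}
\]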