Posted on 2024-05-01, 18:21, authored by Zachariah Carmichael
Canonical AI algorithms are black boxes. In high-stakes applications, this is highly undesirable: we need algorithms that we can understand and, in turn, trust. Doctors will not, and should not, trust a machine that cannot reason about its decisions. The push for this explainability has led to the emergence of the explainable AI (XAI) sub-field. While XAI has made progress over the years, a wealth of open issues remains. In this dissertation, we demonstrate a path toward the glass-box ideal: highly performant algorithms that are fully human-comprehensible. We first demonstrate that post hoc explainers, while incredibly popular, should not be trusted. For the cases in which they must be used, we provide a solution that helps make these explainers more reliable. We characterize the open problems of building an AI system that is feasible for high-stakes applications through the design of a real-world emergency drone response system. This motivates us to challenge the fundamentals of deep learning architectural design. We bridge XAI and automated machine learning (AutoML) to discover intrinsically debuggable neural networks. Next, we propose novel, intrinsically interpretable approaches to computer vision based on prototypical networks, a type of concept-based neural network. Critically, we address the human-machine semantic similarity gap associated with learned prototypes. First, we enable the automatic learning of prototypical parts with weak supervision using receptive-field-constrained networks. Second, we enrich the interpretability of learned concepts by learning prototypical distributions in the invertible latent space of a normalizing flow. We demonstrate a measurable improvement in the comprehensibility of decisions made in both predictive and generative computer vision tasks. The contributions of this dissertation open up the AI black box by replacing correlative explainers with faithful, actionable, intuitive, and debuggable algorithms.