Escaping the “Interpreter’s Trap”: When Explainable AI Fails to Protect Justice

My debut paper challenges the growing belief that "Explainable AI" (XAI) equates to safety. Driven by concerns over black-box opacity, I investigate whether this technical optimism causes us to overlook fundamental aspects of human responsibility and accountability.

Explore the Research

SpringerLink

The interpreter’s trap: how explainable AI launders uncertainty into justification—a socio-legal case study of COMPAS risk assessment - AI & SOCIETY

This paper introduces the “Interpreter’s Trap,” a socio-legal and STS-informed framework developed through a concept-generative case study of the COMPAS recidivism risk assessment tool. (Here, “STS-informed” refers to a Science and Technology Studies orientation that treats algorithmic categories such as “risk,” and their evidentiary status, as sociotechnical accomplishments shaped by institutional practices rather than as neutral representations of a stable ground truth.) Moving beyond individual-level cognitive-deficit accounts of reliance, I argue that the trap is best understood as an epistemic double bind that reallocates justificatory burdens under constrained contestability and asymmetric liability, even when decision-makers retain nominal discretion and may deviate from the algorithm’s outputs. Interpreters are positioned between a “contaminated objectivity” rooted in proxy-laden predictions and disputed ground truth, and an “eroded subjectivity,” where exercising discretion becomes institutionally costlier by concentrating the de facto institutional burden of justification and potential blame on the individual judge. The analysis shows that explainable AI (XAI) often provides technical rationales without supplying normative reasons that are legally reviewable, instead functioning as a catalyst for “accountability washing” by offering convenient narratives that mask the system’s “complexity illusion.” In the age of Generative AI, language-model explanation layers may intensify the trap by producing persuasive yet unfaithful rationales. I therefore advocate a paradigm shift from explaining black boxes to designing inherently interpretable “glass-box” models, particularly for structured-data, high-stakes normative decisions, as a necessary (though not sufficient) precondition to realize the legal “Right to Contest” and enable meaningful Human Oversight, paired with contestation infrastructures.
This paper contributes (1) a meso-level mechanism of justificatory risk allocation—under constrained contestability and asymmetric liability—showing how post hoc XAI supplies defensibility more than understanding; and (2) governance implications: why meaningful Human Oversight in high-stakes normative decisions requires design-based interpretability and contestation infrastructure.

My investigation led me to the concept of "The Interpreter’s Trap." In studying risk assessment tools like COMPAS, I realized that the problem wasn't merely technical opacity, but a structural "institutional double bind." Decision-makers are often caught between "contaminated objectivity"—data carrying hidden historical biases and proxy variables—and "eroded subjectivity," where their own professional discretion is devalued against the machine’s "evidence-based" output.

Crucially, my research suggests that this trap is sustained by a mechanism of "liability shielding." I found that decision-makers face an asymmetry of risk: aligning with a high-risk score creates a safe harbor of "scientific" justification, whereas deviating from the algorithm imposes a heavy personal burden of proof. In this high-stakes environment, post hoc XAI often fails to empower human oversight. Instead, it creates "convenient narratives"—simplified rationales like "prior arrests"—that anchor the judge’s decision to the algorithm. This effectively converts uncertainty into institutionally defensible justifications, facilitating what I term "accountability washing."

As this was my first peer-reviewed article, the research process itself was a steep learning curve. While the initial drafts were unpolished, rigorous feedback from reviewers helped me refine my critique from simple technological skepticism into a robust socio-legal framework. This process taught me that the "trap" is not just about the algorithm, but about the specific legal and organizational structures that incentivize deference to machines.

Ultimately, I argue that we should aspire to a higher standard of system design. We must pierce the "Complexity Illusion"—the false assumption that opaque, complex models are inherently more accurate. For high-stakes normative decisions, "explaining" a black box is often insufficient. I advocate for a paradigm shift toward inherently interpretable "glass-box" models—systems designed to be transparent from the ground up, where the logic is visible and contestable by default.

However, this is not just a technical challenge—it is a legal and normative one too. Transparency alone is meaningless if the human in the loop cannot act upon it. My paper argues that true "Human Oversight," as envisioned in emerging frameworks like the EU AI Act, requires more than just receiving an explanation; it requires the effective power to disagree. We must couple interpretable models with a robust "Right to Contest," creating a socio-legal infrastructure where algorithmic outputs are treated not as objective verdicts, but as contestable evidence. Without this normative shift, even the most transparent model risks becoming another tool for bureaucratic validation rather than justice.

As an early-career researcher, I hope this paper contributes to the ongoing conversation about how we can build AI systems that support, rather than supplant, human ethical judgment.

Read the full paper: https://rdcu.be/e2MXo


AI & SOCIETY

    This journal focuses on societal issues including the design, use, management, and policy of information, communications and new media technologies, with a particular emphasis on cultural, social, cognitive, economic, ethical, and philosophical implications.