I’m interested in the conceptual foundations of machine learning and the ethical, epistemic, and social implications of AI systems. I also use computational models to study opinion dynamics and collaboration. Below are some of my recent and ongoing projects:
What Do We Really Want from Interpretability?
Interpretability is one of the most discussed topics in machine learning today, but what additional information do we actually need from a model, beyond its predictions? The answer depends on our goals. My research explores the conceptual landscape of interpretability, with several ongoing projects:
SSHRC Insight Development Grant
Title: Beyond the Accuracy-Interpretability Tradeoff: Optimizing Human-AI Collaborations
(With Chris Smeenk)
This project investigates what kinds of information, beyond predictions, can enhance human-AI team performance. We also draw lessons from physics to explore how we might establish the reliability of models we don’t fully understand.
Caution Against Model Simplicity
I’m currently writing a paper arguing that simple, inherently interpretable models are not necessarily better at achieving our ethical goals, because functional form is not what is at issue.
Counterfactual Explanations and Contestability
(With Thomas Grote)
We examine whether, and if so how, counterfactual explanations genuinely empower individuals to contest automated decisions.
Can Social Media Unpolarize Us?
While it is often believed that polarization dominates public discourse, some evidence suggests that society is not as polarized as it appears. There’s also a puzzling correlation between opinions on unrelated topics (e.g., anti-climate-action and pro-gun stances), suggesting that our multi-dimensional opinion space has collapsed into just two perceived dimensions.
I study how misperceptions of opinion space contribute to polarization, and whether dynamically configuring social connections could help restore a multi-dimensional opinion space.
Other Works in Progress
- The illusion of ethical-epistemic tradeoff (draft available)
- A paper on gaslighting (draft available)
- A paper on rewards in science (draft available)
- A paper on hype (draft available)