To enable trustworthy AI deployment, we must first understand the agents we build. Our research advances on two parallel fronts: AI safety and personalization.
AI agents are increasingly capable — but capability without understanding is risk. Our first research thread investigates the internal workings of AI systems: how representations form, how features can be decoded and controlled, how failure modes can be identified automatically, and how human concepts map onto machine cognition. This line of work underpins safe and trustworthy AI deployment.
Building agents that truly know the person they serve.
Large language models are general. People are not. Our second thread develops the theory and engineering foundations for making LLM-based agents genuinely personal — building persistent user models, context-aware adaptation mechanisms, and trust-calibrated behavior that respects individual differences across a lifetime of use.
These problems demand expertise from cognitive science, machine learning, HCI, neuroscience, and beyond. We are actively seeking research collaborators, visiting scholars, and industry partners who want to work on the hard questions of human-AI symbiosis and AI safety.