Welcome!

Hi everyone! I'm an incoming PhD student in the Department of Computer Science at the University of Toronto, where I'll be working with Professor Ashton Anderson. My research focuses on generative models, human-AI alignment, and mechanistic interpretability. I am interested in developing AI systems that are both effective and interpretable to humans. Outside of research, I am an enthusiastic player of Go, basketball, and tennis. Looking forward to connecting with you!

Publications

[ACL 2024] SPIN: Sparsifying and Integrating Internal Neurons in Large Language Models for Text Classification

Difan Jiao, Yilun Liu, Zhenwei Tang, Daniel Matter, Jürgen Pfeffer, Ashton Anderson

Among the many tasks that Large Language Models (LLMs) have revolutionized is text classification. Current text classification paradigms, however, rely solely on the output of the final layer in the LLM...

[NeurIPS 2024] Maia-2: A Unified Model for Human-AI Alignment in Chess

Zhenwei Tang, Difan Jiao, Reid McIlroy-Young, Jon Kleinberg, Siddhartha Sen, Ashton Anderson

There are an increasing number of domains in which artificial intelligence (AI) systems both surpass human ability and accurately model human behavior...

[COLM 2025] SEAM: Semantically Equivalent Across Modalities Benchmark for Vision-Language Models

Zhenwei Tang, Difan Jiao, Blair Yang, Ashton Anderson

The rapid advancement of large vision-language models (VLMs) has introduced challenges in evaluating their reasoning across multiple modalities...

[Under Review] Understanding Mechanisms of Skill Adaptation in Generative Models: Chess as a Model System

Difan Jiao, George Eilender, Zhenwei Tang, Ashton Anderson

TMLR 2025 Submission

Generative models exhibit a remarkable ability to adapt their outputs to different skill levels, ranging from beginner to expert in various domains...

[Under Review] Learning to Imitate with Less: Efficient Individual Behavior Modeling in Chess

Zhenwei Tang, Difan Jiao, Eric Xue, Reid McIlroy-Young, Jon Kleinberg, Siddhartha Sen, Ashton Anderson

TMLR 2025 Submission

As humans seek to collaborate with, learn from, and better understand artificial intelligence systems, developing AIs that can accurately emulate individual decision-making becomes increasingly important...