Courses - Fall 2025
CMSC
Open Seats as of
10/15/2025 at 05:30 PM
CMSC848R
Selected Topics in Information Processing; Language Model Interpretability
Credits: 3
Grad Meth: Reg
Must be in the Graduate Program in Computer Science. All other graduate students must request permission.

The course focuses on state-of-the-art methods for interpreting language models and understanding their learned behaviors. We will discuss approaches centered on both understanding models' internal mechanisms/representations and attributing behaviors back to the training data. We will focus on model tendencies including hallucination, factuality, memorization, and explanation/reasoning elicitation. If time allows, we will discuss recent developments in ameliorating learned behaviors, such as model editing, unlearning, and steering.