CMSC848R: Selected Topics in Information Processing; Language Model Interpretability

Fall 2025
Tuesdays and Thursdays, 12:30pm to 1:45pm
AJC (Clark Hall) 2132



Sarah Wiegreffe

Instructor

lm-interp@umd.edu (to reach both of us)

Ming Li

Teaching Assistant


Office hours:

  • Instructor: Sarah Wiegreffe
    Pronouns: she/her
    Office Hours: 1:45-2:45pm Thurs (IRB 4210, right after class; starting 09/11)
  • Teaching Assistant: Ming Li
    Pronouns: he/his
    Office Hours: 4:00-5:00pm Tues (IRB 2108; starting 09/09)

Resources:

[Syllabus] [Piazza] [Course Prospective Form]
[Presentation Signup and Paper Reading List]
[Course feedback form]

Course description:

This course focuses on state-of-the-art methods for interpreting language models and understanding their learned behaviors. We will discuss approaches centered on both understanding models’ internal mechanisms/representations and attributing behaviors back to the training data. We will focus on model tendencies including hallucination, factuality, memorization, and explanation/reasoning elicitation. If time allows, we will discuss recent developments in ameliorating learned behaviors, such as model editing, unlearning, and steering.

We will examine the current state-of-the-art methods, their limitations, and the ongoing efforts to address those limitations. In this course, you will engage in paper discussions, gain a deeper understanding of the latest developments in the field, and contribute to ongoing discussions and research in this area.

Schedule

Date Notes & Deadlines
September 2 (Tues) Slides
September 4 (Thurs) Deadline to submit prospective (11:59pm); Slides
September 9 (Tues) Deadline to sign up for presentation slots (11:59pm); Slides
September 11 (Thurs)
September 16 (Tues) Logistics Slides; Project Group Size Request Form
September 18 (Thurs)
September 23 (Tues)
September 25 (Thurs) P0 Due
September 30 (Tues)
October 2 (Thurs)
October 7 (Tues)
October 9 (Thurs) No class or office hour (Sarah traveling to COLM)
October 14 (Tues) No class (Fall Break)
October 16 (Thurs)
October 21 (Tues)
October 23 (Thurs)
October 28 (Tues)
October 30 (Thurs)
November 4 (Tues)
November 6 (Thurs)
November 11 (Tues)
November 13 (Thurs)
November 18 (Tues)
November 20 (Thurs)
November 25 (Tues)
November 27 (Thurs) No class or office hour (Thanksgiving Break)
December 2 (Tues)
December 4 (Thurs) No class or office hour (Sarah traveling to NeurIPS)
December 9 (Tues) Retrospective due (11:59pm)
December 11 (Thurs) Last Day of Class
December 19 (Friday) Deadline (11:59pm) for final project reports (in lieu of final exam)
Note: This schedule is tentative and subject to change as necessary; monitor the course ELMS page for current deadlines. In the unlikely event of a prolonged university closing or an extended absence from the university, the course schedule, deadlines, and assignments will be adjusted based on the duration of the closing and the specific dates missed.