Course Syllabus
Spring 2026 | Brandeis University
Course Description
This course covers the theory and practice of creating annotated datasets for natural language processing and machine learning. Students will learn annotation methodologies, quality measurement, and modern approaches including LLM-assisted annotation workflows.
Essential Logistics
Location & Time
Location: Volen National Center for Complex Systems, Room 106
Time: Mondays, Wednesdays, and Thursdays, 9:05 AM to 9:55 AM ET
Generally, sessions on Mondays & Wednesdays will be lectures led by Jin while Ricky will lead laboratory sessions on Thursdays.
Prerequisites
COSI 115b, or COSI 114a and COSI 115b (concurrent).
Communication
Course materials and announcements will be shared via MOODLE or email. Expect email responses within one business day.
Learning Objectives
By the end of this course, students will be able to:
- Design and implement annotation schemas for various NLP tasks
- Write clear annotation guidelines for both human annotators and LLM prompts
- Calculate and interpret inter-annotator agreement metrics
- Use modern annotation tools and platforms effectively
- Evaluate trade-offs between human annotation, LLM annotation, and hybrid approaches
- Create high-quality annotated datasets suitable for machine learning
- Understand preference data collection for RLHF and alignment
Grading
| Component | Weight |
|---|---|
| Participation | 10% |
| Assignments (5 homework assignments) | 40% |
| Semester Project | 50% |
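As a rough illustration of how the weights above combine, the sketch below computes a weighted course score. It assumes each component is scored on a 0-100 scale and that the five homework assignments are already averaged into a single assignments score; neither assumption is stated in this syllabus, and the names `WEIGHTS` and `final_score` are hypothetical.

```python
# Component weights in percent, taken from the grading table.
WEIGHTS = {"participation": 10, "assignments": 40, "project": 50}


def final_score(scores):
    """Combine component scores (assumed to be on a 0-100 scale) into one course score."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    # Weighted sum of the components, divided by 100 because weights are percentages.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


example = final_score({"participation": 100, "assignments": 85, "project": 90})
print(example)  # 89.0
```

Because the semester project carries half of the weight, a ten-point change in the project score moves the course score by five points under these assumptions.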
Course Policies
Attendance
Attending lectures and laboratory sessions is mandatory. Lack of attendance will impact participation grades. Reasonable accommodations can be made for excused absences.
Late Homework Policy
Each student is allotted three grace days for homework assignments. At most one grace day may be applied to any single assignment. Late days not covered by a grace day accrue a 20% penalty per day.
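One possible reading of this policy can be sketched in code. The sketch assumes that a grace day is applied automatically whenever one remains, that the 20% penalty is taken in percentage points of the assignment's score, and that the penalty is capped at 100%. None of these details are specified above, and the function name `late_penalty` is hypothetical; instructors' actual practice may differ.

```python
def late_penalty(days_late, grace_days_left):
    """Return (penalty_percent, grace_days_used) for one homework submission.

    Assumptions (not stated in the syllabus): at most one grace day is
    applied per assignment, it is applied automatically, and the penalty
    is capped at 100 percent.
    """
    if days_late < 0 or grace_days_left < 0:
        raise ValueError("days_late and grace_days_left must be non-negative")
    # Use one grace day only if the submission is late and a grace day remains.
    grace_used = 1 if days_late > 0 and grace_days_left > 0 else 0
    penalized_days = days_late - grace_used
    # 20 percentage points per uncovered late day, capped at the full score.
    penalty_percent = min(100, 20 * penalized_days)
    return penalty_percent, grace_used


print(late_penalty(1, 3))  # (0, 1)
print(late_penalty(3, 3))  # (40, 1)
print(late_penalty(2, 0))  # (40, 0)
```

Under this reading, a submission one day late costs nothing while grace days remain, and a submission three days late loses 40% after one grace day is applied.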
Laptop and Cell Phone Use
You are welcome to use portable computing devices for learning purposes (e.g., taking notes or running course-related software). Refrain from using devices for other purposes during class.
Generative Language Model Use
Permitted uses: Pilot annotation to stress-test guidelines, error analysis, exploratory analysis of ambiguity or disagreement.
Disallowed uses: Submitting model-generated annotations as human-produced, generating assignment write-ups, using models to replace required human annotation work.
Academic Honesty
You are expected to be familiar with, and to follow, the University's policies on academic integrity. Please consult Brandeis University Rights and Responsibilities for all policies and procedures related to academic integrity.
Accommodations
If you think you may require disability accommodations, contact Student Accessibility Support (SAS) at 781-736-3470 or access@brandeis.edu.
If you already have an accommodation letter, please provide a copy as soon as possible.
Student Support
Brandeis University is committed to supporting all students. Resources include: