COSI 230B | Spring 2026

Natural Language Annotation for Machine Learning

Course Syllabus

Spring 2026 | Brandeis University

Course Description

This course covers the theory and practice of creating annotated datasets for natural language processing and machine learning. Students will learn annotation methodologies, quality measurement, and modern approaches including LLM-assisted annotation workflows.

Essential Logistics

Location & Time

Location: Volen National Center for Complex Systems, Room 106

Time: Mondays, Wednesdays, & Thursdays from 9:05 AM - 9:55 AM ET

Generally, Jin will lead lectures on Mondays and Wednesdays, and Ricky will lead laboratory sessions on Thursdays.

Prerequisites

COSI 115b, or COSI 114a and COSI 115b (concurrent).

Communication

Course materials and announcements will be distributed via MOODLE or email. Expect email responses within one business day.

Learning Objectives

By the end of this course, students will be able to:

  1. Design and implement annotation schemas for various NLP tasks
  2. Write clear annotation guidelines for both human annotators and LLM prompts
  3. Calculate and interpret inter-annotator agreement metrics
  4. Use modern annotation tools and platforms effectively
  5. Evaluate trade-offs between human annotation, LLM annotation, and hybrid approaches
  6. Create high-quality annotated datasets suitable for machine learning
  7. Understand preference data collection for RLHF and alignment

Grading

Component                               Weight
Participation                           10%
Assignments (5 homework assignments)    40%
Semester Project                        50%

Each homework assignment carries equal weight (8% of the final grade). Success in this course requires staying on top of the semester project.

Course Policies

Attendance

Attending lectures and laboratory sessions is mandatory. Lack of attendance will impact participation grades. Reasonable accommodations can be made for excused absences.

Late Homework Policy

Students will be allotted three grace days that can be applied toward homework assignments. No more than one grace day may be used per homework assignment without penalty. Late submissions accrue a 20% penalty per day.
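
As a rough illustration of how this policy might play out, the sketch below computes the penalty for a single homework. It is one possible reading, not an official calculator: it assumes that at most one grace day is applied automatically per assignment, that each remaining late day costs 20 percentage points of that assignment's full credit, and that the penalty is capped at 100%. The function name `late_penalty` and its parameters are hypothetical. Check with the course staff if your situation is unclear.

```python
def late_penalty(days_late: int, grace_days_left: int) -> tuple[int, int]:
    """Return (penalty in percentage points, grace days remaining).

    Assumptions (not stated in the syllabus):
    - a grace day is applied automatically whenever one is available;
    - the 20% per day is measured against the assignment's full credit;
    - the total penalty is capped at 100%.
    """
    if days_late < 0 or grace_days_left < 0:
        raise ValueError("days_late and grace_days_left must be non-negative")

    # At most one grace day may be applied to a single homework.
    grace_used = 1 if days_late > 0 and grace_days_left > 0 else 0

    # Every late day not covered by the grace day is penalized.
    penalized_days = days_late - grace_used
    penalty_percent = min(100, 20 * penalized_days)

    return penalty_percent, grace_days_left - grace_used


if __name__ == "__main__":
    # Three days late with all three grace days still available:
    # one grace day is used, the other two days are penalized.
    print(late_penalty(3, 3))
```

Under these assumptions, a homework submitted one day late with a grace day available carries no penalty, while the same submission with no grace days left loses 20 percentage points.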

Laptop and Cell Phone Use

You are welcome to use portable computing devices for learning purposes (e.g., taking notes or running course-related software). Refrain from using devices for other purposes during class.

Generative Language Model Use

Permitted uses: Pilot annotation to stress-test guidelines, error analysis, exploratory analysis of ambiguity or disagreement.

Disallowed uses: Submitting model-generated annotations as human-produced, generating assignment write-ups, using models to replace required human annotation work.

Any use of generative models must be clearly disclosed in assignment submissions. Undisclosed or inappropriate use constitutes a violation of the Brandeis academic integrity policy.

Academic Honesty

You are expected to be familiar with, and to follow, the University's policies on academic integrity. Please consult Brandeis University Rights and Responsibilities for all policies and procedures related to academic integrity.

Accommodations

If you think you may require disability accommodations, contact Student Accessibility Support (SAS) at 781-736-3470 or access@brandeis.edu.

If you already have an accommodation letter, please provide a copy as soon as possible.

Student Support

Brandeis University is committed to supporting all students. Resources include: