COSI 230B | Spring 2026

Natural Language Annotation for Machine Learning

Learn the art and science of creating high-quality training data for NLP systems

Time: Mon/Wed/Thu 9:05-9:55 AM

Location: Volen 106

Announcements

January 26, 2026

Welcome to COSI 230B!

Welcome to Natural Language Annotation for Machine Learning! This course covers the theory and practice of creating annotated datasets for NLP and machine learning, including modern approaches with LLM-assisted workflows.

Please review the syllabus and come prepared for our first class on January 14th.

January 14, 2026

HW 0 Assigned

Homework 0 (Dataset Exploration) has been assigned. Due: January 21st.

Please check Moodle for the assignment details and submission link.

About This Course

This course covers the theory and practice of creating annotated datasets for natural language processing and machine learning. Students will learn annotation methodologies, quality measurement, and modern approaches including LLM-assisted annotation workflows.

Prerequisites

COSI 115b, or COSI 114a and COSI 115b (concurrent).

Learning Objectives

  • Design and implement annotation schemas for various NLP tasks
  • Write clear annotation guidelines for human annotators and clear prompts for LLMs
  • Calculate and interpret inter-annotator agreement metrics
  • Use modern annotation tools and platforms effectively
  • Evaluate trade-offs between human and LLM annotation
  • Understand preference data collection for RLHF

Instructors

Jin Zhao (Lecturer)

Pronouns: She/her/hers

Email: jinzhao@brandeis.edu

Office: Volen 109

Office Hours: Wednesdays 1-3pm ET

Richard Brutti (TA)

Pronouns: He/him

Email: brutti@brandeis.edu

Office: Abelson Lower Level

Office Hours: Thursdays after lab or by appointment

Important Dates

Upcoming: HW 0 due January 21 | Form project groups by January 28

Date             Event
Jan 19           MLK Jr. Day - No Class
Jan 21           HW 0 Due, HW 1 Assigned
Jan 28           HW 1 Due, Form Project Groups
Feb 16-18        February Break - No Classes
Apr 6-8          Passover Break - No Classes
Apr 30 - May 12  Final Presentations