COSI 230B | Spring 2026

Natural Language Annotation for Machine Learning

Lecture Materials

Slides and resources for each lecture

Note: Lecture slides are released after each class. Check Moodle for any supplementary materials.

Week 1: Introduction (Jan 12, 14)

Week 2: When to Annotate (Jan 21)

Lecture 3: When to Annotate | Tools & Formats

Rule-based vs. ML approaches, decision framework, annotation tool landscape, data formats

PDF

Week 3: Corpus & Data (Jan 28)

Lecture 4: Corpus Selection & Data Sourcing

MAMA criteria, sampling strategies, licensing, synthetic data generation

PDF

Week 4: What Models Learn (Feb 2, 4)

Lectures 5 & 6: What Models Learn from Annotation

How annotations shape model behavior, data-centric AI, annotation artifacts

PDF

Week 5: Design Pipeline & IAA I (Feb 9, 11)

Week 7: IAA II & IAA III (Feb 23, 25)

Week 8: IAA in the LLM Era & Annotator Reliability (Mar 2, 4)

Week 10: Annotator Reliability II & Annotation Projects (Mar 16, 18)

Week 11: Annotation Projects & Supervision Engineering (Mar 23, 25)

Week 12: Instruction Annotation (Mar 30)

Lecture 17: Instruction Annotation

Instruction-tuning datasets, task contract specification, template leakage, mixture engineering

PDF | LLM

Week 13: Preference Annotation & Reasoning (Apr 13, 15)

Week 14: Synthetic Annotation & Active Learning (Apr 20, 22)

Week 15: Evaluation & Multilingual Annotation (Apr 27, 29)