COSI 230B | Spring 2026

Natural Language Annotation for Machine Learning

Homework Assignments

5 assignments worth 40% of your final grade

Assignment Policy

Each homework assignment has equal weight (8% of your final grade each). Assignments are announced and submitted via MOODLE.

Late Policy: You have three grace days in total for the semester, and you may use no more than one grace day on any single assignment. Any lateness not covered by a grace day incurs a 20% penalty per day.
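As an illustration of how the late policy might work out in practice, here is a minimal Python sketch. It rests on assumptions the policy does not state: that a grace day is applied automatically when one is available, that the 20% is taken in percentage points of the assignment's full credit, and that the penalty is capped at 100%. The function name late_penalty and its parameters are hypothetical; check with the instructor for the exact rule.

```python
def late_penalty(days_late: int, grace_days_left: int) -> tuple:
    """Return (penalty in percentage points, grace days remaining).

    Assumptions (not stated in the policy): a grace day is used
    automatically if available, and the penalty is capped at 100.
    """
    if days_late < 0 or grace_days_left < 0:
        raise ValueError("days_late and grace_days_left must be non-negative")

    # At most one grace day may be spent on a single assignment.
    grace_used = 1 if (days_late > 0 and grace_days_left > 0) else 0

    # Every remaining late day costs 20 percentage points.
    penalized_days = days_late - grace_used
    penalty_percent = min(100, 20 * penalized_days)

    return penalty_percent, grace_days_left - grace_used


# Three days late with all three grace days still available:
# one day is covered by a grace day, the other two are penalized.
penalty, remaining = late_penalty(days_late=3, grace_days_left=3)
print(penalty, remaining)
```

Under these assumptions, a submission one day late costs a grace day but no credit, while each further day costs 20 points.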

Homework 0: Dataset Exploration

Assigned: Jan 14 (Week 1)

Due: Jan 21 (Week 2)

Description: Explore existing annotated NLP datasets using a provided worksheet. Analyze datasets to understand annotation schemas, data formats, and task characteristics.

Deliverables: Completed Excel worksheet with dataset analysis

Skills: Dataset analysis, understanding annotation schemas, data format familiarity

Homework 1: Annotation Tools Exploration

Assigned: Jan 21 (Week 2)

Due: Jan 28 (Week 3)

Description: Hands-on experience with annotation tools and data formats.

Tasks:

Deliverables:

Homework 2: Data Wrangling with Pandas

Assigned: Mar 9 (Week 9)

Due: Mar 18 (Week 10)

Description: Implement an NLPDataFrame class that extends pandas for NLP annotation tasks.
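For orientation only, the general pattern for extending pandas can be sketched as below. This is not the assignment's specification: the label_counts method and the "text" and "label" column names are hypothetical, chosen purely for illustration. The one pandas-specific detail is the _constructor property, which pandas consults so that operations such as filtering return the subclass instead of a plain DataFrame.

```python
import pandas as pd


class NLPDataFrame(pd.DataFrame):
    """Illustrative sketch of a DataFrame subclass (not the assignment spec)."""

    @property
    def _constructor(self):
        # Tells pandas to build NLPDataFrame objects when an operation
        # (slicing, filtering, copying) produces a new frame.
        return NLPDataFrame

    def label_counts(self, label_col: str = "label") -> pd.Series:
        # Hypothetical helper: number of rows per annotation label.
        return self[label_col].value_counts()


df = NLPDataFrame(
    {
        "text": ["great film", "dull plot", "loved it"],
        "label": ["pos", "neg", "pos"],
    }
)

# Filtering keeps the subclass, so custom methods remain available.
positives = df[df["label"] == "pos"]
print(type(positives).__name__)
print(df.label_counts())
```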

Tasks:

Deliverables:

Homework 3: Inter-Annotator Agreement

Assigned: Mar 23 (Week 11)

Due: Mar 30 (Week 12)

Description: Calculate agreement metrics for annotation data.

Tasks:

Deliverables: Written solutions showing all calculations and work
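The description above does not name specific agreement metrics. As one common example, Cohen's kappa for two annotators can be sketched in plain Python as follows; this is an illustration for checking hand calculations, under the assumption that kappa is among the metrics covered, and the toy labels are invented for the example. Kappa compares observed agreement with the agreement expected by chance from each annotator's label distribution.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("Both annotators must label the same non-empty item set")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement: chance that both pick the same label,
    # based on each annotator's own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label; kappa is undefined.
        raise ValueError("Kappa is undefined when expected agreement is 1")

    return (observed - expected) / (1 - expected)


# Toy data: the annotators agree on 4 of 6 items.
annotator_a = ["pos", "pos", "neg", "neg", "pos", "neg"]
annotator_b = ["pos", "neg", "neg", "neg", "pos", "pos"]
kappa = cohens_kappa(annotator_a, annotator_b)
print(round(kappa, 3))
```

In the toy data, observed agreement is 4/6 and expected agreement is 0.5, so kappa is (4/6 - 0.5) / (1 - 0.5) = 1/3.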

Homework 4: Sentiment Analysis Fine-tuning

Assigned: Apr 13 (Week 13)

Due: Apr 22 (Week 14)

Description: Fine-tune a sentiment classifier on annotated movie review data.

Tasks:

Deliverables: Completed Jupyter notebook with code, results, and analysis

Submission Guidelines