Semester Project
50% of your final grade
Overview
Groups of 3-4 students will complete a semester-long annotation project following the MATTER/MAMA cycle methodology. This is an essential component of the course where you will:
- Design an annotation specification and guidelines for an NLP task
- Annotate a dataset with your team
- Evaluate inter-annotator agreement
- Refine task guidelines based on disagreement analysis
- Train and evaluate baseline NLP models
- Report your findings
Project Documents
Getting to Know Your Group
Before diving into the project, take time to establish your team:
- Discuss shared interests in annotation tasks
- Find regular times to meet in person outside of class; this is critical for project quality
- Identify the skills and resources each person brings:
  - Knowledge of potential datasets
  - Ideas for an initial schema, workflow, etc.
  - Specific skills (programming, data processing, writing, project management, etc.)
Project Timeline
Group Contract
A document assigning responsibility for specific tasks within the project. Divide work by skill, but every member must participate in annotation. Include group members, their skills, task assignments, a communication plan, a meeting schedule, and a data-sharing plan. Submit the signed document on LATTE.
Draft Annotation Schema
In-class presentation and write-up on your topic, dataset, and goals. Include an intuitive design for tagset and attributes, a small pilot annotation, and a brief literature review. Schema does not need to be fully formalized yet.
Annotation Goal Presentations
10-minute group presentation covering: planned dataset and sampling technique, document balance (genre, time period, authorship), data source and size, copyright/licensing, existing annotations, planned task and goals, and discussion of 2-3 relevant papers.
MAMA/MATTER Cycle
Perform parallel, independent pilot annotation on a subset of your corpus. Measure inter-annotator agreement (IAA), review disagreements, consider task complexity. Revise guidelines and repeat until you achieve satisfactory IAA.
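For two annotators labeling the same items, Cohen's kappa is a standard IAA metric: it corrects observed agreement for the agreement expected by chance. Below is a minimal self-contained sketch (the labels are illustrative); for more than two annotators, or items with partial overlap, measures such as Fleiss's kappa or Krippendorff's alpha are common choices.

```python
from collections import Counter

def cohen_kappa(ann_a, ann_b):
    """Cohen's kappa for two annotators labeling the same items."""
    assert len(ann_a) == len(ann_b) and ann_a
    n = len(ann_a)
    # Observed agreement: fraction of items labeled identically.
    p_o = sum(a == b for a, b in zip(ann_a, ann_b)) / n
    # Chance agreement: from each annotator's marginal label distribution.
    freq_a, freq_b = Counter(ann_a), Counter(ann_b)
    p_e = sum(freq_a[lab] * freq_b.get(lab, 0) for lab in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Two annotators' labels on six items (illustrative data).
a = ["POS", "POS", "NEG", "NEU", "POS", "NEG"]
b = ["POS", "NEG", "NEG", "NEU", "POS", "NEG"]
print(round(cohen_kappa(a, b), 3))  # → 0.739
```

Values above roughly 0.8 are often read as strong agreement, but interpret the number alongside a qualitative review of the disagreements rather than as a pass/fail threshold.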
Full Annotation Specification
Formal annotation schema, v1.0 of guidelines, and presentation. The specification must be clear enough for non-group annotators, with relevant examples (positive and negative) and expected edge cases. Schema should be operationalized within an annotation tool.
Full Annotation Task
Your group will not annotate its own task — each group will be given the schema and specification from another group. Grading is based on meeting deadlines and following the other group's instructions.
Annotation Report & Final Presentation
In-class presentation, ACL-style paper (4 pages), and peer evaluation.
Deliverables
Final Paper (ACL-style, 4 pages)
- Overview of task goals and annotation specification
- Characterization of your dataset (data distribution, annotation distribution, etc.)
- Difficulties during data collection (solved and unsolved), with possible improvements for future iterations
- Annotation quality:
- Quantitative analysis of annotation reliability and interpretation
- Qualitative analysis of annotator disagreements
- Machine learning experiment:
- Experimental design: baseline system, baseline features, and features engineered from your annotations
- Experimental results
Final Presentation
- 15-minute in-class presentation
- Present annotation task, methodology, and results
- Q&A with class
Dataset Submission
- Annotated corpus with documentation
- Annotation guidelines document
- Data in a standard format (JSON, XML, or another specified format)
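If you choose JSON, a standoff-style layout is a common convention: annotations store character offsets into the raw text, so the source document is never modified. The record below is a hypothetical sketch (field names like `doc_id` and `annotator` are illustrative, not a required schema):

```python
import json

# Hypothetical standoff-style record: spans index into the raw text.
record = {
    "doc_id": "doc-0001",
    "text": "Brandeis University is in Waltham.",
    "annotations": [
        {"start": 0, "end": 19, "label": "ORG", "annotator": "A1"},
        {"start": 26, "end": 33, "label": "LOC", "annotator": "A1"},
    ],
}

# Round-trip through JSON and verify a span against the text.
serialized = json.dumps(record, indent=2)
parsed = json.loads(serialized)
span = parsed["annotations"][1]
print(parsed["text"][span["start"]:span["end"]])  # → Waltham
```

Whatever format you pick, document it in the corpus README so the group annotating your task (and future users) can parse it without guesswork.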
Peer Evaluation
- Evaluate group members' contributions
- Individual accountability component
- Submitted confidentially
Grading Rubric
| Component | Weight | Description |
|---|---|---|
| Annotation Schema Design | 20% | Clarity, completeness, and appropriateness of the schema |
| Guidelines Quality | 20% | Clear, comprehensive, with good examples |
| Dataset Quality | 20% | Consistency, coverage, proper adjudication |
| IAA Analysis | 15% | Appropriate metrics, thoughtful analysis of disagreements |
| Final Paper | 15% | Writing quality, organization, academic rigor |
| Presentation | 10% | Clarity, engagement, Q&A handling |
Project Ideas
Here are some potential annotation tasks to consider:
Traditional NLP Tasks
- Named entity recognition for a specific domain
- Sentiment analysis with aspect-level annotations
- Relation extraction for a knowledge domain
- Event detection and argument extraction
- Coreference resolution for a specific genre
LLM-Related Tasks
- Preference annotation for chatbot responses
- Hallucination detection in LLM outputs
- Safety/toxicity annotation
- Instruction-following evaluation
- Code generation quality assessment
Resources
- ACL Paper Format Templates
- Annotation Tools
- Dataset sources: Hugging Face, Kaggle, LDC, academic papers