COSI 230B | Spring 2026

Natural Language Annotation for Machine Learning

In-Class Activities

Annotation Worksheets

Interactive exercises to understand annotation challenges

About These Worksheets

These interactive worksheets help you understand the challenges of annotation for various NLP tasks. Each worksheet contains hands-on exercises where you'll annotate real examples, compare your decisions with classmates, and reflect on sources of disagreement.

How to use: Complete the worksheet during class, enter the session code provided by your instructor, and submit your answers to see how the class responded.
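One common way to quantify how much a class agrees on a labeling exercise is Fleiss' kappa, which compares observed agreement among several annotators with the agreement expected by chance. The sketch below is illustrative only: the function name `fleiss_kappa`, the input layout (one list of labels per item, one label per student), and the sample votes are assumptions, and it is not necessarily the statistic the course tools report.

```python
from collections import Counter


def fleiss_kappa(labels_per_item):
    """Fleiss' kappa for items that were each labeled by the same number of annotators.

    labels_per_item: list of lists; labels_per_item[i] holds every annotator's
    label for item i (hypothetical input layout chosen for this sketch).
    """
    n_items = len(labels_per_item)
    n_raters = len(labels_per_item[0])
    if n_raters < 2:
        raise ValueError("need at least two annotators per item")
    if any(len(item) != n_raters for item in labels_per_item):
        raise ValueError("every item needs the same number of labels")

    counts = [Counter(item) for item in labels_per_item]
    categories = {label for item in labels_per_item for label in item}
    if len(categories) == 1:
        # Chance agreement is 1, so kappa is undefined; return NaN (a choice made here).
        return float("nan")

    # Observed agreement: for each item, the fraction of annotator pairs that agree.
    per_item = [
        (sum(c * c for c in cnt.values()) - n_raters) / (n_raters * (n_raters - 1))
        for cnt in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    total = n_items * n_raters
    p_chance = sum(
        (sum(cnt[cat] for cnt in counts) / total) ** 2 for cat in categories
    )
    return (p_observed - p_chance) / (1 - p_chance)


# Made-up votes: three comments, four students each.
votes = [
    ["toxic", "toxic", "toxic", "toxic"],
    ["toxic", "toxic", "not toxic", "not toxic"],
    ["not toxic", "not toxic", "not toxic", "not toxic"],
]
print(round(fleiss_kappa(votes), 3))  # 0.556
```

Values near 1 indicate strong agreement and values near 0 indicate agreement no better than chance; the split vote on the second comment is what pulls this example down from 1.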

Instructor Dashboard

View real-time student responses, agreement statistics, and text answers during class.

Traditional NLP Tasks

13 worksheets

Toxicity Detection

Label comments as toxic or not toxic. Explore how sarcasm and context affect interpretation.

Classification

Sentiment Analysis

Determine positive, negative, or neutral sentiment. Handle mixed emotions and implicit opinions.

Classification

Sarcasm Detection

Identify sarcastic statements. Learn why context and tone are crucial for detection.

Classification

Emotion Detection

Classify text into emotions such as joy, anger, and fear. Handle multi-label and intensity challenges.

Classification

Stance Detection

Determine if text supports, opposes, or is neutral toward a target. Explore implicit stances.

Classification

Named Entity Recognition

Identify and classify entities such as people, organizations, and locations. Handle boundary decisions.

Sequence Labeling

Event Extraction

Identify events and their participants. Understand triggers and argument structure.

Sequence Labeling

Semantic Role Labeling

Label who did what to whom. Identify agents, patients, and other semantic roles.

Sequence Labeling

Coreference Resolution

Link pronouns and mentions to entities. Handle ambiguous references and bridging.

Sequence Labeling

Temporal Annotation

Mark time expressions and temporal relations. Normalize dates and durations.

Sequence Labeling

Relation Extraction

Identify relationships between entities. Handle implicit and multiple relations.

Classification

Causal Relations

Identify cause-effect relationships. Distinguish correlation from causation.

Classification

Word Sense Disambiguation

Choose the correct meaning of ambiguous words. Explore sense granularity challenges.

Classification

LLM-Related Tasks

12 worksheets

LLM Preference (RLHF)

Compare model outputs and choose the better response. Preference judgments like these are the foundation of RLHF training.

LLM

Hallucination Detection

Identify when LLMs generate false or unsupported information. Critical for AI safety.

LLM

Prompt Quality Assessment

Evaluate prompt clarity, specificity, and safety. Learn what makes prompts effective.

LLM

Summarization Evaluation

Assess summary quality along three dimensions: faithfulness, relevance, and coherence.

Evaluation

Translation Quality

Evaluate machine translation accuracy, fluency, and meaning preservation.

Evaluation

Text Simplification

Evaluate whether simplified text preserves meaning while improving readability.

Evaluation

Code Generation Review

Evaluate LLM-generated code for correctness, efficiency, and best practices.

LLM

Question Answering

Select answer spans and handle unanswerable questions. Build QA datasets.

Classification

Dialogue Act Classification

Label conversational intents such as questions, statements, and requests. Build chatbot training data.

Classification

Argumentation Mining

Identify claims, premises, and argument structure. Evaluate reasoning quality.

Classification

Bias Detection

Identify various forms of bias in text: gender, racial, political, and more.

LLM

Implicit Hate Speech

Detect subtle, coded, and implicit forms of hateful content.

Classification