Resources
Textbooks, tools, and readings
Required Textbook
Natural Language Annotation for Machine Learning: A Guide to Corpus-Building for Applications
O'Reilly Media, 2012
Required Software
- Python 3.9+
- pandas, scikit-learn, transformers (Hugging Face)
- LLM API access (OpenAI, Anthropic, or open-source alternatives)
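A quick way to confirm a local setup meets the software requirements above is a short check script. This is a minimal sketch, not an official course tool: the function name `check_environment` and the constants are hypothetical, and the package names are assumed to match their PyPI distribution names (`pandas`, `scikit-learn`, `transformers`). LLM API access is not checked here.

```python
import sys
from importlib import metadata

# Assumed PyPI distribution names for the required packages listed above.
REQUIRED_PACKAGES = ["pandas", "scikit-learn", "transformers"]
MIN_PYTHON = (3, 9)


def check_environment(packages=REQUIRED_PACKAGES, min_python=MIN_PYTHON):
    """Map each requirement to its installed version, or None if missing or too old."""
    report = {}
    # Report the interpreter version only if it meets the minimum.
    report["python"] = (
        ".".join(map(str, sys.version_info[:3]))
        if sys.version_info[:2] >= min_python
        else None
    )
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, version in check_environment().items():
        print(f"{name}: {version or 'MISSING or too old'}")
```

Any entry reported as missing can typically be installed with `pip install <name>`.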
Annotation Tools
Label Studio (Recommended)
Modern, feature-rich annotation platform with LLM integration support
Free & Open Source
Argilla
Designed for RLHF data collection and feedback annotation
Free & Open Source
brat
Classic annotation tool for sequence labeling and relation annotation
Free & Open Source
Prodigy
Commercial tool with active learning support (if licensed)
Commercial
Recommended LLM Access
- OpenAI API: GPT-4, GPT-4o - platform.openai.com
- Anthropic API: Claude 3.5 Sonnet, Claude 3 Opus - anthropic.com/api
- Open-source: Llama 3, Mistral via Hugging Face or local deployment
Foundational Readings
Inter-annotator Agreement
Handbook of Linguistic Annotation, 2017
Computing Krippendorff's Alpha-Reliability
2011
The Benefits of a Model of Annotation
TACL, 2014
LLM-Based Annotation
ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks
arXiv:2303.15056, 2023
Large Language Models for Data Annotation and Synthesis: A Survey
EMNLP, 2024
AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators
arXiv:2303.16854, 2023
Automated Annotation with Generative AI Requires Validation
arXiv:2306.00176, 2023
RLHF and Preference Learning
Training language models to follow instructions with human feedback
NeurIPS, 2022
Constitutional AI: Harmlessness from AI Feedback
arXiv:2212.08073, 2022
Direct Preference Optimization
NeurIPS, 2023
LLM-as-Judge
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
arXiv:2306.05685, 2023
G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
EMNLP, 2023
Low-Resource and Multilingual
MasakhaNER: Named Entity Recognition for African Languages
TACL, 2021
MEGA: Multilingual Evaluation of Generative AI
EMNLP, 2023
Brandeis Library Resources
The Brandeis Library offers resources and services including:
- Research & Instructional Services
- Digital Scholarship Lab
- Citation and research assistance