Worksheet: Relation Extraction

Identifying semantic relationships between entities in text
Course: Natural Language Annotation for Machine Learning
Task Type: Structured annotation (entity pairs + relations)
Author: Jin Zhao

Background

Relation extraction identifies semantic relationships between entities mentioned in text, creating structured knowledge from unstructured data.

Relation Triple: (Subject Entity, Relation Type, Object Entity)

Example: "Steve Jobs founded Apple" → (Steve Jobs, FOUNDED, Apple)
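
A triple like the one above maps naturally onto a small record type. The sketch below is one possible representation in Python; the class name Triple and its field names are illustrative choices, not part of the worksheet.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One relation triple: (subject entity, relation type, object entity)."""
    subject: str
    relation: str
    obj: str


# "Steve Jobs founded Apple"
t = Triple("Steve Jobs", "FOUNDED", "Apple")

# Order matters: swapping subject and object gives a different triple.
swapped = Triple("Apple", "FOUNDED", "Steve Jobs")

print(t.subject, t.relation, t.obj)
print(t == swapped)  # False
```

Because the record is a tuple, triples can be compared and stored in sets, which is convenient when comparing annotations later.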

Common Relation Types

Relation        Description            Example
WORKS_FOR       Employment             (John, WORKS_FOR, Google)
LOCATED_IN      Physical location      (Paris, LOCATED_IN, France)
BORN_IN         Birthplace             (Einstein, BORN_IN, Germany)
FOUNDED         Created organization   (Gates, FOUNDED, Microsoft)
SPOUSE          Married to             (Obama, SPOUSE, Michelle)
SUBSIDIARY_OF   Owned by               (Instagram, SUBSIDIARY_OF, Meta)

Entity Types

PERSON, ORGANIZATION, LOCATION, DATE, MISC
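
Relation types and entity types can be combined into a schema that states which entity types each relation may connect. The sketch below is a minimal example of such a check. The type signatures in SCHEMA are assumptions made for illustration (the worksheet does not define them), and is_valid is a hypothetical helper name.

```python
# Hypothetical type signatures: relation -> (allowed subject types, allowed object types).
# These constraints are assumptions; a real annotation guideline would define its own.
SCHEMA = {
    "WORKS_FOR": ({"PERSON"}, {"ORGANIZATION"}),
    "LOCATED_IN": ({"LOCATION", "ORGANIZATION"}, {"LOCATION"}),
    "BORN_IN": ({"PERSON"}, {"LOCATION"}),
    "FOUNDED": ({"PERSON"}, {"ORGANIZATION"}),
    "SPOUSE": ({"PERSON"}, {"PERSON"}),
    "SUBSIDIARY_OF": ({"ORGANIZATION"}, {"ORGANIZATION"}),
}


def is_valid(subject_type: str, relation: str, object_type: str) -> bool:
    """Return True if the relation is in the schema and both entity types are allowed."""
    if relation not in SCHEMA:
        return False
    subject_types, object_types = SCHEMA[relation]
    return subject_type in subject_types and object_type in object_types


print(is_valid("PERSON", "WORKS_FOR", "ORGANIZATION"))  # True, e.g. (John, WORKS_FOR, Google)
print(is_valid("ORGANIZATION", "WORKS_FOR", "PERSON"))  # False: subject and object are reversed
print(is_valid("PERSON", "MENTORS", "PERSON"))          # False: relation is not in the schema
```

A check like this catches reversed arguments early, and it makes visible that a relation missing from the schema cannot be annotated at all.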

Part 1: Basic Relation Identification

Elon Musk is the CEO of Tesla, which is headquartered in Austin, Texas.

Question 1

Identify all relation triples in the sentence:

Triple 1: ( ______ , ______ , ______ )
Triple 2: ( ______ , ______ , ______ )

Part 2: Implicit vs. Explicit Relations

Dr. Sarah Chen, a professor at MIT, published her findings in Nature last month.

Question 2

Which relations are explicitly stated vs. inferred?

Relation                                  Explicit or Inferred?
(Dr. Sarah Chen, WORKS_FOR, MIT)          ______
(Dr. Sarah Chen, PUBLISHED_IN, Nature)    ______
(Dr. Sarah Chen, PROFESSION, professor)   ______

Part 3: Relation Directionality

Amazon acquired Whole Foods in 2017.

Question 3

What is the correct direction for the acquisition relation?
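
Direction matters because (A, R, B) and (B, R, A) are different claims. One way to keep annotations consistent is to choose a canonical direction for each relation and convert inverse labels to it. The sketch below uses the SUBSIDIARY_OF and SPOUSE examples from the relation table. The inverse label PARENT_OF, the choice to treat SPOUSE as symmetric, and the function name canonicalize are all assumptions made for illustration.

```python
# Hypothetical inverse labels, each mapped to the canonical relation in the schema.
INVERSE_OF = {"PARENT_OF": "SUBSIDIARY_OF"}

# Relations assumed to hold in both directions.
SYMMETRIC = {"SPOUSE"}


def canonicalize(triple):
    """Rewrite a triple so that equivalent annotations become identical."""
    subj, rel, obj = triple
    if rel in INVERSE_OF:
        # Flip the arguments and use the canonical label.
        return (obj, INVERSE_OF[rel], subj)
    if rel in SYMMETRIC:
        # Direction carries no meaning, so order the arguments alphabetically.
        first, second = sorted([subj, obj])
        return (first, rel, second)
    return triple


print(canonicalize(("Meta", "PARENT_OF", "Instagram")))
# ('Instagram', 'SUBSIDIARY_OF', 'Meta')
print(canonicalize(("Obama", "SPOUSE", "Michelle")))
# ('Michelle', 'SPOUSE', 'Obama')
```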

Part 4: Nested and Complex Relations

Bill Gates and Paul Allen co-founded Microsoft in Albuquerque.

Question 4

How should co-founding be represented?

What about the location relation?

Part 5: Negative and Uncertain Relations

John used to work at Google but now works at Meta. Mary might join Apple next year.

Question 5

How should these relations be annotated?

Statement                    Relation Status
(John, WORKS_FOR, Google)    ______
(John, WORKS_FOR, Meta)      ______
(Mary, WORKS_FOR, Apple)     ______
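
A bare triple cannot say whether a relation holds now, held in the past, or is only possible, so an annotation scheme may attach a status attribute to each triple. The sketch below shows one way to store such an attribute. The Status labels, the AnnotatedTriple class, and the example entities (Alice, Acme, Initech) are hypothetical; they are not the expected answers to Question 5.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    # Hypothetical label set; a real guideline would define and document its own.
    CURRENT = "current"
    PAST = "past"
    UNCERTAIN = "uncertain"


@dataclass(frozen=True)
class AnnotatedTriple:
    subject: str
    relation: str
    obj: str
    status: Status


annotations = [
    AnnotatedTriple("Alice", "WORKS_FOR", "Acme", Status.PAST),
    AnnotatedTriple("Alice", "WORKS_FOR", "Initech", Status.CURRENT),
]

# Keep only relations marked as holding now, e.g. before exporting to a knowledge base.
current = [a for a in annotations if a.status is Status.CURRENT]
print([(a.subject, a.relation, a.obj) for a in current])
# [('Alice', 'WORKS_FOR', 'Initech')]
```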

Part 6: Annotate This Passage

Tim Cook, who succeeded Steve Jobs as CEO of Apple, announced that the company would invest $1 billion in North Carolina. Apple, based in Cupertino, has been expanding its operations across the United States.

Question 6

Extract all relation triples (aim for at least 5):

Part 7: Group Discussion

Question 7

Compare your annotations with your group. Where did you disagree?
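
If each annotator's triples are stored as a set, disagreements can be listed mechanically. The sketch below compares two annotators by exact match and reports an F1 score, treating annotator A as the reference. The two annotation sets are made up from the relation table examples for illustration, and compare is a hypothetical helper name. Exact match is only one possible agreement criterion.

```python
def compare(annotator_a, annotator_b):
    """Compare two collections of triples by exact match, with A as the reference."""
    a, b = set(annotator_a), set(annotator_b)
    agreed = a & b
    precision = len(agreed) / len(b) if b else 0.0
    recall = len(agreed) / len(a) if a else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"agreed": agreed, "only_a": a - b, "only_b": b - a, "f1": f1}


# Made-up annotations for illustration only.
annotator_a = [
    ("John", "WORKS_FOR", "Google"),
    ("Paris", "LOCATED_IN", "France"),
    ("Gates", "FOUNDED", "Microsoft"),
]
annotator_b = [
    ("John", "WORKS_FOR", "Google"),
    ("France", "LOCATED_IN", "Paris"),  # same entities, reversed direction
    ("Gates", "FOUNDED", "Microsoft"),
    ("Instagram", "SUBSIDIARY_OF", "Meta"),
]

result = compare(annotator_a, annotator_b)
print("Only A:", result["only_a"])
print("Only B:", result["only_b"])
print("F1:", round(result["f1"], 2))  # 0.57
```

Here the reversed LOCATED_IN triple counts as a disagreement for both annotators, which shows how a direction error is penalized twice under exact matching.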

Part 8: Reflection

Question 8

Why is relation extraction difficult?

Key Takeaway

Relation extraction turns unstructured text into structured knowledge, but the result depends on a series of annotation decisions:

  • The relation schema determines what can be captured
  • Implicit relations require inference and increase annotator burden
  • Directionality and temporal aspects need clear guidelines
  • Entity identification and relation extraction are interdependent
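
The last point can be made concrete: relation candidates are usually built from the entities that were identified first, so a missed entity removes every relation that involves it. The sketch below enumerates candidate pairs for the sentence "Steve Jobs founded Apple". The function name candidate_pairs is hypothetical, and pairing every entity with every other entity is a simplifying assumption.

```python
from itertools import permutations


def candidate_pairs(entities):
    """Return every ordered (subject, object) pair of distinct entity mentions."""
    return list(permutations(entities, 2))


# Entities identified in "Steve Jobs founded Apple".
entities = [("Steve Jobs", "PERSON"), ("Apple", "ORGANIZATION")]

pairs = candidate_pairs(entities)
for subject, obj in pairs:
    print(subject, "->", obj)

# If the entity step misses "Apple", no candidate pair remains,
# so the FOUNDED relation can no longer be annotated.
print(candidate_pairs([("Steve Jobs", "PERSON")]))  # []
```

With n entities there are n * (n - 1) ordered pairs, which is one reason annotation effort grows quickly on long passages.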