Worksheet: Coreference Resolution

Linking mentions that refer to the same entity
Course: Natural Language Annotation for Machine Learning
Task Type: Structured annotation (clustering)
Author: Jin Zhao

Background

Coreference resolution identifies when different expressions in text refer to the same real-world entity.

Coreference: Two or more expressions (mentions) refer to the same entity in the world.

Mention: A noun phrase, pronoun, or name that refers to an entity.

Cluster: A group of mentions that all refer to the same entity.
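The three definitions above map naturally onto a small data structure. The sketch below is one possible encoding, not a prescribed annotation format: the names Mention, find_mention, and corefer are hypothetical, the sentence is invented for illustration, and mentions are stored as character spans so that two occurrences of the same word stay distinct.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """A span of text that refers to an entity (end offset is exclusive)."""
    start: int
    end: int
    text: str


def find_mention(doc: str, text: str, occurrence: int = 0) -> Mention:
    """Locate the n-th occurrence of `text` in `doc` and wrap it as a Mention."""
    pos = -1
    for _ in range(occurrence + 1):
        pos = doc.find(text, pos + 1)
        if pos == -1:
            raise ValueError(f"{text!r} not found in document")
    return Mention(pos, pos + len(text), text)


# Hypothetical sentence, not one of the worksheet exercises.
doc = "Ada wrote a program. She tested it twice."

# A cluster is a set of mentions that all refer to the same entity.
# The dictionary keys are arbitrary labels chosen for readability.
clusters = {
    "ada": {find_mention(doc, "Ada"), find_mention(doc, "She")},
    "program": {find_mention(doc, "a program"), find_mention(doc, "it")},
}


def corefer(a: Mention, b: Mention) -> bool:
    """Two mentions corefer if some cluster contains both of them."""
    return any(a in c and b in c for c in clusters.values())
```

With this encoding, coreference is simply shared cluster membership: corefer(find_mention(doc, "Ada"), find_mention(doc, "She")) is true, while pairing "Ada" with "it" is false.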

Types of Referring Expressions

Part 1: Basic Coreference

[Barack Obama] was the 44th president. [He] served two terms. [Michelle Obama] was [his] wife. [She] was also a lawyer.
Question 1

The example above shows two coreference clusters. List the mentions in each:

Cluster 1 (Barack Obama):
Cluster 2 (Michelle Obama):

Part 2: Ambiguous Pronouns

[John] told [Bill] that [he] would be late.
Question 2

Who does "he" refer to?

Part 3: Generic vs. Specific Reference

[Dogs] are loyal animals. [My dog] certainly is. [He] follows me everywhere.
Question 3

Does "Dogs" (a generic reference to dogs as a kind) corefer with "My dog" (a reference to one specific dog)?

Part 4: Bridging and Near-Coreference

I visited [a restaurant] last night. [The food] was excellent, but [the waiter] was rude.
Question 4

"The food" and "the waiter" are related to "a restaurant" but do not refer to the same entity. This kind of associative relation is called bridging. Should these mentions be linked to "a restaurant" in a coreference annotation?

Part 5: Events and Abstract Entities

[The company announced layoffs]. [This] shocked employees. [The decision] was made by [the board].
Question 5

Do "The company announced layoffs", "This", and "The decision" all refer to the same thing?

Part 6: Winograd Schema Challenge

[The city councilmen] refused to give [the demonstrators] a permit because [they] feared violence.
Question 6a

Who does "they" refer to?

[The city councilmen] refused to give [the demonstrators] a permit because [they] advocated violence.
Question 6b

Now who does "they" refer to?

Part 7: Annotate This Passage

[Apple] released [a new iPhone] yesterday. [The company] said [it] would be available next month. [Tim Cook] presented [the device] at [their headquarters]. [He] called [it] "revolutionary."
Question 7

Identify all coreference clusters:

Cluster 1:
Cluster 2:
Cluster 3:
Cluster 4:

Part 8: Group Discussion

Question 8

Where did your group disagree?
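One way to locate disagreements concretely is to compare the coreference links each annotator produced. The sketch below is an illustration under stated assumptions: the helper name links is hypothetical, the two annotations are invented readings of the Part 2 sentence, and mentions are written as plain strings for brevity (a real annotation would use spans so that repeated words stay distinct).

```python
from itertools import combinations


def links(clusters):
    """Turn a list of clusters into the set of unordered coreferent pairs."""
    pairs = set()
    for cluster in clusters:
        # Sorting gives each pair one canonical order, so (a, b) == (a, b).
        for a, b in combinations(sorted(cluster), 2):
            pairs.add((a, b))
    return pairs


# Two hypothetical annotators resolving "he" differently.
annotator_a = [{"John", "he"}, {"Bill"}]
annotator_b = [{"John"}, {"Bill", "he"}]

links_a = links(annotator_a)
links_b = links(annotator_b)

# Links that only one annotator drew are the points of disagreement.
only_a = links_a - links_b
only_b = links_b - links_a
agreed = links_a & links_b

print("Only annotator A:", only_a)
print("Only annotator B:", only_b)
print("Agreed links:", agreed)
```

Comparing links rather than whole clusters makes it easier to point at the exact mention pair a group should discuss.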

Part 9: Reflection

Question 9

Why is coreference resolution difficult?

Key Takeaway

Coreference is about reference, not similarity, and reference depends on context, knowledge, and inference.

  • Two expressions can be very different but refer to the same entity
  • Resolution often requires reasoning about the world, not just text
  • The boundary between coreference and related concepts (bridging, generics) is fuzzy
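The first takeaway can be checked with a quick experiment. The sketch below uses Python's standard difflib module as a stand-in for surface similarity. That choice is an assumption made for illustration, and the function name surface_similarity is hypothetical. The mentions come from the Part 1 example.

```python
from difflib import SequenceMatcher


def surface_similarity(a: str, b: str) -> float:
    """Character-level similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Same entity, very different surface forms.
same_entity = surface_similarity("Barack Obama", "He")

# Different entities, overlapping surface forms.
different_entities = surface_similarity("Barack Obama", "Michelle Obama")

print(f"same entity:        {same_entity:.2f}")
print(f"different entities: {different_entities:.2f}")
```

The coreferent pair scores lower than the non-coreferent pair, which is why string matching alone cannot stand in for coreference resolution.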