Codeq NLP API Tutorial 8

Part 8. Semantic Roles

By Rodrigo Alarcón, Computational Linguist

In this tutorial on Codeq’s NLP API we focus on a single module that extracts semantic roles from text. Previous tutorials in this series can be found here:

The complete list of modules we offer can be found in our documentation:

Codeq NLP API Documentation

Define an NLP pipeline and analyze a text

As usual, the first step is to declare an instance of the NLP API client and use it to send a text along with a pipeline variable that names the Semantic Roles annotator. The output is a Document object that contains a list of analyzed Sentences. For a quick look at the output, call the method document.pretty_print().
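
A minimal sketch of this step, assuming the CodeqClient class and the client.analyze(text, pipeline=...) call used in the wrap-up code at the end of this tutorial. The helper name analyze_semantic_roles is hypothetical and only wraps that call; the client is passed in as a parameter so the function can be tried with any object that exposes analyze.

```python
# Pipeline containing only the Semantic Roles annotator (key: "semantic_roles").
SEMANTIC_ROLES_PIPELINE = ["semantic_roles"]


def analyze_semantic_roles(client, text):
    """Hypothetical helper: analyze `text` with the Semantic Roles annotator only.

    `client` is assumed to behave like CodeqClient, i.e. to expose
    analyze(text, pipeline=...) and to return a Document.
    """
    return client.analyze(text, pipeline=SEMANTIC_ROLES_PIPELINE)


# Usage with the real client (needs valid credentials, so it is left commented out):
# from codeq_nlp_api import CodeqClient
# client = CodeqClient(user_id="USER_ID", user_key="USER_KEY")
# document = analyze_semantic_roles(client, "A pneumonia outbreak was reported in Wuhan, China in December 2019.")
# document.pretty_print()
```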


Semantic Role Labelling

The goal of this module is to identify the main events and participants in sentences and classify the different types of relations between them. In the extraction of Semantic Roles, events are called predicates, while the participants are known as the arguments of a given predicate. Each argument denotes a specific type of relation to the predicate: for example, it can be an Agent, a Patient or a Location.

More details about the Semantic Role Labeler and an example of its application can be found here:

Exploring CORD-19 with Codeq NLP API and Semantic Roles

KEY: semantic_roles
ATTR: sentence.semantic_roles

Output Labels:

Agent/Experiencer
Patient/Theme
Instrument/Beneficiary/Goal
StartingPoint/Attribute
EndingPoint
Location
Purpose
Cause
Temporal
Modifier
Negative
GenericArgument
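
To make the labels concrete, the sketch below groups the arguments of one predicate by their label. It assumes each semantic role is a dictionary with an optional 'arguments' list whose items carry 'type' and 'tokens', which is the shape shown in the example output of this tutorial. The sample dictionary is copied from that output, and group_arguments_by_label is a hypothetical helper, not part of the API.

```python
from collections import defaultdict

# Sample role copied from the example output of this tutorial.
sample_role = {
    "predicate_lemma": "report",
    "predicate_token": "reported",
    "predicate_position": 5,
    "arguments": [
        {"type": "Patient/Theme", "tokens": ["A", "pneumonia", "outbreak"], "positions": [1, 2, 3]},
        {"type": "Location", "tokens": ["in", "Wuhan", ",", "China"], "positions": [6, 7, 8, 9]},
        {"type": "Temporal", "tokens": ["in", "December", "2019"], "positions": [10, 11, 12]},
    ],
}


def group_arguments_by_label(role):
    """Map each output label to the token lists of the arguments that carry it."""
    grouped = defaultdict(list)
    # A predicate may have no arguments at all, so fall back to an empty list.
    for arg in role.get("arguments", []):
        grouped[arg["type"]].append(arg["tokens"])
    return dict(grouped)


grouped = group_arguments_by_label(sample_role)
print(grouped["Location"])
print(grouped["Temporal"])
```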


pipe = [
    "semantic_roles"
]

text = "A pneumonia outbreak was reported in Wuhan, China in December 2019."

document = client.analyze(text, pipeline=pipe)

for sentence in document.sentences:
    raw_sentence = sentence.raw_sentence
    semantic_roles = sentence.semantic_roles

    print("sentence: %s\n" % raw_sentence)
    print("semantic_roles:")
    for sr in semantic_roles:
        predicate_lemma = sr['predicate_lemma']
        predicate_token = sr['predicate_token']
        predicate_position = sr['predicate_position']
        print("")
        print("predicate_lemma: %s" % predicate_lemma)
        print("predicate_token: %s" % predicate_token)
        print("predicate_position: %s" % predicate_position)
        if 'arguments' in sr:
            print("arguments:")
            for arg in sr['arguments']:
                arg_type = arg['type']
                arg_tokens = arg['tokens']
                arg_tokens_position = arg['positions']
                print("- type: %s" % arg_type)
                print("- tokens: %s" % arg_tokens)
                print("- positions: %s\n" % arg_tokens_position)

# Output:
# 
# sentence: A pneumonia outbreak was reported in Wuhan, China in December 2019.
# 
# semantic_roles:
# 
# predicate_lemma: be
# predicate_token: was
# predicate_position: 4
# 
# predicate_lemma: report
# predicate_token: reported
# predicate_position: 5
# arguments:
# - type: Patient/Theme
# - tokens: ['A', 'pneumonia', 'outbreak']
# - positions: [1, 2, 3]
# 
# - type: Location
# - tokens: ['in', 'Wuhan', ',', 'China']
# - positions: [6, 7, 8, 9]
# 
# - type: Temporal
# - tokens: ['in', 'December', '2019']
# - positions: [10, 11, 12]

From the output above we can observe the following:

All semantic roles contain a predicate and, if present, a list of arguments for that predicate.
Predicates include the token (the inflected form found in the text), the lemma (its base form) and the position of the token in the sentence.
Each argument contains its type (see Output Labels above), the tokens of that argument and the positions of those tokens in the sentence.
All token positions in the sentence start from 1, not from 0 as Python list indexes do.
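
Because positions are 1-based, they have to be shifted by one before they are used as Python list indexes. A minimal sketch, assuming the tokenized sentence is available as a plain list; the token list here is written by hand to match the example output, and tokens_at is a hypothetical helper.

```python
# Hand-written token list for the example sentence; the positions in the
# example output refer to these tokens, counting from 1.
sentence_tokens = ["A", "pneumonia", "outbreak", "was", "reported", "in",
                   "Wuhan", ",", "China", "in", "December", "2019", "."]


def tokens_at(tokens, positions):
    """Return the tokens found at the given 1-based positions."""
    return [tokens[position - 1] for position in positions]


# Location argument from the example output: positions [6, 7, 8, 9].
print(tokens_at(sentence_tokens, [6, 7, 8, 9]))
# The predicate "reported" has predicate_position 5.
print(tokens_at(sentence_tokens, [5]))
```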

Wrap up

In this tutorial we described how to use the Semantic Role Labeler of the Codeq NLP API. The code below summarizes how to iterate over its output:


from codeq_nlp_api import CodeqClient

client = CodeqClient(user_id="USER_ID", user_key="USER_KEY")

pipe = [
    "semantic_roles"
]

text = "A pneumonia outbreak was reported in Wuhan, China in December 2019."

document = client.analyze(text, pipeline=pipe)

for sentence in document.sentences:
    raw_sentence = sentence.raw_sentence
    semantic_roles = sentence.semantic_roles

    print("sentence: %s\n" % raw_sentence)
    print("semantic_roles:")
    for sr in semantic_roles:
        predicate_lemma = sr['predicate_lemma']
        predicate_token = sr['predicate_token']
        predicate_position = sr['predicate_position']
        print("")
        print("predicate_lemma: %s" % predicate_lemma)
        print("predicate_token: %s" % predicate_token)
        print("predicate_position: %s" % predicate_position)
        if 'arguments' in sr:
            print("arguments:")
            for arg in sr['arguments']:
                arg_type = arg['type']
                arg_tokens = arg['tokens']
                arg_tokens_position = arg['positions']
                print("- type: %s" % arg_type)
                print("- tokens: %s" % arg_tokens)
                print("- positions: %s\n" % arg_tokens_position)
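
Instead of printing, the same loop can collect the roles into a flat structure that is easier to filter or export. The following sketch assumes the dictionary keys used in the code above; roles_to_triples is a hypothetical helper, and the sample list is copied from the example output rather than fetched from the API.

```python
def roles_to_triples(semantic_roles):
    """Flatten semantic roles into (predicate_lemma, argument_type, argument_text) tuples."""
    triples = []
    for sr in semantic_roles:
        # Predicates without arguments (here the auxiliary "was") contribute nothing.
        for arg in sr.get("arguments", []):
            triples.append((sr["predicate_lemma"], arg["type"], " ".join(arg["tokens"])))
    return triples


# Sample data copied from the example output of this tutorial.
sample_roles = [
    {"predicate_lemma": "be", "predicate_token": "was", "predicate_position": 4},
    {
        "predicate_lemma": "report",
        "predicate_token": "reported",
        "predicate_position": 5,
        "arguments": [
            {"type": "Patient/Theme", "tokens": ["A", "pneumonia", "outbreak"], "positions": [1, 2, 3]},
            {"type": "Location", "tokens": ["in", "Wuhan", ",", "China"], "positions": [6, 7, 8, 9]},
            {"type": "Temporal", "tokens": ["in", "December", "2019"], "positions": [10, 11, 12]},
        ],
    },
]

for triple in roles_to_triples(sample_roles):
    print(triple)
```

With the real API, the list passed to the helper would be sentence.semantic_roles for each sentence in document.sentences.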

Take a look at our documentation to learn more about the NLP tools we provide.

Do you need inspiration? Go to our use case demos and see how you can integrate different tools.

In our NLP demos section you can also try our tools and find examples of the output of each module.
