By David Schueler, Computational Linguist

Semantic Role Labeler Argument Categories

As I discuss in this article, we recently decided to make major revisions to the corpora we use to train some of our NLP tools. Specifically, we applied major corrections to the corpora for part-of-speech tagging and for dependency parsing; for the latter, see also this article.

For our semantic role labeler, we revised the data less aggressively. Our concern here was the viability of the set of categories available to label predicate/argument relations, hereafter called “argument types”, rather than the accuracy and consistency of the labeled data per se. To that end, we eliminated some of the argument types present in the original annotated corpus, whose categories correspond roughly to those given in Bonial et al. 2010.

The predicate/argument pairs that carried the eliminated labels were re-labeled with other existing labels. In a few cases this was done by hand, by inspecting individual sentences to see which new label made the most sense. In most other cases, the pairs were re-labeled automatically based on their original label, via a merger rule: all predicate/argument pairs originally labeled with argument type X are now labeled with argument type Y.
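The re-labeling procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the code we actually ran: the labels "X1", "X2", and "Y" are placeholders rather than labels from the corpus, and the tuple layout for a labeled pair and the function name relabel are assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# A labeled pair: (sentence_id, predicate_index, argument_index, label).
# This layout is an assumption for illustration only.
Pair = Tuple[int, int, int, str]
PairKey = Tuple[int, int, int]

# Hypothetical merger rules: eliminated label -> surviving label.
# "X1", "X2", and "Y" are placeholders, not actual corpus labels.
MERGER_RULES: Dict[str, str] = {"X1": "Y", "X2": "Y"}


def relabel(
    pairs: List[Pair],
    rules: Dict[str, str],
    manual: Optional[Dict[PairKey, str]] = None,
) -> List[Pair]:
    """Apply hand-made decisions first, then fall back to merger rules."""
    manual = manual or {}
    result: List[Pair] = []
    for sent_id, pred, arg, label in pairs:
        key = (sent_id, pred, arg)
        if key in manual:
            # A label chosen by inspecting the individual sentence.
            new_label = manual[key]
        else:
            # Labels without a merger rule are kept unchanged.
            new_label = rules.get(label, label)
        result.append((sent_id, pred, arg, new_label))
    return result
```

Keeping the hand-made decisions in a separate table means they take precedence over the automatic rules, and the merger can be re-run without losing them.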

From this process, we arrived at a new, reduced set of argument types. We then revised the names of the remaining types to better reflect the semantic range that they cover. Because some of the argument types absorbed predicate/argument pairs originally labeled differently, their semantic range expanded, which called for a different name.

In this article, I therefore give an exhaustive list of the currently supported argument types that result from this revision. For each type, I give its name, a brief description, which may include a few different semantic criteria, and one or more examples that illustrate some of those criteria.

In the example sentences, the predicate is italicized, while the argument is given in boldface. The sentences are given as tokenized strings.

Argument type: Agent/Experiencer

(1) People send me emails all of the time .

(2) Outdoors , they can also personally experience wetland ecology .

Argument type: Patient/Theme/Affected

(3) People send me emails all of the time .

(4) He also said investment by businesses is falling off .

(5) Tom Mintier has details .

(6) Well that ’s a deal .

Argument type: Beneficiary/Goal/Predicate/Comitative

(7) People send me emails all of the time .

(8) Well that ’s a deal .

(9) So he went in to stay with them .

Argument type: Destination/EndingPoint/Source

(10) the first night we went over to the Inner Harbor

(11) These figures come to us from Abu Dhabi television .

(12) Some estimates have gone as high as 80,000 members .

(13) They will break them to pieces like clay pots .

Argument type: Location

(14) Outdoors , they can also personally experience wetland ecology .

(15) There ’s nothing to report on that front .

Argument type: Speaker/Addressee/Conjunction/Interjection

(16) Jim : I like the graphic ,

(17) is that what it is Alex ?

(18) But they allowed me to drive on .

(19) He also said investment by businesses is falling off .

(20) Well that ’s a deal .

(21) oh they do not remember your brother .

Argument type: Manner/Means/Extent

(22) Outdoors , they can also personally experience wetland ecology .

(23) You have made yourselves pure by obeying the truth .

(24) Demas loved this world too much .

Argument type: Modal

(25) Outdoors , they can also personally experience wetland ecology .

Argument type: Cause

(26) But they gave much because of their great joy .

Argument type: Temporal

(27) People send me emails all of the time .

Argument type: EventModifier/Purpose

(28) Well Republicans certainly think so .

(29) You probably knew that .

(30) So you still have your office as your office ?

(31) I called Atlanta to get background on this guy .

Argument type: Negative

(32) These are not certain , either .

(33) There the fire never stops .
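For reference, the twelve argument types listed above can be collected into a small lookup table. The sketch below is only an illustration: the type names are copied from this article, but the constants ARGUMENT_TYPES and ROLE_TO_TYPE are hypothetical, and whether the tool emits exactly these strings is an assumption.

```python
# The twelve argument types, with names as given in this article.
ARGUMENT_TYPES = [
    "Agent/Experiencer",
    "Patient/Theme/Affected",
    "Beneficiary/Goal/Predicate/Comitative",
    "Destination/EndingPoint/Source",
    "Location",
    "Speaker/Addressee/Conjunction/Interjection",
    "Manner/Means/Extent",
    "Modal",
    "Cause",
    "Temporal",
    "EventModifier/Purpose",
    "Negative",
]

# Map each component of a slash-separated name to its full argument type,
# so a single role such as "Goal" can be looked up directly.
ROLE_TO_TYPE = {
    role: arg_type
    for arg_type in ARGUMENT_TYPES
    for role in arg_type.split("/")
}
```

With this table, ROLE_TO_TYPE["Goal"] gives "Beneficiary/Goal/Predicate/Comitative", which makes it easy to see which merged type covers a given role.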

We believe that this set of predicate/argument relation categories effectively captures the distinctions that our semantic role labeler makes. We hope this guide is a useful reference for using the tool.

References