Part 4. Abuse classifier

By Rodrigo Alarcón, Computational Linguist

In this tutorial we showcase the most recent module from Codeq’s NLP API, which identifies abusive and harmful content in texts.

This is a follow-up tutorial; previous content can be found here:

The complete list of modules we offer can be found in our documentation:

Codeq NLP API Documentation

Define an NLP pipeline and analyze a text

As usual, the first thing to do is to create an instance of the Codeq Client and declare a pipeline containing the NLP annotators we are interested in. The client receives as input a text to be analyzed, along with the declared pipeline. The output is a Document object that contains a list of Sentences; each sentence carries its own predicted abuse labels.

To print a quick overview of the results, you can use the method document.pretty_print(), which we will explain in detail in the following sections.

FULL DISCLAIMER: we do not endorse the content of the examples used here.

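A minimal sketch of such a call is shown below. The package name (codeq_nlp_api), the CodeqClient class, and the analyze signature follow Codeq’s documentation but are assumptions here, and the credentials are placeholders, so the network call sits under the __main__ guard.

```python
# Minimal pipeline sketch; client and package names are assumptions
# based on Codeq's documentation, and an API key is required.

TEXT = "You are a moron. Have a nice day."
PIPELINE = ["abuse"]  # KEY of the abuse classifier

def build_request(text, pipeline):
    """Bundle the two inputs the client receives (illustrative only)."""
    return {"text": text, "pipeline": pipeline}

if __name__ == "__main__":
    from codeq_nlp_api import CodeqClient  # assumed package name
    client = CodeqClient(user_id="<USER_ID>", user_key="<USER_KEY>")
    document = client.analyze(TEXT, PIPELINE)
    document.pretty_print()  # quick overview of per-sentence labels
```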

In this tutorial we are going to focus on two annotators: the abuse classifier and a tool to preprocess tweets. Specifically, we will describe:

  • the keyword (KEY) used to call each annotator,
  • the attribute (ATTR) where the output is stored,
  • the OUTPUT LABELS of the abuse classifier.

Abuse Classifier

The goal of this module is to automatically analyze texts that need to be reviewed for abusive and harmful language, mainly in the context of user-generated content or social communities.

More specific details about this annotator can be found here:

Detecting Abuse Online

  • KEY: abuse
  • ATTR: sentence.abuse

Output Labels

  • Offensive: sentences that contain profanity or that could be perceived as disrespectful.
  • Obscene/scatologic: sentences containing sexual content or references to bodily excretions.
  • Threatening/hostile: sentences that can be perceived as conveying a desire to inflict harm on somebody.
  • Insult: sentences containing insults.
  • Hate speech/racist: textual material that attacks a person or a group of people based on their actual or perceived race, ethnicity, nationality, religion, sex, gender identity, sexual orientation or disability, and that ultimately incites hatred or violence against them.
  • Unknown abuse: other types of abuse that cannot be classified under one of the labels above.
  • Non-abusive: sentences that fall into none of the categories above.
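As a sketch of how these labels might be consumed: the payload shape below (one list of label strings per sentence) is an assumption for illustration, not the API’s documented schema.

```python
# Illustrative helper for filtering sentences flagged by the
# classifier; the dict shape of each sentence is an assumption.

def flagged_sentences(sentences):
    """Return (text, labels) pairs for sentences not labeled Non-abusive."""
    return [
        (s["raw"], s["abuse"])
        for s in sentences
        if s["abuse"] != ["Non-abusive"]
    ]

# Hand-made predictions shaped like the classifier's output labels:
sentences = [
    {"raw": "Have a great day!", "abuse": ["Non-abusive"]},
    {"raw": "You are a moron.", "abuse": ["Offensive", "Insult"]},
]
print(flagged_sentences(sentences))
# [('You are a moron.', ['Offensive', 'Insult'])]
```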

Twitter Preprocessor

This annotator can be used to preprocess and clean tweets, e.g., to remove artifacts like user mentions and URLs, segment hashtags into tokens and generate a list of clean words from a tweet. The output of this annotator is also stored at the Sentence level.

This module can improve the output of the abuse classifier, since some hashtags may contain important information for the detection of abusive content.

  • KEY: twitter_preprocess
  • ATTR: sentence.tokens_clean
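The cleaning itself happens server-side in the twitter_preprocess annotator; the regex helper below is only a local sketch of the kinds of artifacts it strips (user mentions, URLs, the “#” of hashtags). Unlike the real module, it makes no attempt to segment a hashtag into separate words.

```python
import re

# Rough local illustration of tweet cleaning; the actual annotator
# additionally segments hashtags into tokens.

def rough_clean(tweet):
    tweet = re.sub(r"https?://\S+", "", tweet)  # drop URLs
    tweet = re.sub(r"@\w+", "", tweet)          # drop user mentions
    tweet = tweet.replace("#", "")              # keep hashtag words
    return " ".join(tweet.split())

tweet = "@user This is outrageous #whitepride https://t.co/xyz"
print(rough_clean(tweet))  # This is outrageous whitepride
```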

For example, the hashtags #WhiteLivesMatter and #whitepride are segmented as “white lives matter” and “white pride”, respectively.

Wrap Up

In this tutorial we described the Abuse classifier of the Codeq NLP API and a tool to clean tweets.

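To recap, here is a sketch that calls both annotators together and accesses their output attributes. As before, the package and client names are assumptions based on Codeq’s documentation, so the API call is kept under the __main__ guard; the SimpleNamespace stand-in only mimics the two Sentence attributes used in this tutorial.

```python
from types import SimpleNamespace

PIPELINE = ["twitter_preprocess", "abuse"]

def summarize(sentences):
    """Pair each sentence's clean tokens with its abuse labels.

    Works on any objects exposing the two attributes described in
    this tutorial: tokens_clean and abuse.
    """
    return [(s.tokens_clean, s.abuse) for s in sentences]

# Stand-in sentence with the two attributes used above:
demo = SimpleNamespace(tokens_clean=["white", "pride"],
                       abuse=["Hate speech/racist"])
print(summarize([demo]))
# [(['white', 'pride'], ['Hate speech/racist'])]

if __name__ == "__main__":
    from codeq_nlp_api import CodeqClient  # assumed package name
    client = CodeqClient(user_id="<USER_ID>", user_key="<USER_KEY>")
    document = client.analyze("<tweet text>", PIPELINE)
    for tokens, labels in summarize(document.sentences):
        print(tokens, labels)
```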

Take a look at our documentation to learn more about the NLP tools we provide.

Do you need inspiration? Go to our use case demos and see how you can integrate different tools.

In our NLP demos section you can also try our tools and find examples of the output of each module.