Part 9. Semantic Similarity
By Rodrigo Alarcón, Computational Linguist
In this tutorial we will showcase a module of Codeq’s NLP API that analyzes the semantic similarity between texts. Previous tutorials in this series can be found here:
- Part 1. Getting started and sending requests to the API.
- Part 2. Calling NLP annotators for linguistic analysis.
- Part 3. Using text classifiers to classify sentiments and emotions, and detect sarcasm in texts.
- Part 4. Detecting abusive and harmful content from texts.
- Part 5. Extracting and disambiguating named entities.
- Part 6. Extracting Speech Acts, Questions & Tasks.
- Part 7. Summarizing texts.
- Part 8. Extracting Semantic Roles.
The complete list of modules we offer can be found in our documentation:
Codeq NLP API Documentation
Calling the Semantic Similarity endpoint
The endpoint that returns the semantic similarity between texts can also be called through the client provided by our Python SDK. As usual, you create an instance of this client by passing your API credentials as input parameters.
Instead of defining a pipeline with the names of some NLP Annotators, as we have been doing in previous tutorials, in this case you need to use a different method of the client to get the similarity between texts:
This method requires as input two strings and returns as output a dict containing the text_similarity_score:
The similarity score indicates the semantic relatedness between the input texts on a scale from 1 to 5, where 1 means the texts are unrelated and 5 means they are highly related:
- 5 – The two sentences are completely equivalent, as they mean the same thing.
- 4 – The two sentences are mostly equivalent, but some unimportant details differ.
- 3 – The two sentences are roughly equivalent, but some important information differs, or is missing from one or the other.
- 2 – The two sentences are not equivalent, but share some details or are on the same topic.
- 1 – The two sentences are completely dissimilar.
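The scale above can be turned into a small lookup for display or logging. The following is only a sketch: it assumes the response is a dict with a numeric text_similarity_score, as described earlier, and it assumes that fractional scores, if the API returns them, can be rounded to the nearest level. The helper name describe_similarity is hypothetical and not part of the SDK.

```python
# Short descriptions of each level, paraphrased from the scale above.
SCORE_DESCRIPTIONS = {
    5: "completely equivalent",
    4: "mostly equivalent; unimportant details differ",
    3: "roughly equivalent; important information differs or is missing",
    2: "not equivalent, but sharing details or topic",
    1: "completely dissimilar",
}


def describe_similarity(response: dict) -> str:
    """Return the score together with a readable description of its level.

    Hypothetical helper: `response` is assumed to be the dict returned by
    the similarity endpoint, holding a `text_similarity_score` entry.
    """
    score = float(response["text_similarity_score"])
    if not 1 <= score <= 5:
        raise ValueError(f"score outside the documented 1-5 range: {score}")
    # Assumption: a fractional score is mapped to the nearest level,
    # with halves rounded up (2.5 is treated as level 3).
    level = min(5, int(score + 0.5))
    return f"{score:g} ({SCORE_DESCRIPTIONS[level]})"
```

For example, a response of {"text_similarity_score": 4.2} would be rendered as "4.2 (mostly equivalent; unimportant details differ)".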
Wrap up
In this tutorial we described how to use the Semantic Similarity endpoint of the Codeq NLP API. The code below summarizes how to iterate over its output:
Take a look at our documentation to learn more about the NLP tools we provide.
Do you need inspiration? Go to our use case demos and see how you can integrate different tools.
In our NLP demos section you can also try our tools and find examples of the output of each module.