Detecting inappropriate messages on sensitive topics
The NLP team at the Skolkovo Institute of Science and Technology has trained a neural network that helps dialogue agents avoid discussing "sensitive" topics in a dangerous manner.
Problem: neural networks do not understand what they are saying. Trained on large text datasets from social networks, they can start a disrespectful conversation or embarrass chat participants, which can seriously damage a company's reputation.

Solution: a neural model that detects "inappropriate" statements on "sensitive" topics so that they can be filtered out.
Image credit: Pavel Odinev / Skoltech
So far, 18 topics have been defined as "sensitive", including drugs, pornography, politics, religion, suicide, and crime. The main criterion is whether a statement can harm the human interlocutor or damage the reputation of the chatbot owner. The training data was taken from some of the largest sources of inappropriate messages in Russian, such as Dvach and OtvetiMail.ru.


"We have shown that while the notions of topic sensitivity and message inappropriateness are rather subtle and rely on human intuition, they are nevertheless detectable by neural networks," study co-author Nikolay Babakov of Skoltech commented. "Our classifier correctly guessed which utterances the human labelers considered inappropriate in 89% of the cases."
Video of the Crowd Science Seminar led by Nikolay Babakov:
Official press release:
https://www.skoltech.ru/en/2021/07/neural-model-se...

Published paper with more detail on the research: https://aclanthology.org/2021.bsnlp-1.4/

Pre-trained model for detecting inappropriate utterances: https://huggingface.co/Skoltech/russian-inappropri...
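
To illustrate how a classifier of this kind might be used for content filtering, here is a minimal Python sketch. It assumes a scoring function that returns the probability that a text is inappropriate, as a number between 0 and 1. The names `filter_reply`, `toy_scorer`, `THRESHOLD`, and `FALLBACK` are hypothetical and do not come from the paper or the model card. The keyword-based `toy_scorer` is only a stand-in so that the example runs on its own; the real model judges how a topic is discussed, not merely whether it is mentioned.

```python
from typing import Callable

# Hypothetical values chosen for illustration; not taken from the paper.
THRESHOLD = 0.5
FALLBACK = "I'd rather not discuss that."


def filter_reply(reply: str,
                 score_inappropriate: Callable[[str], float],
                 threshold: float = THRESHOLD,
                 fallback: str = FALLBACK) -> str:
    """Return the reply unchanged unless the scorer flags it as inappropriate."""
    if score_inappropriate(reply) >= threshold:
        return fallback
    return reply


def toy_scorer(text: str) -> float:
    """Stand-in scorer: a crude phrase check, NOT the Skoltech classifier."""
    flagged_phrases = ("try drugs",)
    lowered = text.lower()
    return 1.0 if any(phrase in lowered for phrase in flagged_phrases) else 0.0


if __name__ == "__main__":
    # A harmless reply passes through; a flagged one is replaced.
    print(filter_reply("The weather is nice today.", toy_scorer))
    print(filter_reply("You should try drugs.", toy_scorer))
```

In a real dialogue system, `toy_scorer` would be replaced by a wrapper around the pre-trained model, and the threshold would be tuned on labeled data rather than fixed at 0.5.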

