VulgarDetector – application to detect vulgar language in text

Foreword

Is it possible to automatically recognize comments that contain vulgar language and flag them as spam? How can you implement an application that looks after your WordPress website and protects it from vulgar comments?

Why this topic?

  • there are no similar solutions
  • gaining experience in developing a WordPress plugin
  • gaining experience with microservices
  • gaining experience with memcache
  • a good introduction to artificial intelligence

The project consists of three parts

  1. Backend – a REST API application built on top of the Symfony 3 microframework
  2. Frontend – a simple static page (HTML, CSS, JS, jQuery) that demonstrates the functionality of the application
  3. WordPress plugin – checks comments using the backend application

How the application recognizes vulgar text

Checking the text is simple and is based on a dictionary of vulgar words. The whole process can be divided into five steps:

  1. Tokenization – breaking a stream of text up into words, phrases, symbols, or other meaningful elements called tokens
  2. Lowercasing tokens – converting uppercase letters to lowercase
  3. Removing common stopwords – a stopword is a commonly used word that carries little meaning on its own
  4. Removing duplicates
  5. Searching for the remaining tokens in the database of vulgar words
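
The five steps above can be sketched in a few lines of Python. This is only an illustration, not the code of the actual backend (which is built on Symfony). The function names, the stopword list, and the in-memory set that stands in for the database of vulgar words are all assumptions made for this example.

```python
import re

# Hypothetical sample data. The real application keeps its dictionary
# in a database; a small in-memory set stands in for it here.
STOPWORDS = {"a", "an", "the", "is", "are", "you", "this", "and", "of", "to"}
VULGAR_WORDS = {"darn", "heck"}  # mild placeholders for real dictionary entries


def tokenize(text):
    """Step 1: split the text into word tokens (runs of letters)."""
    return re.findall(r"[^\W\d_]+", text)


def find_vulgar_words(text, stopwords=STOPWORDS, dictionary=VULGAR_WORDS):
    """Return the vulgar words found in text, in order of first appearance."""
    tokens = tokenize(text)                             # 1. tokenization
    tokens = [t.lower() for t in tokens]                # 2. lowercase tokens
    tokens = [t for t in tokens if t not in stopwords]  # 3. remove stopwords
    unique = list(dict.fromkeys(tokens))                # 4. remove duplicates
    return [t for t in unique if t in dictionary]       # 5. dictionary lookup


if __name__ == "__main__":
    comment = "This is a DARN test, darn it! What the heck?"
    print(find_vulgar_words(comment))
```

Removing stopwords and duplicates before the lookup keeps the number of dictionary queries small, which matters more once the dictionary lives in a database rather than in memory.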

Presentation of the solution

  1. BACKEND
    Repository:
    https://github.com/tarnawski/vulgar-detector-api
    Staging:
    http://vulgardetector-api.ttarnawski.usermd.net/status
  2. FRONTEND
    Repository:
    https://github.com/tarnawski/vulgar-detector
    Staging:
    http://vulgardetector.ttarnawski.usermd.net/
  3. WORDPRESS PLUGIN
    Repository:
    https://github.com/tarnawski/vulgar-detector-plugin
    WordPress Plugin Directory:
    https://wordpress.org/plugins/vulgar-detector/
