Foreword
Is it possible to automatically recognize comments containing vulgar language and flag them as spam? How can you implement an application that looks after your WordPress website and protects it from vulgar comments?
Why this topic?
- there are no similar solutions
- learn how to develop a WordPress plugin
- learn about microservices
- learn how to use memcache
- a good introduction to artificial intelligence
The project consists of three parts:
- Backend – a REST API application built on top of the Symfony 3 microframework
- Frontend – a simple static page (HTML, CSS, JS, jQuery) that presents the functionality of the application
- WordPress plugin – checks comments using the backend application
How the application recognizes vulgar text
Checking the text is simple and relies on a dictionary of vulgar words. The whole process can be divided into five steps:
- Tokenization – breaking a stream of text up into words, phrases, symbols, or other meaningful elements called tokens
- Lowercasing tokens – converting uppercase letters to lowercase
- Removing common stopwords – a stopword is a commonly used word (such as "the" or "and") that carries little meaning on its own
- Removing duplicates
- Searching for the remaining tokens in the database
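The five steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used by the backend (which is built on Symfony): the function names, the stopword list, and the placeholder dictionary entries are all hypothetical, and an in-memory set stands in for the database lookup.

```python
import re

# Hypothetical data: tiny stand-ins for a real stopword list and
# for the dictionary of vulgar words kept in the database.
STOPWORDS = {"a", "an", "the", "is", "are", "you", "this", "and", "or", "of", "to"}
VULGAR_WORDS = {"darn", "heck"}  # placeholder entries


def tokenize(text):
    """Step 1: split the text into word tokens (runs of letters)."""
    return re.findall(r"[^\W\d_]+", text)


def find_vulgar_tokens(text, vulgar_words=VULGAR_WORDS, stopwords=STOPWORDS):
    """Return the distinct dictionary words found in the text, in order of appearance."""
    tokens = tokenize(text)
    # Step 2: convert every token to lowercase.
    tokens = [token.lower() for token in tokens]
    # Step 3: drop common stopwords.
    tokens = [token for token in tokens if token not in stopwords]
    # Step 4: remove duplicates while keeping the original order.
    unique_tokens = list(dict.fromkeys(tokens))
    # Step 5: look each remaining token up in the dictionary
    # (assumption: a set membership test replaces the database query).
    return [token for token in unique_tokens if token in vulgar_words]


def is_vulgar(text):
    """A comment counts as vulgar if at least one token is in the dictionary."""
    return bool(find_vulgar_tokens(text))


if __name__ == "__main__":
    print(find_vulgar_tokens("What the HECK is this? Heck, darn!"))
    print(is_vulgar("A perfectly polite comment."))
```

Removing stopwords and duplicates before the lookup keeps the number of dictionary queries small, which matters more when each lookup is a database or cache round trip rather than a set membership test.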
Presentation of the solution
- BACKEND
Repository:
https://github.com/tarnawski/vulgar-detector-api
Staging:
http://vulgardetector-api.ttarnawski.usermd.net/status
- FRONTEND
Repository:
https://github.com/tarnawski/vulgar-detector
Staging:
http://vulgardetector.ttarnawski.usermd.net/
- WORDPRESS PLUGIN
Repository:
https://github.com/tarnawski/vulgar-detector-plugin
WordPress Plugin Directory:
https://wordpress.org/plugins/vulgar-detector/