Text Classification as a Service
Imagine we run a service that collects unstructured textual data from our partners and builds service directories out of it. How do we keep that data clean and tidy without investing lots of money in expensive MDM platforms? We can use text classification services.
Text classification services use machine learning to keep track of incoming data and help categorize it fully automatically. They use advanced text-matching algorithms to correlate and clean data.
Figure: Text Classification Engine
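To make the idea of categorizing text by matching more concrete, here is a minimal sketch in Python. It is not the algorithm the engines actually use; it only assumes a simple bag-of-words cosine similarity against a few labelled examples. The `TRAINING` data and the names `tokenize`, `cosine`, `build_profiles` and `classify` are all hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical labelled examples; a real service would learn from partner data.
TRAINING = {
    "restaurant": ["pizza pasta dinner table reservation", "cafe coffee lunch menu"],
    "plumber": ["pipe leak repair bathroom", "drain cleaning water heater repair"],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profiles(training):
    """Merge the example texts of each category into one word-count profile."""
    return {
        label: Counter(tok for doc in docs for tok in tokenize(doc))
        for label, docs in training.items()
    }


def classify(text, profiles):
    """Return (label, score) for the category profile most similar to the text."""
    vec = Counter(tokenize(text))
    scored = ((label, cosine(vec, profile)) for label, profile in profiles.items())
    return max(scored, key=lambda pair: pair[1])


profiles = build_profiles(TRAINING)
label, score = classify("Emergency pipe repair and drain cleaning", profiles)
print(label)  # prints "plumber"
```

A production engine would be trained on far more data and would likely use a stronger model, but the flow is the same: turn incoming text into features, compare it with what is known about each category, and return the best match.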
TC Services Availability
You can have your own text classification service on demand. The service is delivered via a PubNub queue. It can be started in minutes and will serve your needs for as long as you wish.
The services are built on top of PredictionIO and use PubNub queues as the transport medium. In most cases the core components of the engines are open source: they take the form of templates developed by the growing community of PredictionIO developers.
Text Classification Engines run as Docker containers. This makes it possible to create new engine instances within minutes in any environment that runs the Docker service. It means they can run in the AWS cloud, locally on your back-end servers, or even on your laptop in a Linux VM.
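The queue-based transport described above can be pictured as a request/reply exchange between a client and an engine. The sketch below is only an illustration of that pattern: it uses Python's in-memory `queue.Queue` as a stand-in for two PubNub channels and does not call the PubNub SDK. The message fields (`id`, `text`, `category`), the channel variables and the toy keyword rule are assumptions made for this example, not the actual TCaaS message format.

```python
import json
import queue
import threading

# In-memory stand-ins for a request channel and a response channel (hypothetical).
request_channel = queue.Queue()
response_channel = queue.Queue()


def toy_classifier(text):
    """Placeholder for the engine: a trivial keyword rule, not a real model."""
    return "plumber" if "pipe" in text.lower() else "other"


def engine_worker():
    """Consume JSON requests until a None sentinel arrives; publish one reply each."""
    while True:
        raw = request_channel.get()
        if raw is None:
            break
        msg = json.loads(raw)
        reply = {"id": msg["id"], "category": toy_classifier(msg["text"])}
        response_channel.put(json.dumps(reply))


def classify_remote(msg_id, text, timeout=2.0):
    """Publish a request and wait for the reply that carries the same id."""
    request_channel.put(json.dumps({"id": msg_id, "text": text}))
    reply = json.loads(response_channel.get(timeout=timeout))
    if reply["id"] != msg_id:
        raise ValueError("reply does not match the request id")
    return reply["category"]


# Usage: start the engine, send one request, then shut the engine down.
worker = threading.Thread(target=engine_worker, daemon=True)
worker.start()
print(classify_remote("req-1", "Leaking pipe in the kitchen"))  # prints "plumber"
request_channel.put(None)
worker.join()
```

Because the client and the engine only share the two channels, the engine can live anywhere a container can run, which is the point of using a hosted queue as the transport.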
What do you need to have your own Text Classification Service?
- You need your own PubNub queue. PubNub queues are free below 1 million messages sent. My previous blog post, Recommendation Engine in Docker Container, has instructions on how to set up a PubNub queue.
- Use the goliasz/tcaas-micro Docker image to spin up your own container, or ask for help (KOLIBERO).
Conclusions
- Text classification services live somewhere out there in the cloud, but they can be yours with very little effort.
- You don't need your own hardware to get text classified. You can simply order a classification engine and use it through PubNub queues.
- Such distributed services scale together with business growth. The cloud has no fixed borders, so you can run as many engines as you need.
Resources
- goliasz/tcaas Docker Image
- TCaaS Description
- TCaaS Demo
- goliasz/pio-template-text-similarity PredictionIO Template
- PubNub
- PredictionIO