Text Classification: Best Practices for Real-World Applications
Most text classification examples that you see on the Web or in books focus on demonstrating techniques. They will help you build a pseudo-usable prototype, but not much more. If you want to take your classifier to the next level and use it within a product or service workflow, there are things you need to do from day one to make that a reality.
I’ve seen classifiers fail miserably and get replaced with off-the-shelf solutions because they didn’t work in practice. Not only is money wasted on developing solutions that don’t go anywhere, but the problem could also have been avoided if enough thought had been put into the process before development began. In this article, I will highlight some of the best practices for building text classifiers that actually work in real-world scenarios.
Some of these tips come from my personal experience developing text classification solutions for different product problems. Others come from literature that I’ve read and applied in practice.