Introducing state of the art text classification with universal language models

Introducing state of the art text classification with universal language models

  • May 15, 2018
Table of Contents

Introducing state of the art text classification with universal language models

This post is a lay-person’s introduction to our new paper, which shows how to classify documents automatically with both higher accuracy and less data requirements than previous approaches. We’ll explain in simple terms: natural language processing; text classification; transfer learning; language modeling; and how our approach brings these ideas together. If you’re already familar with NLP and deep learning, you’ll probably want to jump over to our NLP classification page for technical links.

Today we’re releasing our paper Universal Language Model Fine-tuning for Text Classification (ULMFiT), pre-trained models, and full source code in the Python programming language. The paper has been peer-reviewed and accepted for presentation at the Annual Meeting of the Association for Computational Linguistics (ACL 2018). For links to videos providing an in-depth walk-through of the approach, all the Python modules used, pre-trained models, and scripts for building your own models, see our NLP classification page.

This method dramatically improves over previous approaches to text classification, and the code and pre-trained models allow anyone to leverage this new approach to better solve problems such as: Finding documents relevant to a legal case; Identifying spam, bots, and offensive comments; Classifying positive and negative reviews of a product; Grouping articles by political orientation; …and much more.

Source: fast.ai

Tags :
Share :
comments powered by Disqus

Related Posts

How an AI Startup Could Defeat Now Unbeatable Bugs

How an AI Startup Could Defeat Now Unbeatable Bugs

The need for new medications is higher than ever, but so is the cost and time to bring them to market. Developing a new drug can cost billions and take as long as 14 years, according to the U.S. Food and Drug Administration. Yet with all that effort, only 8 percent of drugs make it to market, the FDA said.

Read More
Machine Learning for Text Classification Using SpaCy in Python

Machine Learning for Text Classification Using SpaCy in Python

spaCy is a popular and easy-to-use natural language processing library in Python. It provides current state-of-the-art accuracy and speed levels, and has an active open source community. However, since SpaCy is a relative new NLP library, and it’s not as widely adopted as NLTK.

Read More
Prefrontal cortex as a meta-reinforcement learning system

Prefrontal cortex as a meta-reinforcement learning system

Recently, AI systems have mastered a range of video-games such as Atari classics Breakout and Pong. But as impressive as this performance is, AI still relies on the equivalent of thousands of hours of gameplay to reach and surpass the performance of human video game players. In contrast, we can usually grasp the basics of a video game we have never played before in a matter of minutes.

Read More