Improving Language Understanding with Unsupervised Learning

Improving Language Understanding with Unsupervised Learning

  • June 13, 2018
Table of Contents

Improving Language Understanding with Unsupervised Learning

We’ve obtained state-of-the-art results on a suite of diverse language tasks with a scalable, task-agnostic system, which we’re also releasing. Our approach is a combination of two existing ideas: transformers and unsupervised pre-training. These results provide a convincing example that pairing supervised learning methods with unsupervised pre-training works very well; this is an idea that many have explored in the past, and we hope our result motivates further research into applying this idea on larger and more diverse datasets.

Our system works in two stages; first we train a transformer model on a very large amount of data in an unsupervised manner — using language modeling as a training signal — then we fine-tune this model on much smaller supervised datasets to help it solve specific tasks. We developed this approach following our sentiment neuron work, in which we noted that unsupervised learning techniques can yield surprisingly discriminative features when trained on enough data. Here, we wanted to further explore this idea: can we develop one model, train it in an unsupervised way on a large amount of data, and then fine-tune the model to achieve good performance on many different tasks?

Our results indicate that this approach works surprisingly well; the same core model can be fine-tuned for very different tasks with minimal adaptation.

Source: openai.com

Tags :
Share :
comments powered by Disqus

Related Posts

Training a neural network in phase-change memory beats GPUs

Training a neural network in phase-change memory beats GPUs

Compared to a typical CPU, a brain is remarkably energy-efficient, in part because it combines memory, communications, and processing in a single execution unit, the neuron. A brain also has lots of them, which lets it handle lots of tasks in parallel. Attempts to run neural networks on traditional CPUs run up against these fundamental mismatches.

Read More
Attacks against machine learning – an overview

Attacks against machine learning – an overview

At a high level, attacks against classifiers can be broken down into three types: Adversarial inputs, which are specially crafted inputs that have been developed with the aim of being reliably misclassified in order to evade detection. Adversarial inputs include malicious documents designed to evade antivirus, and emails attempting to evade spam filters. Data poisoning attacks, which involve feeding training adversarial data to the classifier.

Read More
AI winter is well on its way

AI winter is well on its way

Deep learning has been at the forefront of the so called AI revolution for quite a few years now, and many people had believed that it is the silver bullet that will take us to the world of wonders of technological singularity (general AI). Many bets were made in 2014, 2015 and 2016 when still new boundaries were pushed, such as the Alpha Go etc. Companies such as Tesla were announcing through the mouths of their CEO’s that fully self driving car was very close, to the point that Tesla even started selling that option to customers [to be enabled by future software update].

Read More