Updating Neural Networks to Recognize New Categories, With Minimal Retraining
Many of today’s most popular AI systems are, at their core, classifiers: they sort inputs into categories. This image is a picture of a dog, not a cat; this audio signal is an instance of the word “Boston”, not the word “Seattle”; this sentence is a request to play a video, not a song. But what happens if you need to add a new class to your classifier? Suppose, for example, that someone releases a new type of automated household appliance that your smart-home system needs to be able to control.
The traditional approach to updating a classifier is to acquire a lot of training data for the new class, add it to all the data used to train the classifier initially, and train a new classifier on the combined data set. With today’s commercial AI systems, many of which were trained on millions of examples, this is a laborious process. This week, at the 33rd conference of the Association for the Advancement of Artificial Intelligence (AAAI), my colleague Lingzhen Chen from the University of Trento and I are presenting a paper on techniques for updating a classifier using only training data for the new class.
As an example application, we consider a neural network that has been trained to identify people and organizations in online news articles. We show that it is possible to transfer that network and its learned parameters into a new network trained to identify an additional type of named entity — locations.
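To make the idea of carrying learned parameters into a larger network concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the architecture from the paper: it assumes the classifier ends in a simple linear output layer with one weight row per class, and the names `extend_output_layer` and `scores` are hypothetical helpers invented for this example. The trained rows are copied unchanged, and one freshly initialized row is appended for the new class.

```python
import random


def extend_output_layer(weights, biases, num_new_classes, seed=0):
    """Return copies of an output layer's parameters with extra rows for new classes.

    weights: list of rows, one per existing class, each a list of floats.
    biases:  list of floats, one per existing class.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    rng = random.Random(seed)
    hidden_size = len(weights[0])
    # Copy the learned rows so the existing classes start from their trained values.
    new_weights = [row[:] for row in weights]
    new_biases = biases[:]
    # Append small random rows for the new classes; only these start untrained.
    for _ in range(num_new_classes):
        new_weights.append([rng.uniform(-0.01, 0.01) for _ in range(hidden_size)])
        new_biases.append(0.0)
    return new_weights, new_biases


def scores(weights, biases, features):
    """Compute one linear score per class for a single feature vector."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]


# Made-up parameters standing in for a layer trained on two classes
# (say, person and organization).
old_w = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.0]]
old_b = [0.1, -0.1]

# Add one output for the new class (say, location).
new_w, new_b = extend_output_layer(old_w, old_b, num_new_classes=1)

features = [1.0, 2.0, 3.0]
old_scores = scores(old_w, old_b, features)
new_scores = scores(new_w, new_b, features)
```

In this sketch the scores for the original classes are identical before and after the extension, so the new network starts out agreeing with the old one on everything it already knew. Only the appended row would need to be learned from the new-class training data.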