Introducing LCA: Loss Change Allocation for Neural Network Training

September 14, 2019

Neural networks (NNs) have become widespread over the last decade and now power machine learning across the industry. At Uber, we use NNs for a variety of purposes, including detecting and predicting object motion for self-driving vehicles, responding more quickly to customers, and building better maps. While many NNs perform quite well at their tasks, networks are fundamentally complex systems, and their training and operation are still poorly understood.

For this reason, efforts to better understand network properties and model predictions are ongoing, both at Uber and across the broader scientific community. Although prior studies have analyzed the network training process, it largely remains a black box: millions of parameters are adjusted via simple rules during training, but what we observe is usually limited to a single scalar loss, which is a severely restricted summary of a rich, high-dimensional process. For example, one part of a network may be doing all of the learning while another part contributes nothing, and simply observing the loss will never reveal this.

In our paper, LCA: Loss Change Allocation for Neural Network Training, to be presented at NeurIPS 2019, we propose a method called Loss Change Allocation (LCA) that provides a rich window into the neural network training process. LCA allocates changes in loss over individual parameters, thereby measuring how much each parameter learns. Using LCA, we present three interesting observations about neural networks regarding noise, layer contributions, and layer synchronization.

Fellow researchers and practitioners are invited to use our code to try this approach on their own networks.
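
To make the idea of allocating loss change to individual parameters concrete, here is a minimal sketch in plain Python. It is not the released code and not necessarily the exact formulation in the paper: it assumes a toy separable quadratic loss, plain gradient descent, and a per-step allocation equal to the gradient at the midpoint of the step times the parameter's movement. The names `loss`, `grad`, `lca_step`, and `train` are hypothetical, chosen for illustration only.

```python
# Illustrative sketch only: a toy quadratic loss with three parameters.
# All function names and the midpoint-gradient rule are assumptions made
# for this example, not the API of the released LCA code.

def loss(theta, curv, target):
    """Toy loss: sum_i 0.5 * curv_i * (theta_i - target_i)^2."""
    return sum(0.5 * a * (t - c) ** 2 for t, a, c in zip(theta, curv, target))

def grad(theta, curv, target):
    """Gradient of the toy loss with respect to each parameter."""
    return [a * (t - c) for t, a, c in zip(theta, curv, target)]

def lca_step(theta_old, theta_new, curv, target):
    """Allocate one step's loss change to each parameter.

    Uses the gradient at the midpoint of the step times the parameter's
    movement. For a quadratic loss this approximation is exact.
    """
    mid = [(o + n) / 2 for o, n in zip(theta_old, theta_new)]
    g_mid = grad(mid, curv, target)
    return [g * (n - o) for g, o, n in zip(g_mid, theta_old, theta_new)]

def train(theta, curv, target, lr, steps):
    """Run gradient descent and accumulate the allocation per parameter."""
    totals = [0.0] * len(theta)
    for _ in range(steps):
        g = grad(theta, curv, target)
        new_theta = [t - lr * gi for t, gi in zip(theta, g)]
        step_lca = lca_step(theta, new_theta, curv, target)
        totals = [s + d for s, d in zip(totals, step_lca)]
        theta = new_theta
    return theta, totals

# Parameter 0 learns normally, parameter 1 never affects the loss,
# and parameter 2 has high curvature, so this step size overshoots.
curv = [1.0, 0.0, 3.0]
target = [2.0, 0.0, -1.0]
theta0 = [0.0, 5.0, 0.0]

theta_final, totals = train(theta0, curv, target, lr=0.8, steps=3)
total_change = loss(theta_final, curv, target) - loss(theta0, curv, target)

print("per-parameter allocation:", totals)
print("sum of allocations:      ", sum(totals))
print("actual change in loss:   ", total_change)
```

In this sketch a negative allocation means the parameter's movement lowered the loss, zero means it contributed nothing, and a positive value means its movement raised the loss. The per-parameter values add up to the total change in the scalar loss, which is the kind of breakdown the loss curve alone cannot show.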

Source: uber.com

