Introducing LCA: Loss Change Allocation for Neural Network Training

  • September 14, 2019

Neural networks (NNs) have become ubiquitous over the last decade and now power machine learning across the industry. At Uber, we use NNs for a variety of purposes, including detecting and predicting object motion for self-driving vehicles, responding more quickly to customers, and building better maps. While many NNs perform quite well at their tasks, networks are fundamentally complex systems, and their training and operation are still poorly understood.

For this reason, efforts to better understand network properties and model predictions are ongoing, both at Uber and across the broader scientific community. Although prior studies have analyzed the network training process, it largely remains a black box: millions of parameters are adjusted via simple rules during training, but what we can observe is usually limited to a single scalar loss, which offers a severely restricted view into a rich, high-dimensional process. For example, one part of a network may be doing all of the learning while another part is useless, but simply observing the loss cannot reveal this.

In our paper, LCA: Loss Change Allocation for Neural Network Training, to be presented at NeurIPS 2019, we propose a method called Loss Change Allocation (LCA) that provides a rich window into the neural network training process. LCA allocates changes in loss over individual parameters, thereby measuring how much each parameter learns. Using LCA, we present three interesting observations about neural networks regarding noise, layer contributions, and layer synchronization.
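To make the idea of allocating loss change to individual parameters concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the toy quadratic loss, the names `COEFFS`, `loss`, `grad`, and `lca_step`, and the use of a single midpoint gradient per step are all hypothetical choices made for this example, and the paper's own numerical integration scheme is not reproduced here. Each parameter's allocation for one step is the gradient component multiplied by how far that parameter moved, so the allocations sum to the total change in loss.

```python
# Toy setup (hypothetical): L(theta) = sum_i COEFFS[i] * theta[i]**2.
# The third coefficient is 0, so the third parameter cannot affect the loss.
COEFFS = [1.0, 0.5, 0.0]


def loss(theta):
    """Toy quadratic loss over a list of parameters."""
    return sum(c * t * t for c, t in zip(COEFFS, theta))


def grad(theta):
    """Analytic gradient of the toy loss."""
    return [2.0 * c * t for c, t in zip(COEFFS, theta)]


def lca_step(theta_before, theta_after, grad_fn):
    """Allocate one step's loss change to each parameter.

    Assumption: the gradient is evaluated once, at the midpoint of the step.
    This is exact for a quadratic loss and only an approximation otherwise.
    """
    midpoint = [(a + b) / 2.0 for a, b in zip(theta_before, theta_after)]
    g = grad_fn(midpoint)
    return [gi * (b - a) for gi, a, b in zip(g, theta_before, theta_after)]


# Plain gradient descent, accumulating each parameter's allocation over training.
theta = [1.0, -2.0, 3.0]
learning_rate = 0.1
totals = [0.0] * len(theta)
start_loss = loss(theta)

for _ in range(20):
    g = grad(theta)
    new_theta = [t - learning_rate * gi for t, gi in zip(theta, g)]
    for i, allocation in enumerate(lca_step(theta, new_theta, grad)):
        totals[i] += allocation
    theta = new_theta

end_loss = loss(theta)

print("total loss change:", end_loss - start_loss)
print("per-parameter allocation:", totals)
```

In this sketch, a negative allocation means the parameter's movement lowered the loss. The first two parameters receive negative totals, while the third receives exactly zero because it never contributes to the loss, which is the kind of per-parameter detail a scalar loss curve alone does not show.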

Fellow researchers and practitioners are invited to use our code to try this approach on their own networks.

Source: uber.com

