Five Lessons From the First Three Years of Michelangelo

November 14, 2018

Uber has been one of the most active contributors to open source machine learning technologies over the last few years. While companies like Google and Facebook have focused their contributions on new deep learning stacks such as TensorFlow, Caffe2, and PyTorch, the Uber engineering team has concentrated on tools and best practices for building machine learning at scale in the real world. Technologies such as Michelangelo, Horovod, PyML, and Pyro are examples of Uber's contributions to the machine learning ecosystem.

With only a small group of companies developing large scale machine learning solutions, the lessons and guidance from Uber become even more valuable for machine learning practitioners (I certainly learned a lot and have regularly written about Uber's efforts). Recently, the Uber engineering team published an evaluation of the first three years of operating the Michelangelo platform. If we strip away the Michelangelo specifics, Uber's post contains a few non-obvious, valuable lessons for organizations starting their machine learning journey.

I am going to summarize some of those key takeaways in a more generic form that is applicable to mainstream machine learning scenarios.

Source: towardsdatascience.com


Related Posts

EPO Issues First Guidelines on AI Patents

The European Patent Office (EPO) has issued official guidelines on the patenting of artificial intelligence and machine learning technologies. The guidelines took effect on November 1, 2018. When determining whether the claimed subject matter has the required technical character, the guidelines note that expressions such as "support vector machine," "reasoning engine," or "neural network" may not qualify, as these are regarded as terms for mathematical methods which do not have a technical character of their own.

Accurate Online Speaker Diarization with Supervised Learning

Speaker diarization, the process of partitioning an audio stream with multiple people into homogeneous segments associated with each individual, is an important part of speech recognition systems. By solving the problem of "who spoke when," speaker diarization has applications in many important scenarios, such as understanding medical conversations, video captioning, and more. However, training these systems with supervised learning methods is challenging: unlike standard supervised classification tasks, a robust diarization model must be able to associate distinct speech segments with individuals who were not seen during training.
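The post describes a supervised approach; as a rough illustration of the diarization task itself, here is a minimal sketch of the classic unsupervised baseline that such supervised methods aim to improve on. It clusters per-window speaker embeddings and merges consecutive windows into "who spoke when" segments. The embedding inputs and the fixed speaker count are assumptions for the example, not part of the original post.

```python
import numpy as np
from sklearn.cluster import KMeans

def diarize(embeddings, window_sec=1.0, num_speakers=2):
    """Toy diarization baseline: cluster per-window speaker embeddings,
    then merge consecutive windows that share a cluster label."""
    labels = KMeans(n_clusters=num_speakers, n_init=10).fit_predict(embeddings)
    segments, start = [], 0.0
    for i in range(1, len(labels) + 1):
        # Close a segment whenever the speaker label changes (or at the end).
        if i == len(labels) or labels[i] != labels[i - 1]:
            segments.append((start, i * window_sec, int(labels[i - 1])))
            start = i * window_sec
    return segments  # list of (start_sec, end_sec, speaker_id)

# Hypothetical embeddings: six one-second windows from a two-speaker recording.
emb = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9],
                [0.2, 0.8], [0.9, 0.0], [1.0, 0.1]])
print(diarize(emb))  # e.g. [(0.0, 2.0, 0), (2.0, 4.0, 1), (4.0, 6.0, 0)]
```

Note that the cluster count must be fixed in advance here, which is exactly the kind of limitation that motivates supervised approaches able to handle speakers unseen at training time.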

Tensorflow 2.0: models migration and new design

TensorFlow 2.0 will be a major milestone for the most popular machine learning framework: lots of changes are coming, all with the aim of making ML accessible to everyone. These changes, however, require existing users to completely re-learn how to use the framework. This article describes all the (known) differences between the 1.x and 2.x versions, focusing on the change of mindset required and highlighting the pros and cons of the new design and implementation. The article can also be a good starting point for novices: start thinking the TensorFlow 2.0 way right now, so you don't have to re-learn a framework later (at least until TensorFlow 3.0 is released).
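To make the mindset change concrete, here is a minimal sketch (my own, not from the article) contrasting the 1.x graph-and-session style with the 2.x eager-by-default style:

```python
import tensorflow as tf

# TensorFlow 1.x style (graph + session), shown for contrast:
#   x = tf.placeholder(tf.float32, shape=(None, 3))
#   y = tf.layers.dense(x, 1)
#   with tf.Session() as sess:
#       sess.run(tf.global_variables_initializer())
#       out = sess.run(y, feed_dict={x: data})

# TensorFlow 2.x style: eager execution by default, Keras layers,
# and an optional @tf.function to trace the function into a graph.
dense = tf.keras.layers.Dense(1)

@tf.function  # compiles the Python function into a graph for performance
def forward(x):
    return dense(x)

out = forward(tf.random.normal((4, 3)))  # runs immediately, no Session needed
print(out.numpy())
```

The key shift is that graphs become an optimization you opt into with `tf.function`, rather than the programming model you are forced to write against.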
