Accurate Online Speaker Diarization with Supervised Learning

Accurate Online Speaker Diarization with Supervised Learning

  • November 14, 2018
Table of Contents

Accurate Online Speaker Diarization with Supervised Learning

Speaker diarization, the process of partitioning an audio stream with multiple people into homogeneous segments associated with each individual, is an important part of speech recognition systems. By solving the problem of “who spoke when”, speaker diarization has applications in many important scenarios, such as understanding medical conversations, video captioning and more. However, training these systems with supervised learning methods is challenging — unlike standard supervised classification tasks, a robust diarization model requires the ability to associate new individuals with distinct speech segments that weren’t involved in training.

Importantly, this limits the quality of both online and offline diarization systems. Online systems usually suffer more, since they require diarization results in real time.

Source: googleblog.com

Tags :
Share :
comments powered by Disqus

Related Posts

What’s the Best Deep Learning Framework?

What’s the Best Deep Learning Framework?

Deep learning models are large and complex, so instead of writing out every function from the ground up, programmers rely on frameworks and software libraries to build neural networks efficiently. The top deep learning frameworks provide highly optimized, GPU-enabled code that are specific to deep neural network computations.

Read More
Tensorflow 2.0: models migration and new design

Tensorflow 2.0: models migration and new design

Tensorflow 2.0 will be a major milestone for the most popular machine learning framework: lots of changes are coming, and all with the aim of making ML accessible to everyone. These changes, however, requires for the old users to completely re-learn how to use the framework: this article describes all the (known) differences between the 1.x and 2.x version, focusing on the change of mindset required and highlighting the pros and cons of the new and implementations. This article can be a good starting point also for the novice: start thinking in the Tensorflow 2.0 way right now, so you don’t have to re-learn a new framework (unless until Tensorflow 3.0 will be released).

Read More