Accurate Online Speaker Diarization with Supervised Learning

  • November 14, 2018

Speaker diarization, the process of partitioning an audio stream with multiple people into homogeneous segments associated with each individual, is an important part of speech recognition systems. By solving the problem of “who spoke when”, speaker diarization has applications in many important scenarios, such as understanding medical conversations and video captioning. However, training these systems with supervised learning methods is challenging: unlike standard supervised classification tasks, a robust diarization model must be able to associate distinct speech segments with new individuals who were not present in the training data.

Importantly, this limits the quality of both online and offline diarization systems. Online systems usually suffer more, since they require diarization results in real time.
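
To make the “who spoke when” output concrete, here is a minimal Python sketch that collapses per-frame speaker labels into timed segments. It illustrates only the output format, not the diarization model described in the source; the function name `labels_to_segments`, the frame length of 0.5 seconds, and the sample labels are all assumptions made for this example.

```python
from itertools import groupby


def labels_to_segments(frame_labels, frame_seconds=0.5):
    """Collapse per-frame speaker labels into (speaker, start, end) tuples.

    frame_labels: one speaker label per fixed-length audio frame (hypothetical input).
    frame_seconds: assumed duration of each frame, in seconds.
    """
    segments = []
    start = 0  # index of the first frame in the current run
    for speaker, run in groupby(frame_labels):
        length = sum(1 for _ in run)  # number of consecutive frames for this speaker
        segments.append(
            (speaker, start * frame_seconds, (start + length) * frame_seconds)
        )
        start += length
    return segments


# Hypothetical labels: speaker A talks, then B, then A again.
labels = ["A", "A", "B", "B", "B", "A"]
for speaker, begin, end in labels_to_segments(labels):
    print(f"{speaker}: {begin:.1f}s - {end:.1f}s")
```

An online system would have to emit such segments as the audio arrives, whereas an offline system can assign labels after seeing the full recording.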

Source: googleblog.com

Related Posts

Tensorflow 2.0: models migration and new design

TensorFlow 2.0 will be a major milestone for the most popular machine learning framework: lots of changes are coming, all with the aim of making ML accessible to everyone. These changes, however, require existing users to completely relearn how to use the framework. This article describes all the (known) differences between the 1.x and 2.x versions, focusing on the change of mindset required and highlighting the pros and cons of the new implementation. It can also be a good starting point for the novice: start thinking in the TensorFlow 2.0 way right now, so you don’t have to relearn a new framework (at least until TensorFlow 3.0 is released).

Why Chinese Artificial Intelligence Will Run The World

With Chinese tech giants Baidu, Alibaba, and Tencent focused on developing sophisticated AI-driven systems in the coming decade, the rest of the world can only watch while China builds the computer systems that will run our world in the decades to come. If you’ve been paying attention in the past year, it seems that all anyone can talk about is the coming artificial intelligence boom. Whether it’s Amazon, Google, or Facebook, everyone seems to be getting in on the AI game as fast as they can.

Learning Concepts with Energy Functions

We’ve developed an energy-based model that can quickly learn to identify and generate instances of concepts, such as near, above, between, closest, and furthest, expressed as sets of 2d points. Our model learns these concepts after only five demonstrations.
