Accurate Online Speaker Diarization with Supervised Learning

  • November 14, 2018
Speaker diarization, the process of partitioning an audio stream containing multiple speakers into homogeneous segments associated with each individual, is an important part of speech recognition systems. By solving the problem of “who spoke when”, speaker diarization has applications in many important scenarios, such as understanding medical conversations, video captioning and more. However, training these systems with supervised learning methods is challenging: unlike standard supervised classification tasks, a robust diarization model must be able to assign distinct speech segments to new individuals who were not seen in training.

Importantly, this limits the quality of both online and offline diarization systems. Online systems usually suffer more, since they require diarization results in real time.
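
To make the “who spoke when” output concrete, here is a minimal sketch of the final step of a diarization pipeline: collapsing per-frame speaker labels into time-stamped segments. It is only an illustration, not the method described in the source post. The function name `labels_to_segments`, the 0.5-second frame length, and the labels "A" and "B" are all assumptions made for the example.

```python
from itertools import groupby


def labels_to_segments(frame_labels, frame_seconds=0.5):
    """Collapse per-frame speaker labels into (speaker, start, end) segments.

    frame_labels: one speaker label per fixed-length audio frame (hypothetical input).
    frame_seconds: assumed frame duration in seconds.
    """
    segments = []
    start = 0  # index of the first frame in the current run
    for speaker, run in groupby(frame_labels):
        length = len(list(run))  # number of consecutive frames with this label
        segments.append(
            (speaker, start * frame_seconds, (start + length) * frame_seconds)
        )
        start += length
    return segments


# Hypothetical labels for six consecutive frames of a two-person conversation.
segments = labels_to_segments(["A", "A", "B", "B", "B", "A"])
for speaker, begin, end in segments:
    print(f"{speaker}: {begin:.1f}s to {end:.1f}s")
```

An online system would have to emit each label as its frame arrives, without revisiting earlier decisions. An offline system can assign labels after hearing the whole recording.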

Source: googleblog.com

