Open Access
A Triplet Ranking-Based Neural Network for Speaker Diarization and Linking
Author(s) -
Gaël Le Lan,
Delphine Charlet,
Anthony Larcher,
Sylvain Meignier
Publication year - 2017
Publication title -
Interspeech 2017
Language(s) - English
Resource type - Conference proceedings
DOI - 10.21437/interspeech.2017-270
Subject(s) - speaker diarisation, cosine similarity, computer science, artificial neural network, similarity (geometry), speech recognition, artificial intelligence, ranking (information retrieval), linear discriminant analysis, speaker recognition, speaker verification, probabilistic logic, word error rate, pattern recognition (psychology), image (mathematics)
This paper investigates a novel neural scoring method, based on conventional i-vectors, for speaker diarization and linking across large collections of recordings. Trained with a triplet loss, the network projects i-vectors into a space that better separates speakers in terms of cosine similarity. Experiments are run on two French TV collections built from the REPERE [1] and ETAPE [2] campaign corpora, with the system trained on French radio data. Results indicate that the proposed approach outperforms conventional cosine and Probabilistic Linear Discriminant Analysis (PLDA) scoring methods on both within- and cross-recording diarization tasks, with a Diarization Error Rate reduction of 14% on average.
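
To make the idea of triplet ranking on cosine similarity concrete, the following is a minimal sketch of a generic hinge-style triplet loss. It is not the authors' formulation: the abstract does not give the exact loss, the network architecture, or the margin, so the function names, the `margin` value, and the toy three-dimensional vectors are all illustrative assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors given as sequences of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Generic hinge-style triplet ranking loss on cosine similarity.

    The loss is zero once the anchor is more similar to the positive
    (same speaker) than to the negative (different speaker) by at least
    `margin`. The margin value is an assumption, not taken from the paper.
    """
    sim_pos = cosine_similarity(anchor, positive)
    sim_neg = cosine_similarity(anchor, negative)
    return max(0.0, margin - (sim_pos - sim_neg))


# Toy 3-dimensional stand-ins for projected i-vectors (hypothetical values;
# real i-vectors typically have hundreds of dimensions).
anchor = [1.0, 0.0, 0.0]
same_speaker = [0.9, 0.1, 0.0]
other_speaker = [0.0, 1.0, 0.0]

# Correctly ranked triplet: the same-speaker vector is much closer.
well_separated = triplet_loss(anchor, same_speaker, other_speaker)

# Deliberately swapped roles: the ranking constraint is violated.
violated = triplet_loss(anchor, other_speaker, same_speaker)
```

In a training setup of this kind, the loss would be minimized over many such triplets so that the learned projection pushes same-speaker pairs above different-speaker pairs in cosine similarity; here the vectors are fixed and only the scoring is shown.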
