DeltaVLAD: An efficient optimization algorithm to discriminate speaker embedding for text-independent speaker verification

Xin Guo; Chengfang Luo; Aiwen Deng; Feiqi Deng

Open Access

Author(s) - Xin Guo, Chengfang Luo, Aiwen Deng, Feiqi Deng

Publication year - 2022

Publication title - AIMS Mathematics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.329

H-Index - 15

ISSN - 2473-6988

DOI - 10.3934/math.2022355

Subject(s) - softmax function, computer science, discriminative model, speech recognition, speaker recognition, embedding, speaker verification, speaker diarisation, pattern recognition (psychology), subspace topology, frame (networking), set (abstract data type), coding (social sciences), artificial intelligence, artificial neural network, mathematics, telecommunications, statistics, programming language

Text-independent speaker verification aims to determine whether two given utterances in an open-set task originate from the same speaker. This paper explores several ways to make speaker embeddings more discriminative for speaker verification. First, a difference operation is applied to the speaker features in the coding layer, forming the DeltaVLAD layer. The frame-level speaker representation is extracted by a deep neural network, and differential operations compute the dynamic changes between frames, which helps capture subtle changes in the voiceprint. Meanwhile, NeXtVLAD is adopted to split the frame-level features into multiple subspaces before aggregation and then perform VLAD operations in each subspace, which significantly reduces the number of parameters and improves performance. Second, a margin-based softmax loss function is combined with a few-shot learning-based loss function to obtain more discriminative speaker embeddings. Finally, for a fair comparison, experiments are conducted on VoxCeleb1; the results show superior speaker verification performance and new state-of-the-art results.
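The coding-layer idea in the abstract (frame-to-frame differences followed by VLAD aggregation in separate subspaces) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not say how the differences enter the layer, so the sketch assumes a plain first-order difference applied before aggregation. It also simplifies NeXtVLAD to a fixed split of the feature dimensions with independent cluster centers per group, and uses random values in place of learned parameters. All function names, shapes, and sizes (`delta`, `vlad`, `grouped_delta_vlad`, `T`, `D`, `G`, `K`) are hypothetical.

```python
import numpy as np

def delta(frames):
    """First-order difference between consecutive frames: (T, D) -> (T-1, D)."""
    return frames[1:] - frames[:-1]

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def vlad(frames, centers, assign_w):
    """Soft-assignment VLAD: frames (T, D), centers (K, D), assign_w (D, K) -> (K*D,)."""
    a = softmax(frames @ assign_w)                    # (T, K) soft cluster assignments
    resid = frames[:, None, :] - centers[None, :, :]  # (T, K, D) residuals to each center
    v = (a[:, :, None] * resid).sum(axis=0)           # (K, D) weighted residual sums
    v /= np.linalg.norm(v, axis=1, keepdims=True) + 1e-12  # normalize each cluster row
    v = v.ravel()
    return v / (np.linalg.norm(v) + 1e-12)            # L2-normalize the flattened vector

def grouped_delta_vlad(frames, centers, assign_w, groups):
    """Assumed simplification: difference the frames, split the feature
    dimensions into `groups` equal subspaces, and run VLAD in each one."""
    d = delta(frames)
    parts = np.split(d, groups, axis=1)               # `groups` arrays of (T-1, D/groups)
    out = [vlad(p, centers[g], assign_w[g]) for g, p in enumerate(parts)]
    return np.concatenate(out)

# Usage with random stand-ins for network outputs and learned parameters.
rng = np.random.default_rng(0)
T, D, G, K = 50, 64, 4, 8                  # frames, feature dim, groups, clusters
frames = rng.normal(size=(T, D))           # hypothetical frame-level features
centers = rng.normal(size=(G, K, D // G))  # hypothetical per-group cluster centers
assign_w = rng.normal(size=(G, D // G, K)) # hypothetical per-group assignment weights
emb = grouped_delta_vlad(frames, centers, assign_w, G)
print(emb.shape)
```

In this sketch each group needs only K × (D/G) center values and a (D/G) × K assignment matrix, which is the sense in which working in smaller subspaces keeps the parameter count down. The loss functions mentioned in the abstract are not covered here.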
