AN OVERVIEW OF THE RECOGNITION ALGORITHM OF A HUMAN VOICE
Author(s) -
Orken Mamyrbayev,
O. Mamyrbayev
Publication year - 2021
Publication title -
News of the National Academy of Sciences of the Republic of Kazakhstan
Language(s) - English
Resource type - Journals
eISSN - 2518-1726
pISSN - 1991-346X
DOI - 10.32014/2021.2224-5294.5
Subject(s) - computer science , phone , class (philosophy) , gaussian , speech recognition , focus (optics) , field (mathematics) , mixture model , distortion (music) , human voice , sorting , speaker recognition , artificial intelligence , machine learning , algorithm , pattern recognition (psychology) , amplifier , computer network , philosophy , linguistics , physics , mathematics , bandwidth (computing) , quantum mechanics , pure mathematics , optics
Speech recognition has various applications, including human-machine interaction, sorting phone calls by gender, and tagging videos by category. Machine learning is now widely used across many fields and applications, taking advantage of recent developments in digital technologies and of the storage capacity of electronic media. In this article, we focus on voice gender recognition, using the dynamic time warping (DTW) algorithm for the class of text-dependent systems and the Gaussian mixture model (GMM) for the class of text-independent systems. With the Gaussian mixture model, it is possible to distinguish a person's voice with high accuracy, since the components of the mixture can model the individual characteristics of the voice. The article presents the results of testing the algorithm and concludes that the Gaussian mixture model is applicable to the problem of identifying a person by voice.
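The two techniques named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the scalar per-frame features, and all parameter values are assumptions made for illustration only. Real systems typically work on multidimensional feature vectors, and the abstract does not say which features the article uses. The first function computes a classic dynamic time warping distance between two sequences (the text-dependent case); the second scores a sequence of frames under a one-dimensional Gaussian mixture (the text-independent case).

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences.

    Uses absolute difference as the local cost and the standard
    three-way recursion (insertion, deletion, match).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def gmm_avg_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a 1-D Gaussian mixture.

    Evaluated in the log domain (log-sum-exp) to avoid underflow.
    """
    total = 0.0
    for x in frames:
        logs = [
            math.log(w) - 0.5 * math.log(2 * math.pi * var)
            - (x - mu) ** 2 / (2 * var)
            for w, mu, var in zip(weights, means, variances)
        ]
        top = max(logs)
        total += top + math.log(sum(math.exp(v - top) for v in logs))
    return total / len(frames)


# Hypothetical toy data: made-up scalar features, not taken from the article.
template = [1.0, 2.0, 3.0]
utterance = [1.0, 2.0, 2.0, 3.0]   # same shape, spoken more slowly
dtw_same = dtw_distance(template, utterance)

frames = [118.0, 125.0, 121.0, 130.0]
score_a = gmm_avg_log_likelihood(frames, [0.6, 0.4], [115.0, 130.0], [100.0, 100.0])
score_b = gmm_avg_log_likelihood(frames, [0.5, 0.5], [200.0, 220.0], [100.0, 100.0])
predicted = "model A" if score_a > score_b else "model B"
```

In this sketch a text-dependent decision would compare the DTW distance of an utterance against stored templates, while a text-independent decision would pick the class whose mixture gives the higher average log-likelihood. In practice the mixture parameters would be estimated from training data (for example with the expectation-maximization algorithm) rather than set by hand as they are here.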