Open Access
Vocal Data Assessment To Envision Distinctive Features of An Individual
Author(s) -
Ashutosh Garg,
Kushal Agrawal,
Mrs. P. Akilandeshwari
Publication year - 2020
Publication title -
International Journal of Innovative Technology and Exploring Engineering
Language(s) - English
Resource type - Journals
ISSN - 2278-3075
DOI - 10.35940/ijitee.f3771.049620
Subject(s) - mel frequency cepstrum, computer science, speech recognition, centroid, support vector machine, feature extraction, artificial intelligence, pattern recognition (psychology), speech processing, bandwidth (computing), computer network
A large amount of audio data is generated on a day-to-day basis, and much of it goes to waste because it is never processed. Processed properly, this data can serve a multitude of purposes. Vocal data is unstructured, which makes it harder to process, so it must undergo thorough pre-processing to convert it to a machine-understandable form. We aim to analyze the human voice to extract meaningful data and predict the speaker's age, gender, and accent. The developed system uses Mel-frequency cepstral coefficients (MFCC), zero-crossing rate (ZCR), chroma_stft, spectral_centroid, spectral_bandwidth, and spectral_rolloff as its feature-extraction tools. The algorithms used for making inferences are support vector machines (SVM), K-nearest neighbors (KNN), and support vector regression (SVR). The work can be extended further by combining video data with the audio data for analysis. The system can also be improved by increasing the number of languages it can detect.
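
To make two of the listed features concrete, the following is a minimal sketch of the zero-crossing rate and the spectral centroid, computed on a synthetic tone using only the Python standard library. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which toolkit was used (the names chroma_stft, spectral_centroid, spectral_bandwidth, and spectral_rolloff do match function names in the librosa library), and every function name, the 8000 Hz sample rate, and the 440 Hz test tone below are hypothetical choices made for the example.

```python
import cmath
import math


def sine_wave(freq_hz, sample_rate, n_samples):
    """Synthetic test tone; a stand-in for one frame of a real voice recording."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(n_samples)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def magnitude_spectrum(frame):
    """Naive O(n^2) DFT magnitudes for bins 0..n/2 (acceptable for short frames)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of the frame, in Hz."""
    mags = magnitude_spectrum(frame)
    freqs = [k * sample_rate / len(frame) for k in range(len(mags))]
    total = sum(mags)
    return sum(f * m for f, m in zip(freqs, mags)) / total if total else 0.0


# Assumed example values: a 440 Hz tone, 400 samples at 8000 Hz.
SAMPLE_RATE = 8000
tone = sine_wave(440.0, SAMPLE_RATE, 400)

# A two-element feature vector of the kind a classifier could consume.
features = [zero_crossing_rate(tone), spectral_centroid(tone, SAMPLE_RATE)]
print(features)
```

For a pure tone, the zero-crossing rate is close to twice the tone frequency divided by the sample rate, and the spectral centroid sits at the tone frequency. Real speech spreads energy across many frequencies, so both values vary from frame to frame, and per-frame statistics such as the mean would typically be passed to the classifier.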
