
Single‐signal entity approach for sung word recognition with artificial neural network and time–frequency audio features
Author(s) -
Khunarsa Peerapol
Publication year - 2017
Publication title -
The Journal of Engineering
Language(s) - English
Resource type - Journals
ISSN - 2051-3305
DOI - 10.1049/joe.2017.0210
Subject(s) - speech recognition, computer science, artificial neural network, signal (programming language), word (group theory), audio signal, artificial intelligence, pattern recognition (psychology), speech coding, mathematics, programming language, geometry
Abstract -
Singing-voice recognition differs markedly from speech recognition or automatic speech recognition because speaking and singing voices have distinct characteristics. The problem is complex because the background instrumental accompaniment in a music audio signal acts as a noise source that degrades the performance of the recognition system. This study proposes a statistical learning method to recognise words in a vocal audio signal with background music and to classify the singing-voice regions of a polyphonic audio signal. The goal is to recognise words from sung input without applying any source-separation method to remove the instrumental background. The study also borrows a concept from image recognition, treating the spectrogram feature as an image. An audio signal with accompanying music was analysed and transformed into a spectrogram feature, which was then sliced to form feature vectors for a feed-forward neural network classifier. Several classification functions were compared, including K-Nearest Neighbour, Fisher Linear Classifier, Linear Bayes Normal Classifier, Naive Bayes Classifier, Parzen Classifier and Decision Tree. The results show that the feed-forward neural network recognises sung words at an accuracy rate of more than 93.0%. In particular, the system can recognise cross-language music data.
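The pipeline the abstract outlines (spectrogram extraction, slicing into fixed-size feature vectors, feed-forward neural network classification) can be illustrated with a short sketch. This is not the authors' code: the sampling rate, FFT size, slice width, hidden-layer size and file names are all illustrative assumptions, and slicing along the time axis is an assumed reading of the abstract.

```python
# Minimal sketch of the described pipeline, assuming librosa and
# scikit-learn; every numeric parameter here is an illustrative guess.
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

def spectrogram_slices(path, n_fft=1024, hop=512, slice_frames=16):
    """Compute a log-magnitude spectrogram and cut it into fixed-width
    time slices, each flattened into one feature vector (treating each
    slice like a small image patch)."""
    y, sr = librosa.load(path, sr=22050, mono=True)
    spec = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop))
    spec = librosa.amplitude_to_db(spec, ref=np.max)
    slices = [
        spec[:, start:start + slice_frames].flatten()
        for start in range(0, spec.shape[1] - slice_frames + 1, slice_frames)
    ]
    return np.array(slices)

# Hypothetical labelled clips: one sung word per file.
training_set = [("word_love.wav", 0), ("word_rain.wav", 1)]
X_parts, y_parts = [], []
for path, label in training_set:
    feats = spectrogram_slices(path)
    X_parts.append(feats)
    y_parts.append(np.full(len(feats), label))
X, y = np.vstack(X_parts), np.concatenate(y_parts)

# Feed-forward neural network; the layer size is an assumption,
# not the paper's architecture.
clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=500, random_state=0)
clf.fit(X, y)

# Classify the slices of an unseen clip and take a majority vote.
test_slices = spectrogram_slices("unknown_word.wav")
pred = np.bincount(clf.predict(test_slices)).argmax()
print("predicted word id:", pred)
```

Swapping MLPClassifier for KNeighborsClassifier, GaussianNB or DecisionTreeClassifier from the same library would reproduce the kind of classifier comparison the abstract reports.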