Discrimination of “hot potato voice” caused by upper airway obstruction utilizing a support vector machine
Author(s) - Fujimura Shintaro, Kojima Tsuyoshi, Okanoue Yusuke, Shoji Kazuhiko, Inoue Masato, Hori Ryusuke
Publication year - 2019
Publication title - The Laryngoscope
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.181
H-Index - 148
eISSN - 1531-4995
pISSN - 0023-852X
DOI - 10.1002/lary.27584
Subject(s) - formant, support vector machine, vocal tract, speech recognition, computer science, airway, phonation, classifier (uml), cepstrum, epiglottis, vocal folds, pattern recognition (psychology), artificial intelligence, acoustics, larynx, vowel, medicine, audiology, physics, surgery
Objectives/Hypothesis: “Hot potato voice” (HPV) is a thick, muffled voice caused by pharyngeal or laryngeal diseases characterized by severe upper airway obstruction, including acute epiglottitis and peritonsillitis. To develop a method for identifying upper‐airway emergencies based on this important vocal feature, we investigated the acoustic characteristics of HPV using a physical, articulatory speech synthesis model. The results of the simulation were then applied to design a computerized recognition framework using a support vector machine (SVM) operating in the mel‐frequency cepstral coefficient domain.
Study Design: Quasi‐experimental research design.
Methods: Changes in the voice spectral envelope caused by upper airway obstructions were analyzed using a hybrid time‐frequency model of articulatory speech synthesis. We evaluated variations in the formant structure and the critical thresholds of the vocal tract area function that triggered HPV. The SVMs were trained using a dataset of 2,200 synthetic voice samples generated by an articulatory synthesizer. Voice classification experiments were then performed on test datasets of real patient voices.
Results: On phonation of the Japanese vowel /e/, the frequency of the second formant fell and coalesced with that of the first formant as the area function of the oropharynx decreased. Changes in higher‐order formants varied according to constriction location. The highest accuracy achieved by the SVM classifier trained with synthetic data was 88.3%.
Conclusions: HPV caused by upper airway obstruction has a highly characteristic spectral envelope. Based on this distinctive voice feature, our SVM classifier, which was trained using synthetic data, was able to detect upper‐airway obstructions with a high degree of accuracy.
Level of Evidence: 2c
Laryngoscope, 129:1301–1307, 2019
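The abstract describes a second formant that falls toward the first, and a classifier that works on mel‐frequency cepstral coefficients. The following is a minimal sketch of how such a formant shift could show up in cepstral features. It is not the authors' articulatory model or their SVM pipeline. The Lorentzian envelope, the formant frequencies (500, 1900, 2500 Hz and the "merged" 800 Hz), the bandwidth, the filter count, and all function names are illustrative assumptions, not values from the study.

```python
import math

def hz_to_mel(f_hz):
    # Common mel-scale formula (O'Shaughnessy form).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def envelope(freqs_hz, formants_hz, bandwidth_hz=120.0):
    # Toy power envelope: one Lorentzian resonance peak per formant (assumption).
    half_bw = bandwidth_hz / 2.0
    return [
        sum(half_bw ** 2 / ((f - fc) ** 2 + half_bw ** 2) for fc in formants_hz) + 1e-6
        for f in freqs_hz
    ]

def mel_filterbank_energies(freqs_hz, power, n_filters=20, f_max_hz=4000.0):
    # Triangular filters with edges spaced evenly on the mel scale.
    top = hz_to_mel(f_max_hz)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    energies = []
    for k in range(1, n_filters + 1):
        lo, mid, hi = edges[k - 1], edges[k], edges[k + 1]
        total = 0.0
        for f, p in zip(freqs_hz, power):
            if lo < f <= mid:
                total += p * (f - lo) / (mid - lo)
            elif mid < f < hi:
                total += p * (hi - f) / (hi - mid)
        energies.append(total)
    return energies

def mfcc_from_formants(formants_hz, n_coeffs=12, n_filters=20):
    # Log mel energies followed by a DCT-II; coefficient 0 (overall level) is skipped.
    freqs = [10.0 * i for i in range(401)]  # 0 to 4000 Hz in 10 Hz steps
    power = envelope(freqs, formants_hz)
    log_e = [math.log(e) for e in mel_filterbank_energies(freqs, power, n_filters)]
    return [
        sum(log_e[m] * math.cos(math.pi * n * (m + 0.5) / n_filters)
            for m in range(n_filters))
        for n in range(1, n_coeffs + 1)
    ]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical formant sets for a vowel like /e/ (illustrative values only).
normal = mfcc_from_formants([500.0, 1900.0, 2500.0])
slightly_shifted = mfcc_from_formants([500.0, 1850.0, 2500.0])
merged = mfcc_from_formants([500.0, 800.0, 2500.0])  # F2 pulled down toward F1

print("distance normal vs. slightly shifted:", euclidean(normal, slightly_shifted))
print("distance normal vs. merged F1/F2:  ", euclidean(normal, merged))
```

In this toy setting, collapsing the second formant toward the first moves the cepstral vector much further than a small formant shift does, which is the kind of separation a classifier such as an SVM could exploit. Real speech would need framing, windowing, and a spectrum estimated from audio rather than a hand-built envelope.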