A Hybrid Unsupervised Segmentation Algorithm for Arabic Speech Using Feature Fusion and a Genetic Algorithm (July 2018)

Ahmed Hamdi Abo Absa; Mohamed Deriche; Moustafa Elshafei-Ahmed; Yahya Mohamed Elhadj; Biing-Hwang Juang

Open Access

A Hybrid Unsupervised Segmentation Algorithm for Arabic Speech Using Feature Fusion and a Genetic Algorithm (July 2018)

Author(s) - Ahmed Hamdi Abo Absa, Mohamed Deriche, Moustafa Elshafei-Ahmed, Yahya Mohamed Elhadj, Biing-Hwang Juang

Publication year - 2018

Publication title - IEEE Access

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.587

H-Index - 127

ISSN - 2169-3536

DOI - 10.1109/access.2018.2859631

Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation

Speech is the most natural form of human communication. Major achievements have been made in developing systems that automatically recognize human speech and respond to it. An important preprocessing step in speech recognition systems (and other applications) is segmentation. This step is important in identifying the high-level semantics of speech sounds, including syllables, consonants, vowels, and so on. There are two general approaches to speech segmentation, namely explicit and implicit (that is, with and without linguistic reference). Explicit segmentation uses a bottom-up process based on the concept of fixed-size frames. Such a framework is usually used in conjunction with hidden Markov models. Varying-frame-size or sample-by-sample approaches are mainly used in implicit segmentation techniques, which are based on the detection of spectral distortions. The main objective of this paper is to develop a novel speech segmentation algorithm for the Arabic language. In this application, we focus on the accurate segmentation of Quran recitation. The proposed system starts with a set of initial segmentations using three basic speech features: entropy, zero crossings, and energy. The segmentation results obtained are then fused at the output level using a genetic algorithm-based optimization scheme. Together with the segmentation results, we also introduce the concept of speech units to model the fundamental entities representing Quran recitation. Our results show an improvement in segmentation performance of about 20% over traditional single-feature-based segmentation techniques.
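The pipeline outlined in the abstract (three per-frame features, then a genetic-algorithm-based fusion) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the frame sizes, the fusion rule, or the fitness function, so everything below is an assumption made for illustration: the function names, the 400-sample frames with a 160-sample hop (25 ms and 10 ms at an assumed 16 kHz sampling rate), the weighted-sum fusion of per-feature change scores, and a fitness based on F1 agreement with a set of reference boundaries.

```python
import numpy as np

def frame_features(x, frame_len=400, hop=160):
    """Per-frame energy, zero-crossing rate, and spectral entropy (frame sizes are assumed)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx]
    energy = np.sum(frames ** 2, axis=1)
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    spec = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1)) ** 2
    p = spec / (spec.sum(axis=1, keepdims=True) + 1e-12)
    entropy = -np.sum(p * np.log2(p + 1e-12), axis=1)
    return np.stack([energy, zcr, entropy], axis=1)

def boundary_scores(features):
    """Frame-to-frame absolute change of each feature, scaled per feature to [0, 1]."""
    d = np.abs(np.diff(features, axis=0))
    return d / (d.max(axis=0, keepdims=True) + 1e-12)

def fuse(scores, weights, threshold=0.5):
    """Assumed fusion rule: normalized weighted sum of the scores, then a fixed threshold."""
    w = np.abs(weights) / (np.abs(weights).sum() + 1e-12)
    return np.flatnonzero(scores @ w > threshold) + 1  # frame indices of boundaries

def fitness(weights, scores, ref, tol=2):
    """Assumed fitness: F1 between fused and reference boundaries, within tol frames."""
    pred = fuse(scores, weights)
    if len(pred) == 0 or len(ref) == 0:
        return 0.0
    precision = sum(np.min(np.abs(ref - p)) <= tol for p in pred) / len(pred)
    recall = sum(np.min(np.abs(pred - r)) <= tol for r in ref) / len(ref)
    if precision + recall == 0:
        return 0.0
    return float(2 * precision * recall / (precision + recall))

def ga_weights(scores, ref, pop_size=20, generations=30, seed=0):
    """Small genetic algorithm: truncation selection, uniform crossover, Gaussian mutation."""
    rng = np.random.default_rng(seed)
    pop = rng.random((pop_size, scores.shape[1]))
    for _ in range(generations):
        fit = np.array([fitness(w, scores, ref) for w in pop])
        order = np.argsort(fit)[::-1]
        parents = pop[order[: pop_size // 2]]
        a = parents[rng.integers(len(parents), size=pop_size)]
        b = parents[rng.integers(len(parents), size=pop_size)]
        children = np.where(rng.random(a.shape) < 0.5, a, b)
        children = np.clip(children + rng.normal(0.0, 0.1, children.shape), 0.0, 1.0)
        children[0] = pop[order[0]]  # elitism: carry the best individual over unchanged
        pop = children
    fit = np.array([fitness(w, scores, ref) for w in pop])
    return pop[np.argmax(fit)]
```

A possible usage, given a one-dimensional NumPy array `x` of samples and an array `ref` of reference boundary frame indices (both hypothetical inputs): compute `scores = boundary_scores(frame_features(x))`, search for weights with `w = ga_weights(scores, ref)`, and read the fused boundaries from `fuse(scores, w)`. Restricting the weights to a single feature reproduces a single-feature segmentation, which is the kind of baseline the abstract compares against.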
