Open Access

The Building and Evaluation of a Mobile Parallel Multi-Dialect Speech Corpus for Arabic

Author(s) - Khalid Almeman

Publication year - 2018

Publication title - Procedia Computer Science

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.334

H-Index - 76

ISSN - 1877-0509

DOI - 10.1016/j.procs.2018.10.472

Subject(s) - computer science, Arabic, Sphinx, communication source, natural language processing, process (computing), speech corpus, speech recognition, word (group theory), artificial intelligence, crowd sourcing, speech synthesis, linguistics, world wide web, telecommunications, art, philosophy, visual arts, operating system

This paper discusses the process of building and evaluating a mobile parallel multi-dialect speech corpus for Arabic. The experiment was set up as follows: two SIM cards were installed in two mobile phones, one acting as the sender and the other as the receiver. Four different environments were chosen for the receiver: inside the home, in a moving car, in a public place and in a quiet place. By the end of the experiment, a new mobile parallel speech corpus for Arabic dialects had been built. The new corpus offers the benefits of a large, fully parallel and labelled speech corpus without requiring a large collection and building effort. The resulting corpus will be made freely available to researchers. To evaluate the corpus, the CMU Sphinx recogniser was used; it produced word error rates (WERs) of 24.3, 17.9, 31.2, 18.7 and 32.0 for multi-dialect, Levantine, Gulf, MSA and Egyptian, respectively.
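
For readers unfamiliar with the metric, word error rate is conventionally defined as the number of word substitutions, deletions and insertions needed to turn the recogniser output into the reference transcript, divided by the number of reference words. The Python sketch below illustrates that standard definition only; it is not the scoring tool used in the paper, and the function name `word_error_rate` and the whitespace tokenisation are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Illustrative sketch: words are obtained by splitting on whitespace,
    which is an assumption and not the paper's tokenisation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words (Levenshtein over words).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (b -> x) and one deletion (d) over four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

The function returns a fraction, so multiply by 100 for a percentage. Because insertions are counted, the value can exceed 1 when the hypothesis is much longer than the reference.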
