Nepali Speech Recognition using RNN-CTC Model

Open Access

Author(s) - Paribesh Regmi, Arjun Dahal, Basanta Joshi

Publication year - 2019

Publication title - International Journal of Computer Applications

Language(s) - English

Resource type - Journals

ISSN - 0975-8887

DOI - 10.5120/ijca2019918401

Subject(s) - Nepali, computer science, speech recognition, artificial intelligence, linguistics, philosophy

This paper presents a neural-network-based Nepali speech recognition model. A recurrent neural network (RNN) processes the sequential audio data. The Connectionist Temporal Classification (CTC) technique [1] is applied so that the RNN can be trained on audio data. CTC is a probabilistic approach that maximizes the probability of the desired label sequence given the RNN output. After the audio passes through the RNN and CTC layers, Nepali text is produced as output. The paper also defines a set of 67 Nepali characters required for transcribing Nepali speech to text.
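
To make the CTC step concrete, the following is a minimal sketch of the two standard CTC operations the abstract alludes to: collapsing a frame-level path into a label sequence, and summing the probability of every path that collapses to a desired label sequence (the quantity CTC training maximizes). It is not the authors' implementation. The function names are hypothetical, the blank symbol is assumed to sit at index 0, and the toy alphabet has three symbols rather than the paper's 67-character set. The per-frame probabilities stand in for the softmax output of an RNN.

```python
from itertools import groupby

BLANK = 0  # assumption: index 0 is the CTC blank symbol


def ctc_collapse(path):
    """Merge consecutive repeats in a frame-level path, then drop blanks."""
    return [k for k, _ in groupby(path) if k != BLANK]


def ctc_greedy_decode(probs):
    """Take the most probable symbol in each frame, then collapse the path."""
    best_path = [max(range(len(frame)), key=frame.__getitem__) for frame in probs]
    return ctc_collapse(best_path)


def ctc_label_probability(probs, labels):
    """Sum the probabilities of all frame-level paths that collapse to `labels`.

    `probs[t][k]` is the (assumed) RNN softmax output for symbol k at frame t.
    This is the CTC forward recursion over the blank-extended label sequence.
    """
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank]
    ext = [BLANK]
    for label in labels:
        ext += [label, BLANK]
    size = len(ext)

    # Initialisation: a path may start with the leading blank or the first label.
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for frame in probs[1:]:
        new_alpha = [0.0] * size
        for s in range(size):
            total = alpha[s]                  # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]         # advance by one position
            # Skip over a blank only between two different non-blank labels.
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * frame[ext[s]]
        alpha = new_alpha

    # Valid paths end on the last label or on the trailing blank.
    return alpha[-1] + (alpha[-2] if size > 1 else 0.0)


if __name__ == "__main__":
    # Two frames over a toy alphabet: blank, symbol 1, symbol 2.
    toy_probs = [[0.2, 0.7, 0.1], [0.3, 0.6, 0.1]]
    print(ctc_greedy_decode(toy_probs))
    print(ctc_label_probability(toy_probs, [1]))
```

For the two-frame toy input, the paths (1, 1), (1, blank), and (blank, 1) all collapse to the single label 1, so the forward recursion returns their summed probability, roughly 0.75. In practice a framework's built-in CTC loss would be used in log space for numerical stability; the plain-probability version here is only meant to show the recursion.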
