Open Access
Recognition of Russian and Indian Sign Languages Based on Machine Learning
Author(s) - Mikhail G. Grif, R Elakkiya, Alexey Prikhodko, Maxim Bakaev, E. Rajalakshmi
Publication year - 2021
Publication title - sistemy analiza i obrabotki dannyh
Language(s) - English
Resource type - Journals
eISSN - 2782-215X
pISSN - 2782-2001
DOI - 10.17212/2782-2001-2021-3-53-74
Subject(s) - sign language, gesture, computer science, gesture recognition, sign (mathematics), speech recognition, focus (optics), artificial intelligence, variation (astronomy), movement (music), motion (physics), natural language processing, linguistics, mathematics, mathematical analysis, philosophy, physics, astrophysics, optics, aesthetics
This paper considers the recognition of sign languages (SLs), with a particular focus on Russian Sign Language (RSL) and Indian Sign Language (ISL). The proposed recognition system covers five components of a sign: configuration, orientation, localization, movement, and non-manual markers. The analysis uses methods for recognizing both individual gestures and continuous signing in ISL and RSL. For individual gesture recognition, the RSL Dataset was developed; it includes more than 35,000 files for over 1,000 signs. Each sign was performed with 5 repetitions by at least 5 deaf native signers of RSL from Siberia. To isolate epenthesis in continuous RSL, 312 sentences were selected and recorded on video, each with 5 repetitions. Five types of movement were distinguished: "No gesture", "There is a gesture", "Initial movement", "Transitional movement", and "Final movement". The sentences were annotated to mark epenthesis on the Supervisely platform. A recurrent neural network with a long short-term memory (LSTM) architecture was built using the TensorFlow Keras machine learning library. Epenthesis was recognized with an accuracy of 95%. Work on a similar dataset for recognizing both individual gestures and continuous ISL is ongoing. To recognize hand gestures, the MediaPipe Holistic module was used. It contains a group of pre-trained neural network models that return the coordinates of key points of a person's body, palms, and face in an image. An accuracy of 85% was achieved on the validation data. In the future, the amount of labeled data needs to be increased significantly. To recognize non-manual components, a set of rules was developed for certain facial movements. These rules cover the positions of the eyes, eyelids, mouth, and tongue, as well as head tilt.
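As an illustration of how per-frame movement labels could be turned into epenthesis segments, the following is a minimal sketch and not the authors' implementation. It assumes that a classifier (such as the LSTM mentioned above) has already assigned one of the five movement types to every video frame, and that frames labelled "Transitional movement" correspond to epenthesis; the abstract does not state either assumption explicitly. The function names `label_runs` and `epenthesis_spans` and the `min_len` parameter are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple

# The five movement types named in the abstract.
MOVEMENT_TYPES = (
    "No gesture",
    "There is a gesture",
    "Initial movement",
    "Transitional movement",
    "Final movement",
)


def label_runs(frame_labels: List[str]) -> List[Tuple[str, int, int]]:
    """Collapse per-frame labels into (label, start, end) runs.

    `start` is inclusive and `end` is exclusive, both in frame indices.
    """
    runs = []
    start = 0
    for label, group in groupby(frame_labels):
        length = sum(1 for _ in group)
        runs.append((label, start, start + length))
        start += length
    return runs


def epenthesis_spans(
    frame_labels: List[str], min_len: int = 1
) -> List[Tuple[int, int]]:
    """Return (start, end) frame spans labelled as transitional movement.

    Assumption: "Transitional movement" is treated as epenthesis.
    Runs shorter than `min_len` frames are ignored as likely noise.
    """
    return [
        (start, end)
        for label, start, end in label_runs(frame_labels)
        if label == "Transitional movement" and end - start >= min_len
    ]


# Hypothetical label sequence for an 11-frame clip containing two signs.
example = (
    ["No gesture"] * 2
    + ["Initial movement"] * 1
    + ["There is a gesture"] * 3
    + ["Transitional movement"] * 2
    + ["There is a gesture"] * 2
    + ["Final movement"] * 1
)
print(epenthesis_spans(example))  # prints [(6, 8)]
```

Grouping consecutive identical labels keeps the sketch independent of the model that produced them, so the same post-processing would apply whether the labels come from manual annotation or from a trained network.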