Open Access
Lip-Corrector: Application of BERT-based model in sentence-level lipreading
Author(s) - Haoran Zhao, Bowen Zhang, Zhanhang Yin
Publication year - 2021
Publication title - Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1871/1/012146
Subject(s) - sentence , computer science , transformer , speech recognition , artificial intelligence , natural language processing , sequence (biology) , convolutional neural network , voltage , physics , quantum mechanics , biology , genetics
Current lipreading research concentrates on processing the visual signal and optimizing sequence models, while the text of the predicted sentence is largely ignored. To address this problem, we propose Lip-Corrector, a lipreading method that incorporates natural language processing (NLP) technology by applying the BERT model. The front end of the model uses a 3D+2D convolutional neural network (CNN) to extract lip information, the middle end uses a Transformer-based Seq2seq model to make sentence-level predictions, and the back end uses a BERT-based sentence correction method, which is pre-trained on a self-made dataset and then connected to the middle end. Experiments on LRS2 and LRS3, the two largest sentence-level lipreading datasets, show that the model outperforms all baselines, which indicates that lipreading methods combined with NLP technology achieve better results.
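
The three-stage structure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: every name below (LipreadingPipeline, toy_front_end, toy_middle_end, toy_back_end, NOISY_TOKENS, VOCAB) is hypothetical, the front end and middle end are trivial placeholders for the 3D+2D CNN and the Transformer-based Seq2seq model, and the back end uses fuzzy string matching from Python's standard library as a stand-in for the BERT-based corrector. It only shows how a correction stage can be chained after a sentence-level decoder.

```python
from dataclasses import dataclass
from difflib import get_close_matches
from typing import Callable, List, Sequence

# Hypothetical input type: one small feature vector per video frame.
Frames = Sequence[Sequence[float]]

# Assumed toy data, chosen only so the example is deterministic.
NOISY_TOKENS = ["helo", "wrld"]   # what the placeholder decoder emits
VOCAB = ["hello", "world"]        # what the placeholder corrector knows


def toy_front_end(frames: Frames) -> List[float]:
    """Placeholder for the 3D+2D CNN: reduce each frame to its mean value."""
    return [sum(frame) / len(frame) for frame in frames]


def toy_middle_end(features: List[float]) -> str:
    """Placeholder for the Seq2seq decoder: map each feature to a noisy token."""
    return " ".join(NOISY_TOKENS[int(f) % len(NOISY_TOKENS)] for f in features)


def toy_back_end(draft: str) -> str:
    """Stand-in for BERT-based correction: snap each word to the closest
    vocabulary entry, keeping the word unchanged if nothing is close enough."""
    corrected = []
    for word in draft.split():
        match = get_close_matches(word, VOCAB, n=1, cutoff=0.6)
        corrected.append(match[0] if match else word)
    return " ".join(corrected)


@dataclass
class LipreadingPipeline:
    """Chains the three stages: features, draft sentence, corrected sentence."""
    front_end: Callable[[Frames], List[float]]
    middle_end: Callable[[List[float]], str]
    back_end: Callable[[str], str]

    def __call__(self, frames: Frames) -> str:
        features = self.front_end(frames)
        draft = self.middle_end(features)
        return self.back_end(draft)


pipeline = LipreadingPipeline(toy_front_end, toy_middle_end, toy_back_end)
frames = [[0.0, 0.0], [1.0, 1.0]]
draft = toy_middle_end(toy_front_end(frames))
final = pipeline(frames)
print(draft, "->", final)
```

Keeping the corrector as a separate callable mirrors the abstract's description of a back end that is pre-trained on its own and then attached to the middle end, so either stage could be replaced without touching the other.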
