A revised method to detect erroneous characters wrongly substituted, deleted, and inserted at the end position in Japanese sentences and ‘bunsetsu’s
Author(s) - Araki Chikahiro, Mori Mikio, Taniguchi Shuji
Publication year - 2011
Publication title - IEEJ Transactions on Electrical and Electronic Engineering
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.254
H-Index - 30
eISSN - 1931-4981
pISSN - 1931-4973
DOI - 10.1002/tee.20640
Subject(s) - Markov chain, position (finance), computer science, artificial intelligence, hidden Markov model, Markov model, natural language processing, machine learning, finance, economics
A method for detecting erroneous characters that have been wrongly substituted, deleted, or inserted in the interior of Japanese sentences and ‘bunsetsu’s using an m‐th‐order Markov chain model was proposed earlier and was found to be useful for detecting such characters. With that method, however, it is difficult to detect erroneous characters at the end position of Japanese sentences and ‘bunsetsu’s, because the Markov chain probabilities for an erroneous character at the end position do not remain smaller than the critical value T the same number of times as they do for an error in the interior. This paper proposes a method to detect erroneous characters located at the end position of sentences and ‘bunsetsu’s using the ‘skipped Markov chain model’ in addition to the ‘connected Markov chain model’. Experiments with newspaper articles show the proposed method to be useful for correcting erroneous characters located at the end position of sentences and ‘bunsetsu’s. © 2011 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
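The end-position difficulty described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain count-based m-th-order character Markov chain, uses a toy ASCII corpus in place of Japanese newspaper text, and all function names (`train`, `chain_probabilities`, `low_probability_positions`) are hypothetical. It covers only the ordinary connected chain and does not attempt the ‘skipped Markov chain model’, whose definition is not given here.

```python
from collections import defaultdict


def train(corpus, m):
    """Count m-character contexts and (context, next character) pairs."""
    context_counts = defaultdict(int)
    pair_counts = defaultdict(int)
    for sentence in corpus:
        for i in range(m, len(sentence)):
            context = sentence[i - m:i]
            context_counts[context] += 1
            pair_counts[(context, sentence[i])] += 1
    return context_counts, pair_counts


def chain_probabilities(sentence, m, context_counts, pair_counts):
    """Estimate P(c_i | previous m characters) for i = m .. len(sentence) - 1."""
    probs = []
    for i in range(m, len(sentence)):
        context = sentence[i - m:i]
        seen = context_counts.get(context, 0)
        hits = pair_counts.get((context, sentence[i]), 0)
        # Unseen contexts get probability 0.0 (no smoothing; an assumption).
        probs.append(hits / seen if seen else 0.0)
    return probs


def low_probability_positions(sentence, m, T, context_counts, pair_counts):
    """Return the character positions whose chain probability is below T."""
    probs = chain_probabilities(sentence, m, context_counts, pair_counts)
    return [i + m for i, p in enumerate(probs) if p < T]


# Toy data (an assumption): one "correct" sentence, m = 2, critical value T = 0.5.
model = train(["abcdef"], 2)

# Substitution in the interior: the probability stays below T at m + 1 = 3 positions.
interior = low_probability_positions("abXdef", 2, 0.5, *model)  # [2, 3, 4]

# Substitution at the end: only a single position falls below T.
at_end = low_probability_positions("abcdeX", 2, 0.5, *model)  # [5]
```

In this toy setting an interior error depresses the probability at three consecutive positions, while the same error at the end does so only once, so a rule that expects a fixed run of sub-threshold probabilities would miss it.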
