Metadata records machine translation combining multi‐engine outputs with limited parallel data
Author(s) - Reyes Ayala Brenda, Knudson Ryan, Chen Jiangping, Cao Gaohui, Wang Xinyue
Publication year - 2018
Publication title - Journal of the Association for Information Science and Technology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.903
H-Index - 145
eISSN - 2330-1643
pISSN - 2330-1635
DOI - 10.1002/asi.23925
Subject(s) - metadata, computer science, machine translation, information retrieval, fluency, world wide web, digital library, natural language processing, artificial intelligence, philosophy, linguistics, art, literature, poetry
One way to facilitate Multilingual Information Access (MLIA) for digital libraries is to generate multilingual metadata records by applying Machine Translation (MT) techniques. Online MT services are readily available and affordable, but they are not always effective for creating multilingual metadata records. In this study, we implemented three MT strategies and evaluated their performance when translating English metadata records into Chinese and Spanish. These strategies combined MT results from three online MT systems (Google, Bing, and Yahoo!) with and without additional linguistic resources, such as manually generated parallel corpora and metadata records in the two target languages obtained from international partners. The open‐source statistical MT platform Moses was used to design and implement the three translation strategies. Human evaluation of the MT results for adequacy and fluency showed that two of the strategies produced higher‐quality translations than the individual online MT systems for both languages. In particular, adding small, manually generated parallel corpora of metadata records significantly improved translation performance. Our study suggests an effective and efficient MT approach for providing multilingual services for digital collections.