Text-Style Conversion of Speech Transcript into Web Document for Lecture Archive
Author(s) - Masashi Ito, Tomohiro Ohno, Shigeki Matsubara
Publication year - 2009
Publication title - Journal of Advanced Computational Intelligence and Intelligent Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.172
H-Index - 20
eISSN - 1883-8014
pISSN - 1343-0130
DOI - 10.20965/jaciii.2009.p0499
Subject(s) - computer science, readability, the internet, world wide web, transcription (linguistics), redundancy (engineering), speech synthesis, html, natural language processing, web application, information retrieval, artificial intelligence, linguistics, philosophy, programming language, operating system
It is very significant to the knowledge society to accumulate spoken documents on the web. However, because of the high redundancy of spontaneous speech, the faithfully transcribed text is not readable on an Internet browser, and therefore not suitable as a web document. This paper proposes a technique for converting spoken documents into web documents for the purpose of building a speech archiving system. The technique edits automatically transcribed texts and improves their readability on the browser. The readable text can be generated by applying technology such as paraphrasing, segmentation, and structuring transcribed texts. Editing experiments using lecture data demonstrated the feasibility of the technique. A prototype system of spoken document archiving was implemented to confirm its effectiveness.
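The abstract names three editing steps (paraphrasing, segmentation, and structuring) without giving their rules. The following is a minimal illustrative sketch of how such a pipeline could be chained, not the authors' method. The filler list, the punctuation-based splitting, the fixed paragraph size, and all function names are assumptions made for this example only.

```python
import html
import re

# Hypothetical filler list; the paper's actual paraphrasing rules
# are not given in the abstract.
FILLERS = re.compile(r"\b(?:um|uh|er|you know)\b,?\s*", re.IGNORECASE)


def paraphrase(text: str) -> str:
    """Drop filler words and collapse immediate word repetitions."""
    text = FILLERS.sub("", text)
    # Collapse repeated words such as "we we" into a single "we".
    text = re.sub(r"\b(\w+)(?:\s+\1\b)+", r"\1", text, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", text).strip()


def segment(text: str) -> list[str]:
    """Split text after sentence-final punctuation (a simple stand-in)."""
    parts = re.split(r"(?<=[.?!])\s+", text)
    return [p for p in parts if p]


def structure(sentences: list[str], per_paragraph: int = 2) -> str:
    """Group sentences into HTML <p> elements of a fixed size."""
    paragraphs = [
        " ".join(sentences[i:i + per_paragraph])
        for i in range(0, len(sentences), per_paragraph)
    ]
    return "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)


def transcript_to_html(transcript: str) -> str:
    """Run the three steps in order: paraphrase, segment, structure."""
    return structure(segment(paraphrase(transcript)))


if __name__ == "__main__":
    sample = (
        "Um today we we will talk about uh speech archives. "
        "They are are useful. Er text is easier to read than audio."
    )
    print(transcript_to_html(sample))
```

A real system would need language-specific rules and would have to find sentence boundaries in unpunctuated recognizer output; the sketch only shows how the three stages feed into one another.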