A discriminative HMM/N-gram-based retrieval approach for Mandarin spoken documents

Open Access

Author(s) - Berlin Chen, HsinMin Wang, Lin-Shan Lee

Publication year - 2004

Publication title - ACM Transactions on Asian Language Information Processing

Language(s) - English

Resource type - Journals

eISSN - 1558-3430

pISSN - 1530-0226

DOI - 10.1145/1034780.1034784

Subject(s) - computer science, Mandarin Chinese, discriminative model, search engine indexing, hidden Markov model, artificial intelligence, n-gram, speech recognition, vector space model, syllable, natural language processing, word (group theory), pattern recognition (psychology), language model, philosophy, linguistics

In recent years, statistical modeling approaches have steadily gained in popularity in the field of information retrieval. This article presents an HMM/N-gram-based retrieval approach for Mandarin spoken documents. The underlying characteristics and the various structures of this approach were extensively investigated and analyzed. The retrieval capabilities were verified by tests with word- and syllable-level indexing features and comparisons to the conventional vector-space model approach. To further improve the discrimination capabilities of the HMMs, both the expectation-maximization (EM) and minimum classification error (MCE) training algorithms were introduced in training. Fusion of information via indexing word- and syllable-level features was also investigated. The spoken document retrieval experiments were performed on the Topic Detection and Tracking Corpora (TDT-2 and TDT-3). Very encouraging retrieval performance was obtained.
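
The abstract does not state the scoring formula, so the following is only a minimal sketch of how HMM-style retrieval is commonly formulated, not the authors' implementation. It assumes each document is a two-state mixture of a document unigram model and a corpus unigram model, and it ranks documents by the log-likelihood of the query. The function name `rank_documents`, the parameter `doc_weight`, and the toy syllable tokens are hypothetical; the fixed mixture weight stands in for the weights that the article trains with EM and MCE, and the N-gram (context-dependent) terms are omitted.

```python
import math
from collections import Counter


def rank_documents(query, docs, doc_weight=0.7):
    """Rank document ids by log P(query | doc) under a unigram mixture.

    query: list of indexing terms (words or syllables).
    docs: dict mapping document id to its list of indexing terms.
    doc_weight: assumed fixed mixture weight for the document model;
        the remainder (1 - doc_weight) goes to the corpus model.
    """
    # Corpus model: term frequencies pooled over all documents.
    corpus_counts = Counter(t for tokens in docs.values() for t in tokens)
    corpus_total = sum(corpus_counts.values())

    scores = {}
    for doc_id, tokens in docs.items():
        doc_counts = Counter(tokens)
        doc_total = len(tokens)
        log_likelihood = 0.0
        for term in query:
            p_doc = doc_counts[term] / doc_total if doc_total else 0.0
            p_corpus = corpus_counts[term] / corpus_total if corpus_total else 0.0
            p = doc_weight * p_doc + (1.0 - doc_weight) * p_corpus
            if p == 0.0:
                # Term unseen in the whole collection: query is impossible.
                log_likelihood = float("-inf")
                break
            log_likelihood += math.log(p)
        scores[doc_id] = log_likelihood

    # Highest log-likelihood first.
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Toy collection indexed by syllable-like tokens (illustrative only).
    toy_docs = {
        "d1": ["tai", "bei", "tian", "qi"],
        "d2": ["gu", "shi", "xin", "wen"],
    }
    print(rank_documents(["tian", "qi"], toy_docs))
```

The corpus model acts as smoothing here: a query term missing from one document still receives nonzero probability as long as it occurs somewhere in the collection, so that document is ranked lower and not excluded outright.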
