Open Access
Application of Variable Length N-Gram Vectors to Monolingual and Bilingual Information Retrieval
Author(s) - Daniel Gayo-Avello, Darío Álvarez-Gutiérrez, José Gayo-Avello
Publication year - 2005
Publication title - Lecture Notes in Computer Science
Language(s) - English
Resource type - Book series
SCImago Journal Rank - 0.249
H-Index - 400
eISSN - 1611-3349
pISSN - 0302-9743
ISBN - 3-540-27420-0
DOI - 10.1007/11519645_7
Subject(s) - clef, automatic summarization, computer science, n gram, natural language processing, artificial intelligence, gram, cosine similarity, vector space model, metric (unit), language model, information retrieval, speech recognition, pattern recognition (psychology), operations management, management, economics, task (project management), biology, bacteria, genetics
Our group in the Department of Informatics at the University of Oviedo participated for the first time in two tasks at CLEF: monolingual (Russian) and bilingual (Spanish-to-English) information retrieval. Our main goal was to test the application to IR of a modified version of the n-gram vector space model (codenamed blindLight). This approach has been successfully applied to other NLP tasks, such as language identification and text summarization, and the results achieved at CLEF 2004, although not exceptional, are encouraging. There are two major differences between the blindLight approach and classical techniques: (1) relative frequencies are no longer used as vector weights but are replaced by n-gram significances, and (2) cosine distance is abandoned in favor of a new metric that is inspired by sequence alignment techniques but is less computationally expensive. To perform cross-language IR, we developed a naive n-gram pseudo-translator similar to those described by McNamee and Mayfield and by Pirkola et al.
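For orientation, the classical baseline that the abstract contrasts blindLight against can be sketched as follows: character n-gram vectors weighted by relative frequency and compared with cosine similarity. This is a minimal illustration of that baseline only, not the blindLight method, whose significance weights and alignment-inspired metric are not specified in the abstract. The function names `ngram_vector` and `cosine_similarity`, the fixed n-gram length of 3, and the sample strings are assumptions made for the example; the paper itself uses variable-length n-grams.

```python
from collections import Counter
from math import sqrt


def ngram_vector(text, n=3):
    """Map each character n-gram of `text` to its relative frequency.

    Hypothetical helper: a fixed n is an assumption of this sketch.
    """
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    if total == 0:
        return {}  # text shorter than n yields an empty vector
    return {gram: count / total for gram, count in counts.items()}


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse n-gram vectors (dicts)."""
    dot = sum(weight * b[gram] for gram, weight in a.items() if gram in b)
    norm_a = sqrt(sum(weight * weight for weight in a.values()))
    norm_b = sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # similarity with an empty vector is defined as 0 here
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Illustrative strings only; not data from the paper.
    query = ngram_vector("information retrieval")
    docs = {
        "d1": ngram_vector("bilingual information retrieval at CLEF"),
        "d2": ngram_vector("sequence alignment of proteins"),
    }
    ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]),
                    reverse=True)
    print(ranked)
```

Because the vectors are built from character n-grams rather than words, the same code runs unchanged on Russian, Spanish, or English text, which is the property that makes n-gram models attractive for the monolingual and bilingual tasks described above.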
