Open Access
Outlier Detection in Automatic Collocation Extraction
Author(s) -
Octavio Santana Suárez,
Isabel Sánchez Berriel,
José Pérez Aguiar,
Virginia Gutiérrez Rodríguez
Publication year - 2015
Publication title -
Procedia - Social and Behavioral Sciences
Language(s) - English
Resource type - Journals
ISSN - 1877-0428
DOI - 10.1016/j.sbspro.2015.07.463
Subject(s) - collocation (remote sensing) , outlier , computer science , artificial intelligence , natural language processing , word (group theory) , sample (material) , volume (thermodynamics) , pattern recognition (psychology) , linguistics , machine learning , chemistry , physics , chromatography , quantum mechanics , philosophy
In this paper we analyse several word association measures commonly used for the automatic extraction of collocations from textual corpora. Specifically, we consider relative frequency, mutual information, z-score, t-score and Dunning's test. The volume of the corpus handled (3 words) requires revisiting the usual approach to the problem, so we propose a solution based on methods used to detect statistical outliers. The results show that many free combinations are extracted along with collocations, and that these arise from comparing words with very different frequencies of use. For this reason, the measures are applied by treating each word as generating its own sample, instead of producing rankings from the corpus considered as a single sample. The experiment is also performed on a corpus with far fewer words, and the results are contrasted with those obtained on the full corpus. The conclusions and contributions address the automatic extraction of collocations from a textual corpus regardless of its volume.
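The per-word idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which outlier test is used, so the upper Tukey fence (Q3 + 1.5 IQR) is an assumption here, as is the restriction to adjacent word pairs. The function names (`bigram_counts`, `association_scores`, `per_word_outliers`) and the parameters `k` and `min_pairs` are hypothetical. The association measures follow their usual textbook definitions; Dunning's test is left out for brevity.

```python
import math
from collections import Counter, defaultdict
from statistics import quantiles


def bigram_counts(tokens):
    """Count unigrams and adjacent-word bigrams in a list of tokens."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def association_scores(c1, c2, c12, n):
    """Textbook association measures for a pair (w1, w2).

    c1, c2: counts of w1 and w2; c12: count of the pair; n: corpus size.
    """
    expected = c1 * c2 / n  # pair count expected under independence
    return {
        "rel_freq": c12 / c1,
        "mi": math.log2(c12 / expected),
        "t_score": (c12 - expected) / math.sqrt(c12),
        "z_score": (c12 - expected) / math.sqrt(expected),
    }


def per_word_outliers(tokens, measure="mi", k=1.5, min_pairs=4):
    """Flag, for each first word, the pairs whose score is an upper outlier
    within that word's own sample of pairs (assumed rule: Tukey fence)."""
    unigrams, bigrams = bigram_counts(tokens)
    n = len(tokens)

    # Each first word w1 generates its own sample of (w2, score) values.
    by_word = defaultdict(list)
    for (w1, w2), c12 in bigrams.items():
        scores = association_scores(unigrams[w1], unigrams[w2], c12, n)
        by_word[w1].append((w2, scores[measure]))

    flagged = {}
    for w1, pairs in by_word.items():
        if len(pairs) < min_pairs:  # too few pairs to estimate quartiles
            continue
        q1, _, q3 = quantiles([s for _, s in pairs], n=4)
        fence = q3 + k * (q3 - q1)
        hits = [(w2, s) for w2, s in pairs if s > fence]
        if hits:
            flagged[w1] = sorted(hits, key=lambda p: -p[1])
    return flagged
```

Calling `per_word_outliers(tokens, measure="t_score")` on a tokenised corpus returns, for each first word, only the partners that stand out against that word's other combinations. This avoids ranking pairs of very frequent and very rare words on a single corpus-wide scale, which is the situation the abstract associates with extracting free combinations.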
