Retrieving Domain‐Specific Collocations by Co‐occurrences and Word Order Constraints
Author(s) - Shimohata Sayori, Sugio Toshiyuki, Nagata Junji
Publication year - 1999
Publication title - Computational Intelligence
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.353
H-Index - 52
eISSN - 1467-8640
pISSN - 0824-7935
DOI - 10.1111/0824-7935.00085
Subject(s) - computer science , natural language processing , artificial intelligence , domain (mathematical analysis) , word (group theory) , speech recognition , linguistics , mathematics , mathematical analysis , philosophy
In this paper, we describe a method for automatically retrieving collocations from large text corpora. The method has two stages: (1) extracting strings of characters as units of collocations, and (2) extracting recurrent combinations of those strings as collocations. It retrieves various types of domain‐specific collocations simultaneously. The method is practical because it works on plain text and requires no language‐specific information, such as lexical knowledge or parts of speech. Experimental results on English and Japanese text corpora show that the method is equally applicable to both languages.
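The two stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not say how units are delimited or how combinations are scored, so this sketch assumes whitespace-tokenized text, treats any word n-gram that recurs as a unit, and counts ordered, non-overlapping unit pairs that appear in the same sentence. The function names (`extract_units`, `extract_combinations`), the thresholds, and the sample corpus are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def extract_units(sentences, max_len=2, min_count=2):
    """Stage 1 (simplified assumption): keep word n-grams that recur in the corpus."""
    counts = Counter()
    for tokens in sentences:
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return {gram for gram, c in counts.items() if c >= min_count}


def extract_combinations(sentences, units, min_count=2):
    """Stage 2 (simplified assumption): count ordered pairs of non-overlapping
    units that co-occur in a sentence, at most once per sentence."""
    pair_counts = Counter()
    for tokens in sentences:
        # Locate every occurrence of every unit, scanning left to right.
        found = []
        for i in range(len(tokens)):
            for unit in units:
                if tuple(tokens[i:i + len(unit)]) == unit:
                    found.append((i, unit))
        seen = set()
        for (i, a), (j, b) in combinations(found, 2):
            if i + len(a) <= j:  # word order constraint: a ends before b starts
                seen.add((a, b))
        pair_counts.update(seen)
    return {pair: c for pair, c in pair_counts.items() if c >= min_count}


# Hypothetical toy corpus, tokenized on whitespace.
corpus = [s.split() for s in [
    "refer to the manual for details",
    "please refer to the appendix for details",
    "see the manual",
]]
units = extract_units(corpus)
pairs = extract_combinations(corpus, units)
```

In this toy corpus, `("refer", "to")` and `("for", "details")` both recur, and the ordered pair of the two is kept because it appears in two sentences, which mirrors a discontinuous collocation of the form "refer to ... for details". Raw frequency thresholds are used here only to keep the example short.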