Design and Implementation of the Compound Noun Segmentation Algorithm Based on Statistical Information
Author(s) - Chang-Geun Kim, Han-Ho Tack
Publication year - 2004
Publication title - International Journal of Fuzzy Logic and Intelligent Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.296
H-Index - 9
eISSN - 2093-744X
pISSN - 1598-2645
DOI - 10.5391/ijfis.2004.4.3.306
Subject(s) - noun, affix, segmentation, syllable, preference, computer science, natural language processing, artificial intelligence, compound, pattern recognition (psychology), algorithm, speech recognition, mathematics, statistics
This paper proposes a reverse segmentation algorithm that uses affix information and preference pattern information for Korean compound nouns. The structure of Korean compound nouns is largely derived from Chinese characters and exhibits preference patterns, which this paper uses as segmentation rules. To evaluate the accuracy of the proposed algorithm, an experiment was performed on 36,061 compound nouns. The algorithm segmented 99.3% of them correctly and compared favorably with other algorithms in comparative experiments. In particular, most four-syllable and five-syllable compound nouns were segmented correctly.
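The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea of reverse (right-to-left) segmentation with an affix check. It is not the authors' implementation. The lexicon `NOUNS`, the suffix list `SUFFIXES`, the function name `segment_reverse`, and the `max_len` limit are all hypothetical, and the preference patterns mentioned in the abstract are not modeled. The sketch assumes the input is in precomposed (NFC) Hangul, so that one character corresponds to one syllable.

```python
from typing import List, Optional, Set

# Hypothetical toy lexicon and suffix list for illustration only;
# neither is taken from the paper.
NOUNS: Set[str] = {"정보", "검색", "시스템", "자연", "언어", "처리"}
SUFFIXES: Set[str] = {"화", "성", "적"}


def segment_reverse(
    word: str,
    nouns: Set[str] = NOUNS,
    suffixes: Set[str] = SUFFIXES,
    max_len: int = 4,
) -> Optional[List[str]]:
    """Segment a compound noun from right to left.

    At each step the longest candidate ending at the current right edge
    is tried first. A candidate is accepted if it is a known noun, or a
    known noun followed by a one-syllable suffix. If the remaining left
    part cannot be segmented, the next shorter candidate is tried
    (backtracking). Returns None when no full segmentation exists.
    """
    if not word:
        return []
    for size in range(min(max_len, len(word)), 0, -1):
        tail = word[-size:]   # candidate unit at the right edge
        head = word[:-size]   # part still to be segmented
        is_noun = tail in nouns
        # Affix check: noun stem plus a one-syllable suffix.
        is_derived = (
            len(tail) > 1 and tail[-1] in suffixes and tail[:-1] in nouns
        )
        if is_noun or is_derived:
            rest = segment_reverse(head, nouns, suffixes, max_len)
            if rest is not None:
                return rest + [tail]
    return None


if __name__ == "__main__":
    for compound in ["정보검색시스템", "정보화시스템", "자연언어처리"]:
        print(compound, segment_reverse(compound))
```

In this sketch a suffix stays attached to the noun before it, so "정보화시스템" is split into "정보화" and "시스템" rather than into three units. A statistics-based version, as the title suggests, would presumably rank competing candidates by corpus frequencies or preference patterns instead of simply preferring the longest match; that ranking is left out here because the abstract gives no details.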