KEYWORD EXTRACTION STRATEGY FOR ITEM BANKS TEXT CATEGORIZATION
Author(s) - Nuntiyagul Atorn, Naruedomkul Kanlaya, Cercone Nick, Wongsawang Damras
Publication year - 2007
Publication title - Computational Intelligence
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.353
H-Index - 52
eISSN - 1467-8640
pISSN - 0824-7935
DOI - 10.1111/j.1467-8640.2007.00293.x
Subject(s) - categorization, computer science, selection (genetic algorithm), keyword extraction, natural language processing, phrase, sentence, artificial intelligence, text categorization, feature selection, information retrieval
We propose a feature selection approach, Patterned Keyword in Phrase (PKIP), for text categorization of item banks. An item bank is a collection of textual question items, each of which is a short sentence. Such a sentence does not contain enough relevant words to be categorized directly by traditional approaches such as “bag‐of‐words.” PKIP was therefore designed to categorize question items using only the available keywords and their patterns. PKIP identifies the appropriate keywords by computing a weight for every word. In this paper, two keyword selection strategies are suggested to ensure the categorization accuracy of PKIP. PKIP was implemented and tested with an item bank of Thai high primary mathematics questions. The test results show that PKIP is able to categorize the question items correctly and that the two keyword selection strategies can extract highly informative keywords.
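
The abstract does not state the weighting formula, the keyword patterns, or the two selection strategies, so the following is only a rough sketch of the general idea of weighting words and then categorizing a short question by its keywords. Everything in it is an assumption made for illustration: the function names, the toy English questions, the category labels, and the weight itself (the share of a word's occurrences that fall in one category), which stands in for whatever weight PKIP actually computes. Keyword patterns are not modeled at all.

```python
from collections import Counter, defaultdict


def word_weights(items):
    """Weight each word per category (hypothetical stand-in, not PKIP's formula).

    items: list of (question_text, category) pairs.
    The weight is the fraction of the items containing the word
    that belong to the given category.
    """
    per_category = defaultdict(Counter)
    total = Counter()
    for text, category in items:
        for word in set(text.lower().split()):
            per_category[category][word] += 1
            total[word] += 1
    return {
        category: {word: n / total[word] for word, n in counts.items()}
        for category, counts in per_category.items()
    }


def select_keywords(weights, threshold=1.0):
    """Keep only the words whose weight reaches the (assumed) threshold."""
    return {
        category: {w: s for w, s in scores.items() if s >= threshold}
        for category, scores in weights.items()
    }


def categorize(text, keywords):
    """Return the category whose keywords best match the question, or None."""
    words = set(text.lower().split())
    scores = {
        category: sum(s for w, s in kws.items() if w in words)
        for category, kws in keywords.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Toy training items, invented for this example only.
toy_items = [
    ("what is the area of a square with side 4", "geometry"),
    ("find the perimeter of a rectangle", "geometry"),
    ("what is the sum of 3 and 5", "arithmetic"),
    ("find the product of 6 and 7", "arithmetic"),
]

toy_keywords = select_keywords(word_weights(toy_items))
label = categorize("what is the perimeter of a triangle", toy_keywords)
```

With the toy data, words shared by both categories (such as "what" or "of") receive a weight of 0.5 and are dropped by the threshold, so a new question is labeled only by words seen in a single category. A question with no matching keyword returns None rather than a guess.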