Open Access
Designing and developing an automatic interactive keyphrase extraction system with Unified Modeling Language (UML)
Author(s) - Song Min, Song IlYeol, Hu Xiaohua
Publication year - 2004
Publication title - Proceedings of the American Society for Information Science and Technology
Language(s) - English
Resource type - Journals
eISSN - 1550-8390
pISSN - 0044-7870
DOI - 10.1002/meet.1450410143
Subject(s) - computer science, unified modeling language, xml, wordnet, natural language processing, information extraction, artificial intelligence, plain text, applications of uml, the internet, information retrieval, programming language, world wide web, encryption, software, operating system
Abstract - Designing and developing a system that helps users digest and understand available information is a difficult challenge. In this paper, we discuss the design and development of an automatic interactive keyphrase extraction system, called KPSpotter, which can process data in various formats, such as XML, HTML, and plain text, over the Internet. KPSpotter combines the Information Gain data mining measure with several Natural Language Processing (NLP) techniques, such as the Part of Speech (POS) technique and First Occurrence of Term. To improve extraction accuracy, WordNet is incorporated into KPSpotter. We used the Unified Modeling Language (UML) to design and develop KPSpotter. UML modeling helps formalize the preliminary analysis model and supports iterative system design and development. We also tested system performance by comparing the keyphrases extracted by KPSpotter with those extracted by KEA, a well-known naïve Bayesian-based keyphrase extraction system. The experiments show that KPSpotter outperforms KEA in most test cases.
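
The abstract names two of the signals KPSpotter combines, Information Gain and First Occurrence of Term, without giving formulas. The sketch below illustrates the textbook definitions of both; it is not KPSpotter's implementation. The function names, the boolean feature encoding, and the whitespace tokenization are assumptions made for illustration only.

```python
import math


def entropy(pos, neg):
    """Binary entropy, in bits, of a set with `pos` positive and `neg` negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            result -= p * math.log2(p)
    return result


def information_gain(labels, feature_values):
    """Standard information gain of a boolean feature with respect to boolean labels.

    labels[i] is True when candidate i is a known keyphrase (hypothetical
    training data); feature_values[i] is the feature's value for candidate i.
    """
    total = len(labels)
    positives = sum(labels)
    base = entropy(positives, total - positives)
    remainder = 0.0
    for value in (True, False):
        subset = [label for label, f in zip(labels, feature_values) if f == value]
        if subset:
            pos = sum(subset)
            remainder += len(subset) / total * entropy(pos, len(subset) - pos)
    return base - remainder


def first_occurrence(phrase, text):
    """Relative position of the first occurrence of `phrase` in `text`.

    Returns 0.0 for a phrase at the very start and 1.0 when the phrase is
    absent. Uses naive lowercase whitespace tokenization (an assumption).
    """
    words = text.lower().split()
    target = phrase.lower().split()
    for i in range(len(words) - len(target) + 1):
        if words[i:i + len(target)] == target:
            return i / len(words)
    return 1.0
```

A feature that separates keyphrases from non-keyphrases perfectly yields the full entropy of the labels as its gain, while a feature independent of the labels yields a gain of zero. A lower first-occurrence value means the phrase appears earlier in the document.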