Open Access
Text Summarization in Multi Document Using Genetic Algorithm
Author(s) -
Nirwana Hendrastuty,
Azhari Sn
Publication year - 2021
Publication title -
Indonesian Journal of Computing and Cybernetics Systems
Language(s) - English
Resource type - Journals
eISSN - 2460-7258
pISSN - 1978-1520
DOI - 10.22146/ijccs.66026
Subject(s) - automatic summarization , computer science , information retrieval , natural language processing , sentence , similarity (geometry) , feature (linguistics) , value (mathematics) , focus (optics) , multi document summarization , precision and recall , identification (biology) , artificial intelligence , linguistics , image (mathematics) , machine learning , philosophy , physics , botany , biology , optics
Automatic text summarization produces a condensed representation of a document that retains its essence or main focus. In this study, summarization is performed with the extraction method, which builds a summary by copying the text considered most important or most informative from the source text [1]. Documents can be divided into two types: single documents and multi-documents. A multi-document input comes from many documents, drawn from one or more sources, that together contain more than one main idea. This study aims to summarize text using a Genetic Algorithm that takes into account the text features extracted for each chromosome. The features used are sentence position, positive keywords, negative keywords, similarity between sentences, sentences containing entity words, sentences containing numbers, sentence length, connections between sentences, and the number of connections between sentences. The number of chromosomes used is half the number of public complaints. The data consists of public complaints against the DIY government from February 2018 to July 2020, obtained from the e-lapor DIY website. In the test results, the average precision is 1, recall is 0.71, and F-measure is 0.79.
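The abstract does not state the chromosome encoding, the fitness function, or the genetic operators, so the following is only a minimal sketch of how extractive summarization with a Genetic Algorithm could look. It assumes a binary chromosome (one gene per sentence), a toy fitness built from three of the listed features (sentence position, sentences containing numbers, sentence length) with equal weights, truncation selection, one-point crossover, and bit-flip mutation. The function names, weights, parameter values, and sample sentences are all hypothetical and are not taken from the paper.

```python
import random
import re


def sentence_score(sentence, index, total):
    """Toy feature score; features and equal weights are assumptions."""
    position = 1.0 - index / total                        # earlier sentences score higher
    has_number = 1.0 if re.search(r"\d", sentence) else 0.0
    length = min(len(sentence.split()) / 20.0, 1.0)       # capped length feature
    return position + has_number + length


def fitness(chromosome, scores, max_sentences):
    """Sum of selected sentence scores; 0 if the summary is too long."""
    if sum(chromosome) > max_sentences:
        return 0.0
    return sum(s for gene, s in zip(chromosome, scores) if gene)


def summarize(sentences, max_sentences=2, pop_size=20, generations=50,
              mutation_rate=0.1, seed=0):
    """Return (selected sentences, best chromosome). Hypothetical helper."""
    rng = random.Random(seed)
    n = len(sentences)
    scores = [sentence_score(s, i, n) for i, s in enumerate(sentences)]

    def random_chromosome():
        # Start from feasible summaries with exactly max_sentences picks.
        chosen = set(rng.sample(range(n), min(max_sentences, n)))
        return [1 if i in chosen else 0 for i in range(n)]

    def fit(chromosome):
        return fitness(chromosome, scores, max_sentences)

    population = [random_chromosome() for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        population.sort(key=fit, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1) if n > 1 else 0   # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation on each gene.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = parents + children

    best = max(population, key=fit)
    return [s for gene, s in zip(best, sentences) if gene], best


if __name__ == "__main__":
    # Invented sample sentences, for illustration only.
    complaints = [
        "The street lights on the main road have been off for 3 weeks.",
        "Residents reported the problem to the local office.",
        "No repair schedule has been announced.",
        "About 40 households are affected every night.",
        "People hope the issue is handled soon.",
    ]
    summary, mask = summarize(complaints, max_sentences=2)
    print(mask)
    print(summary)
```

A fuller version would replace `sentence_score` with all nine features named in the abstract and would tune the weights and population size on the complaint data; the hard length limit used here is one simple way to keep summaries short, and a soft penalty would be an equally plausible choice.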