Generic text summarization using relevance measure and latent semantic analysis
Author(s) - Yihong Gong, Xin Liu
Publication year - 2001
Publication title - CiteSeerX (The Pennsylvania State University)
Language(s) - English
Resource type - Book series
ISBN - 1-58113-331-6
DOI - 10.1145/383952.383955
Subject(s) - automatic summarization, computer science, latent semantic analysis, relevance (law), information retrieval, redundancy (engineering), multi document summarization, weighting, natural language processing, ranking (information retrieval), sentence, rank (graph theory), text graph, artificial intelligence, mathematics, medicine, radiology, combinatorics, political science, law, operating system
In this paper, we propose two generic text summarization methods that create summaries by ranking and extracting sentences from the original documents. The first method uses standard information retrieval (IR) methods to rank sentences by relevance, while the second uses latent semantic analysis to identify semantically important sentences. Both methods strive to select sentences that are highly ranked and different from each other, in an attempt to create a summary with wider coverage of the document's main content and less redundancy. We evaluate the two methods by comparing their outputs with manual summaries generated by three independent human evaluators. The evaluations also study the influence of different vector space model (VSM) weighting schemes on summarization performance. Finally, we investigate the causes of the large disparities among the evaluators' manual summaries and discuss patterns in human text summarization.
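The two approaches named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give algorithmic details, so everything below is an assumption made for illustration: the function names, the raw term-frequency weighting, the greedy step that zeroes out already-covered terms to reduce redundancy, and the rule of taking one sentence per leading right singular vector (using absolute values, since the sign of a singular vector is arbitrary). It should not be read as the authors' implementation.

```python
import re
import numpy as np


def term_sentence_matrix(sentences):
    """Build a raw term-frequency matrix (rows: terms, columns: sentences).

    Raw counts are an assumption; the abstract only says that several
    VSM weighting schemes were compared.
    """
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    vocab = sorted({t for toks in tokenized for t in toks})
    index = {t: i for i, t in enumerate(vocab)}
    A = np.zeros((len(vocab), len(sentences)))
    for j, toks in enumerate(tokenized):
        for t in toks:
            A[index[t], j] += 1.0
    return A


def summarize_by_relevance(sentences, k):
    """Greedy relevance ranking (hypothetical variant).

    Repeatedly pick the sentence whose vector has the largest inner
    product with the document vector, then zero out the terms it
    contains so that later picks cover different content.
    """
    A = term_sentence_matrix(sentences)
    available = np.ones(len(sentences), dtype=bool)
    chosen = []
    for _ in range(min(k, len(sentences))):
        doc = A.sum(axis=1)                      # document vector
        scores = np.where(available, doc @ A, -np.inf)
        best = int(np.argmax(scores))
        chosen.append(best)
        available[best] = False
        A[A[:, best] > 0, :] = 0.0               # drop covered terms everywhere
    return sorted(chosen)


def summarize_by_lsa(sentences, k):
    """LSA-style selection (hypothetical variant).

    Take the SVD of the term-by-sentence matrix and, for each of the
    first k right singular vectors, pick the not-yet-chosen sentence
    with the largest absolute component.
    """
    A = term_sentence_matrix(sentences)
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    chosen = []
    for row in Vt[:k]:
        order = np.argsort(-np.abs(row))
        best = next(int(j) for j in order if int(j) not in chosen)
        chosen.append(best)
    return sorted(chosen)


if __name__ == "__main__":
    doc = [
        "Cats chase mice in the barn.",
        "Cats chase mice in the barn at night.",
        "Stock markets fell sharply on Monday.",
    ]
    for idx in summarize_by_relevance(doc, 2):
        print("relevance:", doc[idx])
    for idx in summarize_by_lsa(doc, 2):
        print("lsa:", doc[idx])
```

In this toy input the first two sentences overlap almost entirely, so a selection rule that favors both high rank and dissimilarity should keep only one of them alongside the unrelated third sentence. Swapping the raw counts in `term_sentence_matrix` for another weighting (for example, a tf-idf variant) is the natural place to experiment with the weighting schemes the abstract mentions.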