Domain‐independent automatic keyphrase indexing with small training sets

Open Access

Author(s) - Medelyan Olena, Witten Ian H.

Publication year - 2008

Publication title - Journal of the American Society for Information Science and Technology

Language(s) - English

Resource type - Journals

eISSN - 1532-2890

pISSN - 1532-2882

DOI - 10.1002/asi.20790

Subject(s) - computer science, information retrieval, search engine indexing, consistency (knowledge bases), automatic indexing, domain (mathematical analysis), set (abstract data type), thesaurus, cataloging, key (lock), vocabulary, natural language processing, world wide web, artificial intelligence, linguistics, programming language, mathematical analysis, philosophy, mathematics, computer security

Keyphrases are widely used in both physical and digital libraries as a brief, but precise, summary of documents. They help organize material based on content, provide thematic access, represent search results, and assist with navigation. Manual assignment is expensive because trained human indexers must reach an understanding of the document and select appropriate descriptors according to defined cataloging rules. We propose a new method that enhances automatic keyphrase extraction by using semantic information about terms and phrases gleaned from a domain‐specific thesaurus. The key advantage of the new approach is that it performs well with very little training data. We evaluate it on a large set of manually indexed documents in the domain of agriculture, compare its consistency with a group of six professional indexers, and explore its performance on smaller collections of documents in other domains and of French and Spanish documents.
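
As a rough illustration of the two ideas in the abstract, the sketch below maps phrases in a document onto descriptors from a controlled vocabulary and then scores the agreement between two sets of assigned keyphrases. It is not the authors' implementation: the toy thesaurus, the function names, and the sample text are all hypothetical, and the trained ranking step the abstract alludes to is left out. The agreement score is Rolling's inter-indexer consistency, 2C / (A + B), one commonly used measure; that it matches the measure used in the article's evaluation is an assumption.

```python
from collections import Counter

# Hypothetical toy thesaurus: each surface form maps to a preferred descriptor.
# A real domain thesaurus would be far larger and loaded from a file.
THESAURUS = {
    "maize": "maize",
    "corn": "maize",          # non-preferred synonym
    "erosion": "erosion",
    "fertilizers": "fertilizers",
    "crop yield": "crop yield",
}


def candidate_descriptors(text, max_len=3):
    """Count thesaurus descriptors matched by word n-grams in the text."""
    words = [w.strip(".,;:()").lower() for w in text.split()]
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            phrase = " ".join(words[i:i + n])
            if phrase in THESAURUS:
                counts[THESAURUS[phrase]] += 1
    return counts


def rolling_consistency(a, b):
    """Rolling's consistency between two keyphrase sets: 2C / (A + B)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty sets agree trivially
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    doc = "Corn and maize suffer when erosion is severe. Erosion lowers crop yield."
    counts = candidate_descriptors(doc)
    print(counts)

    automatic = set(counts)                          # descriptors found above
    human = {"maize", "erosion", "fertilizers"}      # hypothetical indexer's choice
    print(rolling_consistency(automatic, human))
```

Mapping synonyms such as "corn" onto one preferred descriptor is what lets matches from different surface forms reinforce each other. A full system would go on to rank these candidates with features learned from training documents.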
