Evaluation of header metadata extraction approaches and tools for scientific PDF documents

Mario Lipinski, Kevin Yao, Corinna Breitinger, Joeran Beel, Béla Gipp


Open Access



Publication year - 2013

Publication title - KOPS (University of Konstanz)

Language(s) - English

Resource type - Conference proceedings

ISSN - 2575-7865

DOI - 10.1145/2467696.2467753

Subject(s) - metadata, computer science, header, metadata repository, digital library, data element, information retrieval, task (project management), meta data services, world wide web, metadata management, software, strengths and weaknesses, metadata modeling, database, engineering, art, computer network, philosophy, literature, poetry, systems engineering, epistemology, programming language

This paper evaluates the performance of tools for the extraction of metadata from scientific articles. Accurate metadata extraction is an important task for automating the management of digital libraries. This comparative study is a guide for developers looking to integrate the most suitable and effective metadata extraction tool into their software. We shed light on the strengths and weaknesses of seven tools in common use. In our evaluation using papers from the arXiv collection, GROBID delivered the best results, followed by Mendeley Desktop. SciPlore Xtract, PDFMeat, and SVMHeaderParse also delivered good results depending on the metadata type to be extracted.
