
Keyphrase Extraction for Technical Language Processing
Author(s) - Alden A. Dima, Aaron K. Massey
Publication year - 2022
Publication title - Journal of Research of the National Institute of Standards and Technology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.202
H-Index - 59
eISSN - 2165-7254
pISSN - 1044-677X
DOI - 10.6028/jres.126.053
Subject(s) - computer science, semeval, classifier (uml), natural language processing, metadata, information retrieval, artificial intelligence, measure (data warehouse), task (project management), world wide web, database, management, economics
Abstract - Keyphrase extraction is an important facet of annotation tools that provide the metadata necessary for technical language processing (TLP). Because TLP imposes additional requirements on typical natural language processing (NLP) methods, we examined TLP keyphrase extraction through the lens of a hypothetical toolkit consisting of a combination of text features and classifiers suitable for use in low-resource TLP applications. We compared two approaches to keyphrase extraction: the first applied our toolkit-based methods, which used only distributional features of words and phrases; the second was the Maui automatic topic indexer, a well-known academic method. Performance was measured against two collections of technical literature: 1153 articles from the Journal of Chemical Thermodynamics (JCT) curated by the National Institute of Standards and Technology Thermodynamics Research Center (TRC) and 244 articles from Task 5 of the Workshop on Semantic Evaluation (SemEval). Both collections have author-provided keyphrases available; the SemEval articles also have reader-provided keyphrases. Our findings indicate that our toolkit approach was competitive with Maui when author-provided keyphrases were first removed from the text. For the TRC-JCT articles, the Maui automatic topic indexer reported an F-measure of 29.4 % while our toolkit approach obtained an F-measure of 28.2 %. For the SemEval articles, our toolkit approach using a Naïve Bayes classifier resulted in an F-measure of 20.8 %, which outperformed Maui's F-measure of 18.8 %.
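
The abstract describes keyphrase extraction as classifying candidate phrases with distributional features and a Naïve Bayes classifier, evaluated against gold keyphrases via the F-measure (the harmonic mean of precision and recall). The sketch below is an illustrative reconstruction under those assumptions only; it is not the authors' toolkit or the Maui indexer, and the candidate generation, the specific features (term frequency, relative first-occurrence position, phrase length), and the use of scikit-learn's GaussianNB are all assumptions made for the example.

# Illustrative sketch only: keyphrase extraction framed as binary classification
# of candidate phrases using simple distributional features and a Naive Bayes
# classifier. Helper names and feature choices are assumptions for illustration.
import re
from sklearn.naive_bayes import GaussianNB  # assumed stand-in for the paper's Naive Bayes classifier

def candidate_phrases(text, max_len=3):
    """Tokenize and return all unigram-to-trigram candidate phrases."""
    tokens = re.findall(r"[a-z][a-z0-9-]+", text.lower())
    cands = set()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            cands.add(" ".join(tokens[i:i + n]))
    return tokens, sorted(cands)

def features(phrase, tokens):
    """Distributional features: term frequency, relative first position, phrase length."""
    token_str = " " + " ".join(tokens) + " "
    padded = " " + phrase + " "
    tf = token_str.count(padded)
    first = token_str.find(padded) / len(token_str)
    return [tf, first, len(phrase.split())]

def build_training_data(documents, gold_keyphrases):
    """Label each candidate 1 if it matches a gold (e.g., author-provided) keyphrase, else 0."""
    X, y = [], []
    for text, gold in zip(documents, gold_keyphrases):
        gold = {g.lower() for g in gold}
        tokens, cands = candidate_phrases(text)
        for c in cands:
            X.append(features(c, tokens))
            y.append(1 if c in gold else 0)
    return X, y

def extract_keyphrases(model, text, top_k=10):
    """Rank an unseen document's candidates by the posterior probability of the keyphrase class."""
    tokens, cands = candidate_phrases(text)
    scores = model.predict_proba([features(c, tokens) for c in cands])[:, 1]
    ranked = sorted(zip(cands, scores), key=lambda p: p[1], reverse=True)
    return [c for c, _ in ranked[:top_k]]

if __name__ == "__main__":
    docs = ["Excess molar enthalpies of binary mixtures were measured by flow calorimetry."]
    gold = [["binary mixtures", "flow calorimetry"]]
    model = GaussianNB().fit(*build_training_data(docs, gold))
    print(extract_keyphrases(model, docs[0], top_k=5))

In this framing, the reported F-measures would come from comparing the top-ranked candidates against the author- or reader-provided keyphrases for each article.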