
Discriminating between empirical studies and nonempirical works using automated text classification

Author(s) - Langlois Alexis, Nie JianYun, Thomas James, Hong Quan Nha, Pluye Pierre

Publication year - 2018

Publication title - Research Synthesis Methods

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.376

H-Index - 35

eISSN - 1759-2887

pISSN - 1759-2879

DOI - 10.1002/jrsm.1317

Subject(s) - computer science, filter (signal processing), set (abstract data type), empirical research, information retrieval, data mining, data set, artificial intelligence, statistics, mathematics, computer vision, programming language

Objective: Identify the most performant automated text classification method (eg, algorithm) for differentiating empirical studies from nonempirical works in order to facilitate systematic mixed studies reviews.

Methods: The algorithms were trained and validated with 8050 database records, which had previously been manually categorized as empirical or nonempirical. A Boolean mixed filter developed for filtering MEDLINE records (title, abstract, keywords, and full texts) was used as a baseline. The set of features (eg, characteristics from the data) included observable terms and concepts extracted from a metathesaurus. The efficiency of the approaches was measured using sensitivity, precision, specificity, and accuracy.

Results: The decision trees algorithm demonstrated the highest performance, surpassing the accuracy of the Boolean mixed filter by 30%. The use of full texts did not result in significant gains compared with title, abstract, keywords, and records. Results also showed that mixing concepts with observable terms can improve the classification.

Significance: Screening of records, identified in bibliographic databases, for relevant studies to include in systematic reviews can be accelerated with automated text classification.
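
The four evaluation measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `evaluate`, the `ScreeningMetrics` container, and the ten toy labels are assumptions made purely for illustration, with "empirical" treated as the positive class.

```python
from dataclasses import dataclass


@dataclass
class ScreeningMetrics:
    sensitivity: float  # share of empirical records correctly kept
    precision: float    # share of kept records that are truly empirical
    specificity: float  # share of nonempirical records correctly rejected
    accuracy: float     # share of all records classified correctly


def _ratio(numerator: int, denominator: int) -> float:
    # Return 0.0 instead of raising when a class is absent from the data.
    return numerator / denominator if denominator else 0.0


def evaluate(gold: list[bool], predicted: list[bool]) -> ScreeningMetrics:
    """Compare manual labels (gold) with classifier output (predicted).

    True means "empirical", False means "nonempirical".
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)          # true positives
    fn = sum(1 for g, p in pairs if g and not p)      # false negatives
    fp = sum(1 for g, p in pairs if not g and p)      # false positives
    tn = sum(1 for g, p in pairs if not g and not p)  # true negatives

    return ScreeningMetrics(
        sensitivity=_ratio(tp, tp + fn),
        precision=_ratio(tp, tp + fp),
        specificity=_ratio(tn, tn + fp),
        accuracy=_ratio(tp + tn, len(pairs)),
    )


# Hypothetical toy data (not from the study): 4 empirical, 6 nonempirical.
gold = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]

metrics = evaluate(gold, predicted)
print(metrics)
```

In a screening setting, sensitivity is usually the measure to watch first, since a missed empirical study is lost to the review, whereas a false positive only costs some extra manual screening.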
