On Multilabel Classification Methods of Incompletely Labeled Biomedical Text Data

Anton Kolesov; Dmitry Kamyshenkov; Maria Litovchenko; Elena M. Smekalova; Alexey Golovizin; Alex Zhavoronkov

Open Access

Publication year - 2014

Publication title - Computational and Mathematical Methods in Medicine

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.462

H-Index - 48

eISSN - 1748-6718

pISSN - 1748-670X

DOI - 10.1155/2014/781807

Subject(s) - classifier (uml), computer science, artificial intelligence, training set, pattern recognition (psychology), support vector machine, labeled data, machine learning, set (abstract data type), similarity (geometry), data mining, image (mathematics), programming language

Multilabel classification is often hindered by incompletely labeled training datasets: for some items of such a dataset (or even for all of them), some labels may be omitted. In this case, we cannot know whether any given item is labeled fully and correctly. A classifier trained directly on an incompletely labeled dataset performs poorly. To overcome this problem, we added an extra step, training set modification, before training a classifier. In this paper, we try two algorithms for training set modification: weighted k-nearest neighbor (WkNN) and soft supervised learning (SoftSL). Both approaches are based on similarity measurements between data vectors. We performed the experiments on AgingPortfolio (a text dataset) and then rechecked the results on Yeast (a nontext genetic dataset). We tried SVM and RF classifiers on the original datasets and then on the modified ones. For each dataset, our experiments demonstrated that both classification algorithms performed considerably better when preceded by the training set modification step.
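The abstract does not spell out how the training set is modified, so the following is only a minimal sketch of what a similarity-weighted k-nearest-neighbor label completion step could look like, not the authors' WkNN algorithm. The function name `wknn_complete_labels`, the use of cosine similarity, and the parameters `k` and `threshold` are all assumptions made for illustration. The sketch only adds labels and never removes existing ones, on the assumption that incomplete labeling means missing positives.

```python
import numpy as np

def wknn_complete_labels(X, Y, k=3, threshold=0.5):
    """Hypothetical sketch: add likely-missing labels to a binary label
    matrix Y using similarity-weighted votes of each item's k nearest
    neighbours. Not the paper's exact procedure."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=int)

    # Cosine similarity between all pairs of rows (assumed similarity measure).
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.where(norms == 0, 1.0, norms)
    S = Xn @ Xn.T
    np.fill_diagonal(S, -np.inf)  # an item is not its own neighbour

    Y_new = Y.copy()
    for i in range(X.shape[0]):
        nn = np.argsort(S[i])[-k:]        # indices of the k most similar items
        w = np.clip(S[i, nn], 0.0, None)  # use non-negative similarities as weights
        if w.sum() == 0:
            continue                      # no informative neighbours
        scores = w @ Y[nn] / w.sum()      # weighted label frequency per label
        # Keep every original label; add those the neighbours agree on.
        Y_new[i] = np.maximum(Y[i], (scores >= threshold).astype(int))
    return Y_new
```

In a pipeline like the one the abstract describes, the completed label matrix would replace the original one when fitting the downstream classifier (SVM or RF in the paper); the values of `k` and `threshold` would need to be tuned on held-out data.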
