Improving Label Noise Filtering by Exploiting Unlabeled Data
Author(s) - Donghai Guan, Hongqiang Wei, Weiwei Yuan, Guangjie Han, Yuan Tian, Mohammed Al-Dhelaan, Abdullah Al-Dhelaan
Publication year - 2018
Publication title - IEEE Access
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.587
H-Index - 127
ISSN - 2169-3536
DOI - 10.1109/access.2018.2807779
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
With the significant growth in the scale of data, an increasing amount of training data is available in many machine learning tasks. However, it is difficult to ensure perfect labeling with a large volume of training data. Some labels can be incorrect, resulting in label noise, which can degrade learning performance. A common way to address label noise is to apply noise filtering techniques that identify and remove mislabeled samples prior to learning. Multiple noise filtering approaches have been proposed. However, almost all existing work focuses only on mislabeled training data and ignores the existence of unlabeled data. In fact, unlabeled data are common in many applications, and their value has been extensively studied and recognized. Therefore, in this paper, we explore the effective use of unlabeled data to improve noise filtering performance. To this end, we propose a novel noise filtering algorithm called enhanced soft majority voting by exploiting unlabeled data (ESMVU), which is an ensemble-learning-based filter that adopts a soft majority voting strategy. ESMVU provides a systematic way to measure the value of unlabeled data by considering different aspects, such as label confidence and the sample distribution. Finally, the effectiveness of the proposed method is confirmed by experiments and comparison with other methods.
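
To make the idea of an ensemble-based noise filter with soft voting more concrete, the following is a minimal Python sketch. It is not the ESMVU algorithm from the paper: the abstract does not give its details, and the use of unlabeled data is not modeled here at all. The sketch assumes that "soft" voting means averaging each ensemble member's probability estimate for a sample's given label, that the ensemble consists of k-nearest-neighbor classifiers with different k, and that a sample is flagged when the averaged support falls below a threshold. The function names, the fold assignment, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def knn_proba(train_X, train_y, x, k, classes):
    """Estimate class probabilities for x from its k nearest training points."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    nearest = order[:k]
    counts = Counter(train_y[i] for i in nearest)
    return {c: counts.get(c, 0) / len(nearest) for c in classes}


def soft_vote_filter(X, y, ks=(1, 3, 5), n_folds=3, threshold=0.5):
    """Return sorted indices of samples whose given label looks suspicious.

    Assumption: each ensemble member is a k-NN classifier (one per value in ks),
    trained on the other folds; the soft vote is the mean probability the
    members assign to the sample's given label.
    """
    classes = sorted(set(y))
    n = len(X)
    suspects = []
    for fold in range(n_folds):
        # Simple interleaved fold assignment: sample i belongs to fold i % n_folds.
        test_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in test_idx:
            support = sum(
                knn_proba(train_X, train_y, X[i], k, classes)[y[i]] for k in ks
            ) / len(ks)
            if support < threshold:
                suspects.append(i)
    return sorted(suspects)


# Toy data (hypothetical): two well-separated clusters, with sample 2 mislabeled.
X = [
    (0.0, 0.1), (0.3, 1.0), (1.1, 0.2), (0.9, 1.2), (0.5, 0.4), (0.2, 0.7),
    (10.0, 10.1), (10.3, 11.0), (11.1, 10.2), (10.9, 11.2), (10.5, 10.4), (10.2, 10.7),
]
y = ["a", "a", "b", "a", "a", "a", "b", "b", "b", "b", "b", "b"]

suspects = soft_vote_filter(X, y)
print(suspects)
```

On this toy data the filter flags only index 2, the sample from the first cluster that carries the second cluster's label. Averaging probabilities rather than counting hard votes lets a single confident member outweigh several uncertain ones, which is the usual motivation for soft voting. How ESMVU itself weights members and incorporates unlabeled samples is described in the full paper, not here.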
