CSD: Channel Selection Dropout for Regularization of Convolutional Neural Networks
Author(s) - Imrus Salehin, Dae-Ki Kang
Publication year - 2025
Publication title - IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/ACCESS.2025.3616631
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
In this study, we present Channel Selection Dropout (CSD), a novel approach for regularizing deep convolutional neural network (CNN) architectures. Unlike standard Dropout, which randomly deactivates neurons in fully connected layers, CSD operates on the feature-map channels within the sequence of convolutional layers. Specifically, CSD is composed of three modules: a Channel Process Module, a Channel Drop Module, and a Scale Module. CSD identifies significant channels based on their activation values: channels whose values exceed a user-defined threshold α are preserved, while less significant channels are set to zero. CSD is applied only during training, where it adds minimal computational cost; in the testing phase, the network retains its original state, so inference incurs no added expense. Moreover, integrating CSD into existing networks does not require re-pretraining on ImageNet, which lets it fit seamlessly with other datasets. Finally, we experimentally evaluate the performance of CSD with ResNet-18, ResNet-50, and VGGNet-16 across multiple datasets. Our results demonstrate that setting α = 0.60 significantly enhances performance, with most results exceeding 95% accuracy on the benchmark datasets; the optimal α, however, may vary and can be tuned to the specific dataset and architecture. The comprehensive results show that CSD consistently improves performance over the baselines. This method can be applied in future CNN applications to mitigate overfitting, particularly in image segmentation, Vision Transformers (ViT), and medical imaging.
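The three-module pipeline the abstract describes (score channels, drop those below α, rescale the survivors, and fall back to the identity at inference) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the exact channel-scoring rule and the scale compensation used here (mean activation normalized by the peak channel, then inverse-keep-rate scaling, as in standard Dropout) are assumptions.

```python
def channel_selection_dropout(feature_map, alpha=0.60, training=True):
    """Hedged sketch of Channel Selection Dropout (CSD).

    feature_map: list of channels, each a 2D list (H x W) of activations.
    alpha: user-defined threshold; channels whose normalized score falls
           below alpha are zeroed. The scoring (per-channel mean activation,
           normalized by the largest channel mean) is a hypothetical choice,
           not necessarily the paper's exact formulation.
    """
    if not training:
        # Identity at test time: the network keeps its original state,
        # so inference incurs no overhead.
        return feature_map

    n = len(feature_map)
    # Channel Process Module: score each channel by its mean activation.
    scores = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
              for ch in feature_map]
    peak = max(scores)
    norm = [s / peak if peak else 0.0 for s in scores]

    # Channel Drop Module: keep channels scoring at or above alpha.
    keep = [s >= alpha for s in norm]
    n_kept = sum(keep) or 1

    # Scale Module: compensate surviving channels for the dropped ones
    # (inverse keep-rate, analogous to inverted Dropout -- an assumption).
    scale = n / n_kept
    out = []
    for ch, k in zip(feature_map, keep):
        if k:
            out.append([[v * scale for v in row] for row in ch])
        else:
            out.append([[0.0 for _ in row] for row in ch])
    return out
```

For example, with three 2x2 channels whose mean activations are 1.0, 0.5, and 0.1, only the first channel clears α = 0.60 after normalization; it is scaled by 3 (three channels, one kept) and the other two are zeroed.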