
Enhancing Task-Incremental Learning via a Prompt-Based Hybrid Convolutional Neural Networks (CNNs)-Vision Transformer (ViT) Framework
Author(s) -
Zuomin Yang,
Anis Salwa Mohd Khairuddin,
Joon Huang Chuah,
Wei Ru Wong,
Xin Xu,
Hafiz Muhammad Fahad Noman,
Qiyuan Qin
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3597020
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
Artificial neural network (ANN) models are widely used in fields such as image classification, multi-object detection, intent prediction, military applications, and natural language processing. However, artificial intelligence (AI) models for continual learning (CL) are not yet mature, and "catastrophic forgetting" (CF) remains a typical problem. The relationship between biological neural networks (BNNs) and ANN models also warrants further exploration. This paper therefore examines pre- and postsynaptic structures, the synaptic cleft, the early and late phases of long-term potentiation, and the effects of neurotransmitters on synaptic excitation and inhibition, and emphasizes the need to integrate insights from biological neural systems into ANN models of learning and memory. Building on the "Prompt Pool" concept, this paper designs a hybrid neural network (HNN) architecture that integrates convolutional neural networks (CNNs), Vision Transformers (ViT), prompt pools, and adapters to alleviate CF in task-incremental learning (TIL). Compared with the existing ViT-plus-prompt-pool architecture, this method achieves higher performance on the final task and shows advantages in sustaining accuracy across the TIL sequence. In future work, guided by principles of biological neuroscience, we will apply the HNN model to image classification and multi-object detection tasks in autonomous driving. By deepening our understanding of BNN mechanisms, we aim to develop efficient HNN models that adapt to dynamic environments and provide new solutions for CL.
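The abstract's core mechanism, selecting prompts from a pool and prepending them to the ViT token sequence, can be sketched as follows. This is a minimal illustration of key-query prompt selection, not the authors' implementation; all names, dimensions, and the use of a CNN feature vector as the query are assumptions.

```python
import numpy as np

class PromptPool:
    """Illustrative prompt pool: learnable keys select learnable prompts."""

    def __init__(self, pool_size=10, prompt_len=5, dim=64, top_k=3, seed=0):
        rng = np.random.default_rng(seed)
        self.keys = rng.standard_normal((pool_size, dim))                 # one key per prompt
        self.prompts = rng.standard_normal((pool_size, prompt_len, dim))  # prompt token blocks
        self.top_k = top_k

    def select(self, query):
        """Return the top-k prompts whose keys best match the query feature."""
        q = query / np.linalg.norm(query)
        k = self.keys / np.linalg.norm(self.keys, axis=1, keepdims=True)
        sims = k @ q                             # cosine similarity per key
        idx = np.argsort(-sims)[: self.top_k]    # indices of best-matching keys
        return np.concatenate(self.prompts[idx], axis=0)  # (top_k * prompt_len, dim)


def prepend_prompts(patch_tokens, pool, query):
    """Prepend the selected prompts to the ViT patch-token sequence."""
    return np.concatenate([pool.select(query), patch_tokens], axis=0)


# Usage: a CNN feature vector (hypothetical here) acts as the query;
# the sequence grows by top_k * prompt_len tokens before the transformer blocks.
pool = PromptPool()
tokens = np.zeros((196, 64))   # 14x14 grid of patch tokens, embedding dim 64
query = np.ones(64)            # stand-in for a CNN-derived feature
seq = prepend_prompts(tokens, pool, query)
print(seq.shape)               # (211, 64): 3 prompts x 5 tokens + 196 patches
```

In a hybrid CNN-ViT design, using a convolutional feature as the query is one plausible way to let low-level visual statistics drive prompt selection per task; the original paper may condition the query differently.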