Open Access
ICE-GRT: Instruction Context Enhancement by Generative Reinforcement based Transformers
Author(s)
Chen Zheng,
Ke Sun,
Da Tang,
Yukun Ma,
Yuyu Zhang,
Chenguang Xi,
Xun Zhou
Publication year: 2024
Large Language Models (LLMs) such as ChatGPT and LLaMA encounter limitations in domain-specific tasks: these models often lack depth and accuracy in specialized areas and exhibit a decrease in general capabilities when fine-tuned, particularly analysis ability in small-sized models. To address these gaps, we introduce ICE-GRT, utilizing Reinforcement Learning from Human Feedback (RLHF) grounded in Proximal Policy Optimization (PPO), demonstrating remarkable ability in in-domain scenarios without compromising general task performance. Our exploration of ICE-GRT highlights its understanding and reasoning ability to not only generate robust answers but also to provide detailed analyses of the reasons behind the answer. This capability marks a significant progression beyond the scope of Supervised Fine-Tuning models. The success of ICE-GRT depends on several crucial factors, including Appropriate Data, Reward Size Scaling, KL-Control, Advantage Normalization, etc. The ICE-GRT model exhibits state-of-the-art performance in domain-specific tasks and across 12 general language tasks against equivalent-size and even larger-size LLMs, highlighting the effectiveness of our approach. We provide a comprehensive analysis of ICE-GRT, underscoring the significant advancements it brings to the field of LLMs.
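The abstract credits ICE-GRT's stability to PPO-based RLHF with KL-control and advantage normalization. As a rough illustration of how those two ingredients typically enter a PPO objective, here is a minimal PyTorch sketch; the function name, hyperparameter values, and tensor shapes are assumptions for illustration, not the authors' implementation.

```python
import torch

def ppo_rlhf_loss(
    logprobs,        # (B, T) log-probs of taken tokens under the current policy
    old_logprobs,    # (B, T) log-probs under the policy that sampled the data
    ref_logprobs,    # (B, T) log-probs under the frozen reference (SFT) model
    advantages,      # (B, T) advantage estimates (e.g., from GAE)
    clip_eps=0.2,    # PPO clipping range (assumed value)
    kl_coef=0.1,     # weight of the KL penalty toward the reference (assumed)
):
    """Clipped PPO objective with KL-control and advantage normalization.

    Illustrative sketch only; not the ICE-GRT implementation.
    """
    # Advantage normalization: whiten advantages to stabilize updates.
    adv = (advantages - advantages.mean()) / (advantages.std() + 1e-8)

    # PPO clipped surrogate objective.
    ratio = torch.exp(logprobs - old_logprobs)
    unclipped = ratio * adv
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    policy_loss = -torch.min(unclipped, clipped).mean()

    # KL-control: penalize drift from the reference model so capabilities
    # learned in pretraining/SFT are not lost during RLHF.
    kl_penalty = (logprobs - ref_logprobs).mean()

    return policy_loss + kl_coef * kl_penalty


# Toy usage with random tensors standing in for model outputs.
B, T = 4, 16
lp = torch.randn(B, T)
loss = ppo_rlhf_loss(
    lp,
    lp.detach() + 0.01 * torch.randn(B, T),
    lp.detach() + 0.05 * torch.randn(B, T),
    torch.randn(B, T),
)
print(loss.item())
```

The KL term anchors the fine-tuned policy to the frozen SFT reference, which is how RLHF pipelines commonly avoid the loss of general capabilities the abstract describes.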
Language(s): English
