
LLaMA Pro: Progressive LLaMA with Block Expansion
Open Access
Author(s)
Chengyue Wu,
Yukang Gan,
Yixiao Ge,
Zeyu Lu,
Jiahao Wang,
Ye Feng,
Ping Luo,
Ying Shan
Publication year: 2024
Humans generally acquire new skills without compromising the old; however, the opposite holds for Large Language Models (LLMs), e.g., from LLaMA to CodeLLaMA. To this end, we propose a new post-pretraining method for LLMs with an expansion of Transformer blocks. We tune the expanded blocks using only new corpus, efficiently and effectively improving the model's knowledge without catastrophic forgetting. In this paper, we experiment on the corpus of code and math, yielding LLaMA Pro-8.3B, a versatile foundation model initialized from LLaMA2-7B, excelling in general tasks, programming, and mathematics. LLaMA Pro and its instruction-following counterpart (LLaMA Pro-Instruct) achieve advanced performance among various benchmarks, demonstrating superiority over existing open models in the LLaMA family and the immense potential of reasoning and addressing diverse tasks as an intelligent agent. Our findings provide valuable insights into integrating natural and programming languages, laying a solid foundation for developing advanced language agents that operate effectively in various environments.
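The abstract only sketches the mechanism, so below is a minimal PyTorch-style illustration of the block-expansion idea, assuming Hugging Face-style LLaMA decoder layers that expose `self_attn.o_proj` and `mlp.down_proj` (the function name `expand_blocks` and the grouping logic are illustrative, not the authors' released code): copied blocks are interleaved into the stack, their output projections are zeroed so each copy starts as an identity mapping, and only these copies are trained on the new corpus while the original blocks stay frozen.

```python
import copy
import torch.nn as nn

def expand_blocks(layers: nn.ModuleList, groups: int) -> nn.ModuleList:
    """Interleave one identity-initialized copy after each group of original
    decoder blocks. Original blocks are frozen; only the copies are tuned."""
    group_size = len(layers) // groups
    expanded = []
    for i, block in enumerate(layers):
        block.requires_grad_(False)           # keep original knowledge frozen
        expanded.append(block)
        if (i + 1) % group_size == 0:
            new_block = copy.deepcopy(block)  # copy the preceding block
            # Zero the output projections so the residual branch contributes
            # nothing at first and the new block acts as an identity mapping.
            nn.init.zeros_(new_block.self_attn.o_proj.weight)
            nn.init.zeros_(new_block.mlp.down_proj.weight)
            new_block.requires_grad_(True)    # only expanded blocks are trained
            expanded.append(new_block)
    return nn.ModuleList(expanded)

# Hypothetical usage on a Hugging Face LLaMA model:
# model.model.layers = expand_blocks(model.model.layers, groups=8)
```

Because the copies start as identities, the expanded model reproduces the base model's outputs before any post-pretraining, which is what allows tuning on the new corpus without catastrophic forgetting.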
Language(s): English
