Prompt-aligned Gradient for Prompt Tuning
Open Access
Author(s)
Beier Zhu,
Yulei Niu,
Yucheng Han,
Yue Wu,
Hanwang Zhang
Publication year
2024
Abstract
Thanks to large pre-trained vision-language models (VLMs) like CLIP, we can craft a zero-shot classifier by "prompt", e.g., the confidence score of an image being "[CLASS]" can be obtained by using the VLM-provided similarity measure between the image and the prompt sentence "a photo of a [CLASS]". Therefore, prompt shows great potential for fast adaptation of VLMs to downstream tasks if we fine-tune the prompt-based similarity measure. However, we find a common failure that improper fine-tuning may undermine the prompt's inherent prediction not only for the task-related classes, but also for other classes in the VLM vocabulary. Existing methods still address this problem by using traditional anti-overfitting techniques such as early stopping and data augmentation, which lack a principled solution specific to prompt. We present Prompt-aligned Gradient, dubbed ProGrad, to prevent prompt tuning from forgetting the general knowledge learned from VLMs. In particular, ProGrad only updates the prompt whose gradient is aligned with (or non-conflicting with) the "general direction", which is represented as the gradient of the KL loss of the pre-defined prompt prediction. Extensive experiments demonstrate the stronger few-shot generalization ability of ProGrad over state-of-the-art prompt tuning methods. Codes are available at https://github.com/BeierZhu/Prompt-align.
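To make the update rule in the abstract concrete, below is a minimal PyTorch sketch of the gradient-alignment step: the task gradient (from the downstream cross-entropy loss) is kept as-is when it does not conflict with the "general direction" (the gradient of the KL loss toward the zero-shot, hand-crafted-prompt prediction), and otherwise the conflicting component is projected out. The function name `prograd_grad`, the toy linear map standing in for the frozen VLM head, and the small epsilon for numerical stability are illustrative assumptions, not the authors' released API; see the official repository linked above for the actual implementation.

```python
import torch
import torch.nn.functional as F

def prograd_grad(grad_task: torch.Tensor, grad_general: torch.Tensor) -> torch.Tensor:
    """Keep only the part of the task gradient that does not conflict with the
    'general direction' (gradient of the KL loss toward the zero-shot prediction)."""
    g_t, g_g = grad_task.flatten(), grad_general.flatten()
    if torch.dot(g_t, g_g) >= 0:
        # Aligned / non-conflicting: use the task gradient unchanged.
        return grad_task
    # Conflicting: project the task gradient onto the plane orthogonal to the
    # general direction, removing the component that fights the pre-trained knowledge.
    g = g_t - torch.dot(g_t, g_g) / (g_g.norm() ** 2 + 1e-12) * g_g
    return g.view_as(grad_task)

# Toy illustration (all tensors are stand-ins, not CLIP components):
prompt = torch.randn(4, requires_grad=True)   # learnable prompt context vector
W = torch.randn(3, 4)                         # frozen "VLM head" mapping prompt to class logits
logits_tuned = W @ prompt                     # prediction with the tuned prompt
logits_zero_shot = torch.randn(3)             # prediction with the hand-crafted prompt
label = torch.tensor(1)

ce_loss = F.cross_entropy(logits_tuned.unsqueeze(0), label.unsqueeze(0))
kl_loss = F.kl_div(logits_tuned.log_softmax(-1),
                   logits_zero_shot.softmax(-1), reduction="sum")

g_task = torch.autograd.grad(ce_loss, prompt, retain_graph=True)[0]
g_general = torch.autograd.grad(kl_loss, prompt)[0]
prompt.grad = prograd_grad(g_task, g_general)  # then call optimizer.step() as usual
```

In this reading, the check on the dot product implements "aligned or non-conflicting": when the two gradients disagree, only the component of the task gradient that is orthogonal to the general direction is applied, so the update never moves against the pre-trained prompt's prediction.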
Language(s)
English