
Open Access
Prompt-aligned Gradient for Prompt Tuning
Author(s)
Beier Zhu,
Yulei Niu,
Yucheng Han,
Yue Wu,
Hanwang Zhang
Publication year: 2024
Abstract
Thanks to large pre-trained vision-language models (VLMs) like CLIP, we can craft a zero-shot classifier by "prompt": e.g., the confidence score of an image being "[CLASS]" can be obtained from the VLM-provided similarity measure between the image and the prompt sentence "a photo of a [CLASS]". Prompts therefore show great potential for fast adaptation of VLMs to downstream tasks if we fine-tune the prompt-based similarity measure. However, we find a common failure: improper fine-tuning may undermine the prompt's inherent prediction not only for the task-related classes, but also for other classes in the VLM vocabulary. Existing methods still address this problem with traditional anti-overfitting techniques such as early stopping and data augmentation, which lack a principled solution specific to prompts. We present Prompt-aligned Gradient, dubbed ProGrad, to prevent prompt tuning from forgetting the general knowledge learned from VLMs. In particular, ProGrad only updates the prompt whose gradient is aligned with (or non-conflicting to) the "general direction", which is represented by the gradient of the KL loss of the pre-defined prompt prediction. Extensive experiments demonstrate the stronger few-shot generalization ability of ProGrad over state-of-the-art prompt tuning methods. Code is available at https://github.com/BeierZhu/Prompt-align.
Language(s): English
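
The gradient rule described in the abstract can be illustrated with a short sketch: take the few-shot task-loss gradient step only when it does not conflict with the gradient of the KL loss against the hand-crafted ("a photo of a [CLASS]") prompt prediction, and otherwise remove the conflicting component. This is a minimal PyTorch sketch under those assumptions, not the official implementation from the linked repository; the function name prograd_update, the learning rate, and the choice to project out (rather than simply skip) the conflicting component are illustrative assumptions.

import torch

def prograd_update(prompt, loss_ce, loss_kl, lr=0.002):
    # One Prompt-aligned Gradient step (illustrative sketch).
    # loss_ce: cross-entropy of the tuned prompt on the few-shot task.
    # loss_kl: KL divergence between the tuned-prompt prediction and the
    #          zero-shot "a photo of a [CLASS]" prediction; its gradient
    #          plays the role of the "general direction" in the abstract.
    g_task = torch.autograd.grad(loss_ce, prompt, retain_graph=True)[0]
    g_general = torch.autograd.grad(loss_kl, prompt)[0]

    dot = torch.dot(g_task.flatten(), g_general.flatten())
    if dot >= 0:
        # Aligned (non-conflicting): keep the task gradient unchanged.
        g = g_task
    else:
        # Conflicting: drop the component of the task gradient that points
        # against the general direction (project onto its orthogonal complement).
        g = g_task - dot / g_general.flatten().pow(2).sum() * g_general

    with torch.no_grad():
        prompt -= lr * g
    return prompt

In practice both losses would be computed from CLIP's image-text similarity logits, and the hand-crafted-prompt predictions could be cached since they do not depend on the learnable prompt vectors.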
