
Open Access
Make Prompts Adaptable: Bayesian Modeling for Vision-Language Prompt Learning with Data-Dependent Prior
Author(s):
Youngjae Cho,
HeeSun Bae,
Seungjae Shin,
Yeo Dong Youn,
Weonyoung Joo,
Il-Chul Moon
Publication year: 2024
Abstract
Recent Vision-Language Pretrained (VLP) models have become the backbone for many downstream tasks, but they are utilized as frozen models without further learning. Prompt learning improves a pre-trained VLP model by adding a learnable context vector to the inputs of the text encoder. In a few-shot learning scenario of the downstream task, MLE training can lead the context vector to over-fit dominant image features in the training data. This overfitting can harm the generalization ability, especially in the presence of a distribution shift between the training and test datasets. This paper presents a Bayesian framework for prompt learning, which alleviates the overfitting issue in few-shot learning applications and increases the adaptability of prompts to unseen instances. Specifically, modeling a data-dependent prior enhances the adaptability of text features to both seen and unseen image features without a performance trade-off between them. Based on the Bayesian framework, we utilize Wasserstein Gradient Flow in the estimation of our target posterior distribution, which enables our prompt to be flexible in capturing the complex modes of image features. We demonstrate the effectiveness of our method on benchmark datasets across several experiments, showing statistically significant performance improvements over existing methods. The code is available at https://github.com/youngjae-cho/APP.
Language(s): English
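
The abstract builds on prompt learning: a learnable context vector is prepended to the class-name token embeddings and passed through the frozen text encoder of a VLP model such as CLIP, with only the context being trained. Below is a minimal CoOp-style sketch of that setup, not the authors' APP implementation; the stand-in encoder, the tensor shapes, and the mean pooling are assumptions for illustration only.

```python
# Minimal sketch of prompt learning (CoOp-style); NOT the authors' APP code.
# A learnable context vector is prepended to frozen class-name embeddings and
# passed through a frozen text encoder; only the context is optimized.
import torch
import torch.nn as nn

class PromptLearner(nn.Module):
    def __init__(self, n_ctx: int, dim: int, class_embeds: torch.Tensor):
        super().__init__()
        # Learnable context: n_ctx "virtual word" vectors, shared across classes.
        self.ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)
        # Frozen class-name token embeddings: (n_classes, n_name_tokens, dim).
        self.register_buffer("class_embeds", class_embeds)

    def forward(self) -> torch.Tensor:
        n_classes = self.class_embeds.shape[0]
        ctx = self.ctx.unsqueeze(0).expand(n_classes, -1, -1)
        # Prompt = [learnable context tokens][class-name tokens]
        return torch.cat([ctx, self.class_embeds], dim=1)

# Toy usage with a frozen stand-in for the VLP text encoder (an assumption).
dim, n_classes, n_name_tokens = 512, 10, 4
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True), num_layers=2
)
for p in encoder.parameters():
    p.requires_grad_(False)  # the pre-trained text encoder stays frozen

learner = PromptLearner(n_ctx=16, dim=dim,
                        class_embeds=torch.randn(n_classes, n_name_tokens, dim))
prompts = learner()                        # (n_classes, 16 + 4, dim)
text_feats = encoder(prompts).mean(dim=1)  # one pooled text feature per class
```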
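The abstract also states that the posterior over prompts is estimated via Wasserstein Gradient Flow so that complex, multi-modal image features can be captured. The paper's estimator is not reproduced here; as a generic stand-in, the sketch below runs unadjusted Langevin dynamics, whose particle distribution follows the Wasserstein gradient flow of KL(q || p), on a toy two-mode target in place of a multi-modal prompt posterior.

```python
# Hedged illustration (not the paper's estimator): particles under unadjusted
# Langevin dynamics, whose marginal law follows the Wasserstein gradient flow
# of KL(q || p) toward a target p. Here p is a toy 2-D Gaussian mixture
# standing in for a multi-modal posterior over prompt vectors.
import torch

def log_prob(x: torch.Tensor) -> torch.Tensor:
    # Equal-weight mixture of two unit Gaussians at (-2, 0) and (2, 0);
    # the constant mixing weight is dropped since it does not affect gradients.
    comp1 = -0.5 * ((x - torch.tensor([-2.0, 0.0])) ** 2).sum(-1)
    comp2 = -0.5 * ((x - torch.tensor([2.0, 0.0])) ** 2).sum(-1)
    return torch.logsumexp(torch.stack([comp1, comp2], dim=-1), dim=-1)

particles = torch.randn(256, 2, requires_grad=True)  # initial particle cloud
step = 1e-2
for _ in range(2000):
    logp = log_prob(particles).sum()
    (grad,) = torch.autograd.grad(logp, particles)
    with torch.no_grad():
        # Langevin step: drift up the log-density plus Gaussian noise.
        noise = torch.randn_like(particles)
        particles += step * grad + (2 * step) ** 0.5 * noise
# `particles` now approximates samples covering both modes of the target.
```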
