Research Library

Open Access
Learning to Prompt Segment Anything Models
Author(s)
Jiaxing Huang,
Kai Jiang,
Jingyi Zhang,
Han Qiu,
Lewei Lu,
Shijian Lu,
Eric Xing
Publication year: 2024
Segment Anything Models (SAMs) like SEEM and SAM have demonstrated great potential in learning to segment anything. The core design of SAMs lies with Promptable Segmentation, which takes a handcrafted prompt as input and returns the expected segmentation mask. SAMs work with two types of prompts, spatial prompts (e.g., points) and semantic prompts (e.g., texts), which work together to prompt SAMs to segment anything on downstream datasets. Despite the important role of prompts, how to acquire suitable prompts for SAMs is largely under-explored. In this work, we examine the architecture of SAMs and identify two challenges for learning effective prompts for SAMs. To this end, we propose spatial-semantic prompt learning (SSPrompt), which learns effective semantic and spatial prompts for better SAMs. Specifically, SSPrompt introduces spatial prompt learning and semantic prompt learning, which optimize spatial prompts and semantic prompts directly over the embedding space and selectively leverage the knowledge encoded in pre-trained prompt encoders. Extensive experiments show that SSPrompt achieves superior image segmentation performance consistently across multiple widely adopted datasets.
Language(s): English
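The abstract describes the core idea at a high level: spatial and semantic prompts are optimized directly in the embedding space, and the knowledge in the frozen pre-trained prompt encoders is leveraged selectively. Below is a minimal, hypothetical PyTorch-style sketch of that idea. The class name, shapes, the stand-in frozen encoders, and the sigmoid-gated blending are assumptions for illustration, not the paper's exact formulation or the SAM/SEEM API.

```python
import torch
import torch.nn as nn

class SSPromptSketch(nn.Module):
    """Hypothetical sketch: learnable prompts blended with frozen prompt-encoder outputs."""

    def __init__(self, embed_dim=256, num_spatial=8, num_semantic=8,
                 frozen_spatial_encoder=None, frozen_semantic_encoder=None):
        super().__init__()
        # Prompts optimized directly over the embedding space.
        self.spatial_prompts = nn.Parameter(torch.randn(num_spatial, embed_dim) * 0.02)
        self.semantic_prompts = nn.Parameter(torch.randn(num_semantic, embed_dim) * 0.02)
        # Learnable per-prompt gates deciding how much pre-trained encoder
        # knowledge to keep (assumed blending scheme, not from the paper).
        self.spatial_gate = nn.Parameter(torch.zeros(num_spatial, 1))
        self.semantic_gate = nn.Parameter(torch.zeros(num_semantic, 1))
        # Stand-ins for the frozen pre-trained prompt encoders; in practice the
        # SAM/SEEM point and text encoders would be plugged in here and frozen.
        self.frozen_spatial_encoder = frozen_spatial_encoder or nn.Identity()
        self.frozen_semantic_encoder = frozen_semantic_encoder or nn.Identity()
        for enc in (self.frozen_spatial_encoder, self.frozen_semantic_encoder):
            for p in enc.parameters():
                p.requires_grad_(False)

    def forward(self, raw_spatial, raw_semantic):
        # raw_spatial / raw_semantic: handcrafted prompts, here assumed already
        # shaped (num_prompts, embed_dim) for the stand-in encoders.
        enc_spatial = self.frozen_spatial_encoder(raw_spatial)
        enc_semantic = self.frozen_semantic_encoder(raw_semantic)
        # Selectively combine learned prompt embeddings with encoder knowledge.
        w_sp = torch.sigmoid(self.spatial_gate)
        w_se = torch.sigmoid(self.semantic_gate)
        spatial = w_sp * enc_spatial + (1 - w_sp) * self.spatial_prompts
        semantic = w_se * enc_semantic + (1 - w_se) * self.semantic_prompts
        # These prompt embeddings would then be fed to the (frozen) mask decoder.
        return spatial, semantic


if __name__ == "__main__":
    model = SSPromptSketch(embed_dim=256, num_spatial=8, num_semantic=8)
    sp, se = model(torch.randn(8, 256), torch.randn(8, 256))
    print(sp.shape, se.shape)  # torch.Size([8, 256]) torch.Size([8, 256])
```

In this reading, only the prompt embeddings and gates are trained while the prompt encoders (and, presumably, the image encoder and mask decoder) stay frozen, which keeps the number of tuned parameters small.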
