
Towards Multimodal Vision-Language Models Generating Non-generic Text
Author(s) - Wes Robbins
Publication year - 2022
Publication title - Proceedings of the ... AAAI Conference on Artificial Intelligence
Language(s) - Uncategorized
Resource type - Journals
eISSN - 2374-3468
pISSN - 2159-5399
DOI - 10.1609/aaai.v36i11.21705
Subject(s) - closed captioning, computer science, artificial intelligence, natural language processing, focus (optics), context (archaeology), set (abstract data type), image (mathematics), language model, optical character recognition, training set, speech recognition, paleontology, physics, optics, biology, programming language