Referring expression comprehension model with matching detection and linguistic feedback

Open Access

Author(s) - Wang Jianming, Cui Enjie, Liu Kunliang, Sun Yukuan, Liang Jiayu, Yuan Chunmiao, Duan Xiaojie, Jin Guanghao, Chung TaeSun

Publication year - 2020

Publication title - IET Computer Vision

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.38

H-Index - 37

eISSN - 1751-9640

pISSN - 1751-9632

DOI - 10.1049/iet-cvi.2019.0483

Subject(s) - expression (computer science), laptop, computer science, artificial intelligence, image (mathematics), set (abstract data type), parsing, object (grammar), natural language processing, regular expression, table (database), object detection, computer vision, matching (statistics), natural language, pattern recognition (psychology), data mining, programming language, mathematics, statistics, operating system

The task of referring expression comprehension (REC) is to localise the image region of a specific object described by a natural language expression. Existing REC methods all assume that the object described by the referring expression is present in the given image. However, this assumption does not hold in some real applications. For example, a visually impaired user might tell their robot ‘please take the laptop on the table to me’ when, in fact, the laptop is no longer on the table. To address this problem, the authors propose a novel REC model that handles expression‐image mismatching and explains the mismatch through linguistic feedback. The model consists of four modules: the expression parsing module, the entity detection module, the relationship detection module, and the matching detection module. The authors built a data set called NP‐RefCOCO+ from RefCOCO+ that includes both positive and negative samples. The positive samples are the original expression‐image pairs in RefCOCO+; the negative samples are expression‐image pairs from RefCOCO+ whose expressions have been replaced. They evaluate the model on NP‐RefCOCO+, and the experimental results show the advantages of their method in dealing with expression‐image mismatching.
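The positive/negative construction described in the abstract can be illustrated with a minimal sketch. The abstract only states that negative samples have their expressions replaced; it does not say how the replacement is chosen. The rule used here (borrow an expression written for a different image) is an assumption, and the names `Sample` and `build_np_pairs` as well as the example data are hypothetical, not taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    image_id: str
    expression: str
    matches: bool  # True for an original (positive) pair, False for a negative pair


def build_np_pairs(pairs, seed=0):
    """Return every original pair plus, where possible, one negative per pair.

    `pairs` is a list of (image_id, expression) tuples.
    ASSUMPTION: a negative is made by borrowing an expression that was
    written for a different image; the paper may use another rule.
    """
    rng = random.Random(seed)
    samples = []
    for image_id, expression in pairs:
        # Positive sample: the original expression-image pair.
        samples.append(Sample(image_id, expression, True))
        # Candidate replacements come from other images and differ in wording.
        candidates = [
            e for other_id, e in pairs
            if other_id != image_id and e != expression
        ]
        if candidates:
            # Negative sample: same image, replaced expression.
            samples.append(Sample(image_id, rng.choice(candidates), False))
    return samples


# Hypothetical example data, for illustration only.
pairs = [("img_1", "laptop on the table"), ("img_2", "woman in red coat")]
for sample in build_np_pairs(pairs):
    print(sample)
```

A borrowed expression may by chance still describe something in the new image, so a real data set would need a check that the replaced expression truly mismatches; this sketch does not attempt that.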
