Adaptive Triplet Model for Fine-Grained Visual Categorization

Jingyun Liang, Jinlin Guo, Yanming Guo, Songyang Lao


Open Access


Author(s) - Jingyun Liang, Jinlin Guo, Yanming Guo, Songyang Lao

Publication year - 2018

Publication title - IEEE Access

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.587

H-Index - 127

ISSN - 2169-3536

DOI - 10.1109/access.2018.2884695

Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation

Fine-grained visual categorization aims to differentiate subcategories, such as species of birds, models of cars, and variants of aircraft. It often suffers from small inter-class variance and large intra-class variance. To keep dissimilar images far apart while preserving large intra-class variance, we propose an adaptive triplet model. First, images are batched as triplets and fed to a general convolutional network, which extracts convolutional image features. Then, we combine an adaptive triplet loss with a classification loss for multi-task training. The adaptive triplet loss pulls same-class embeddings together and pushes examples from different subcategories apart. During training, it adaptively assigns different weights to hard and easy examples. Unlike previous hard-mining mechanisms that discard all non-hard triplets, it can benefit from all informative examples. Moreover, a second-order distance function is proposed to capture local pairwise interactions of embeddings, making the distance measure more discriminative. The classification loss provides more direct supervision for training embeddings with category-specific concepts, and it makes category prediction more convenient and efficient at test time. Experiments demonstrate state-of-the-art results on three popular fine-grained datasets: CUB-200-2011, Stanford Cars, and FGVC-Aircraft. In addition, our network structure is relatively simple compared with previous methods, which often rely on multiple sub-networks and complex training mechanisms. It is also applicable to most up-to-date backbone networks, whereas other methods may be restricted to specific convolutional networks.
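The abstract does not give the loss formulas, so the following is only a minimal NumPy sketch of the general idea: a triplet loss whose per-triplet weights depend on difficulty, added to a classification loss. Everything specific here is an assumption for illustration and not the authors' formulation: the softmax weighting over margin violations, the `margin`, `temperature`, and `lam` parameters, and the function names are hypothetical. Plain squared Euclidean distance is swapped in for the paper's second-order distance function, which the abstract does not define.

```python
import numpy as np


def squared_distances(a, b):
    """Row-wise squared Euclidean distance between two (n, d) arrays.

    ASSUMPTION: stands in for the paper's second-order distance,
    whose exact form is not given in the abstract.
    """
    return np.sum((a - b) ** 2, axis=1)


def adaptive_triplet_loss(anchor, positive, negative, margin=1.0, temperature=1.0):
    """Weighted hinge triplet loss over a batch of triplets.

    ASSUMPTION: weights are a softmax over margin violations, so harder
    triplets get larger weights while no triplet is discarded outright.
    """
    violation = (
        squared_distances(anchor, positive)
        - squared_distances(anchor, negative)
        + margin
    )
    per_triplet = np.maximum(violation, 0.0)  # standard hinge term

    scaled = violation / temperature
    weights = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    weights /= weights.sum()

    return float(np.sum(weights * per_triplet)), weights


def cross_entropy(logits, labels):
    """Mean softmax cross-entropy for (n, c) logits and (n,) integer labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())


def multitask_loss(anchor, positive, negative, logits, labels, lam=1.0):
    """Triplet loss plus lam times classification loss (lam is hypothetical)."""
    triplet, _ = adaptive_triplet_loss(anchor, positive, negative)
    return triplet + lam * cross_entropy(logits, labels)
```

In this sketch a triplet that already satisfies the margin has a zero hinge term, so its weight only affects how the remaining weight is shared among violating triplets. Hard-mining schemes of the kind the abstract contrasts against would instead drop every triplet except the hardest ones.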
