Person re-identification across aerial and ground-based cameras by deep feature fusion
Author(s) - Arne Schumann, Jürgen Metzler
Publication year - 2017
Publication title - Proceedings of SPIE, the International Society for Optical Engineering
Language(s) - English
Resource type - Conference proceedings
SCImago Journal Rank - 0.192
H-Index - 176
eISSN - 1996-756X
pISSN - 0277-786X
DOI - 10.1117/12.2262295
Subject(s) - computer science, artificial intelligence, computer vision, identification (biology), sensor fusion, task (project management), aerial image, deep learning, matching (statistics), image (mathematics), engineering, biology, statistics, mathematics, systems engineering, botany
Person re-identification is the task of correctly matching visual appearances of the same person in image or video data while distinguishing the appearances of different persons. The traditional setup for re-identification is a network of fixed cameras. In recent years, however, mobile aerial cameras mounted on unmanned aerial vehicles (UAVs) have become increasingly useful for security and surveillance tasks. Aerial data differs in many characteristics from typical camera-network data, so re-identification approaches designed for a camera-network scenario can be expected to suffer a drop in accuracy when applied to aerial data. In this work, we investigate how well features that have been shown to give robust re-identification results in camera networks transfer to the task of re-identifying persons between a camera network and a mobile aerial camera. Specifically, we apply hand-crafted region covariance features and features extracted by convolutional neural networks that were trained on separate data, and we evaluate their suitability for this new and as yet unexplored scenario. We investigate common fusion methods for combining the hand-crafted and learned features and propose our own deep fusion approach, which is applied already during training of the deep network. We evaluate the features and fusion methods on our own dataset, which consists of fourteen people moving through a scene recorded by four fixed ground-based cameras and one mobile camera mounted on a small UAV. We discuss the strengths and weaknesses of the features in the new scenario and show that our fusion approach successfully leverages the strengths of each feature and significantly outperforms all single features.
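The record carries no code, so the following Python/NumPy fragment is only a rough, hypothetical sketch of the kind of pipeline the abstract describes: a hand-crafted region covariance descriptor computed from a person crop, combined with a learned embedding. The cnn_embed stand-in and the normalise-and-concatenate fusion are illustrative assumptions (a generic late-fusion baseline), not the paper's deep fusion approach, which the authors apply during network training.

import numpy as np

def region_covariance(patch):
    # Region covariance descriptor of a grayscale patch.
    # Per-pixel feature vector: (x, y, intensity, |dI/dx|, |dI/dy|).
    # The descriptor is the 5x5 covariance of these vectors, mapped to
    # a Euclidean vector via the matrix logarithm (log-Euclidean map)
    # so that plain distance metrics apply.
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    gy, gx = np.gradient(patch.astype(np.float64))
    feats = np.stack([xs.ravel(), ys.ravel(), patch.ravel(),
                      np.abs(gx).ravel(), np.abs(gy).ravel()], axis=0)
    cov = np.cov(feats)
    vals, vecs = np.linalg.eigh(cov + 1e-6 * np.eye(5))  # regularise
    log_cov = vecs @ np.diag(np.log(vals)) @ vecs.T
    return log_cov[np.triu_indices(5)]  # upper triangle as flat vector

def cnn_embed(patch, dim=128):
    # Hypothetical stand-in for a learned descriptor; in the paper this
    # would come from a CNN trained on separate re-identification data.
    rng = np.random.default_rng(int(patch.sum() * 1e6) % (2**32))
    return rng.standard_normal(dim)

def fuse(handcrafted, learned):
    # Simple feature-level fusion: L2-normalise each descriptor before
    # concatenating so that neither modality dominates the distance.
    a = handcrafted / (np.linalg.norm(handcrafted) + 1e-12)
    b = learned / (np.linalg.norm(learned) + 1e-12)
    return np.concatenate([a, b])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ground = rng.random((64, 32))  # stand-in crop from a fixed camera
    aerial = rng.random((64, 32))  # stand-in crop from the UAV camera
    fa = fuse(region_covariance(ground), cnn_embed(ground))
    fb = fuse(region_covariance(aerial), cnn_embed(aerial))
    print("fused descriptor distance:", np.linalg.norm(fa - fb))

In this sketch, matching would rank gallery crops by the fused-descriptor distance; the paper's contribution is to learn such a combination jointly inside the network rather than concatenating fixed descriptors after the fact.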