Multilabel Video Classification Model of Navigation Mark’s Lights Based on Deep Learning
Author(s) - Xu Han, Mingyang Pan, Haipeng Ge, Shaoxi Li, Jingfeng Hu, Lining Zhao, Yu Li
Publication year - 2021
Publication title - Computational Intelligence and Neuroscience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.605
H-Index - 52
eISSN - 1687-5273
pISSN - 1687-5265
DOI - 10.1155/2021/6794202
Subject(s) - computer science , artificial intelligence , flashing , set (abstract data type) , relevance (law) , computer vision , brightness , frame (networking) , pattern recognition (psychology) , telecommunications , materials science , physics , optics , political science , law , metallurgy , programming language
At night, buoys and other navigation marks disappear from view, replaced by fixed or flashing lights. Navigation marks are then seen as a set of lights in various colors rather than by their familiar outlines. Deciphering the meaning of these lights is a burden to navigators, and it is also a new and challenging research direction in intelligent sensing of the navigation environment. This study is the first to investigate the intelligent recognition of navigation marks' lights at night using multilabel video classification methods. To effectively capture the characteristics of navigation marks' lights, including both color and flashing phase, three multilabel classification models, based on binary relevance, label powerset, and an adapted algorithm, were investigated and compared. In experiments on a data set of 8,000 minutes of video, the model based on binary relevance, named NMLNet, achieved the highest accuracy, about 99.23%, in classifying 9 types of navigation marks' lights; it also had the fastest computation speed with the fewest network parameters. NMLNet has two branches, one for color classification and one for flashing classification. In the flashing branch, an improved MobileNet-v2 captures the brightness characteristics of the lights in each video frame, and an LSTM captures their temporal dynamics. Because the model is intended to run on mobile devices aboard vessels, MobileNet-v2 was used as the backbone; with the improvement of a spatial attention mechanism, it achieved accuracy near that of ResNet-50 while retaining its high speed.
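
A minimal sketch of such a two-branch architecture is given below, assuming PyTorch and torchvision. The class counts, feature dimensions, attention design, and head layouts are illustrative assumptions, not the paper's exact configuration; the abstract specifies only a MobileNet-v2 backbone improved with spatial attention, an LSTM for the temporal dynamics of the flashing, and separate color and flashing branches in the binary-relevance spirit.

# Illustrative sketch of a two-branch classifier in the spirit of NMLNet.
# Assumptions: PyTorch + torchvision; 4 color classes and 5 flashing-pattern
# classes are placeholders, as are the attention module and hidden sizes.
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2


class SpatialAttention(nn.Module):
    """Simple spatial attention: reweight each location by a learned mask."""
    def __init__(self):
        super().__init__()
        # 2-channel input: per-location mean and max over the channel axis
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        mask = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * mask


class NavLightNet(nn.Module):
    """Hypothetical two-branch model: one head predicts light color from
    per-frame features, the other feeds the frame sequence through an
    LSTM to classify the flashing pattern."""
    def __init__(self, num_colors=4, num_flash_patterns=5, hidden=256):
        super().__init__()
        self.backbone = mobilenet_v2(weights=None).features  # -> (B*T, 1280, H', W')
        self.attention = SpatialAttention()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.color_head = nn.Linear(1280, num_colors)
        self.lstm = nn.LSTM(1280, hidden, batch_first=True)
        self.flash_head = nn.Linear(hidden, num_flash_patterns)

    def forward(self, clip):
        # clip: (B, T, 3, H, W) video frames
        b, t, c, h, w = clip.shape
        feats = self.backbone(clip.view(b * t, c, h, w))
        feats = self.attention(feats)
        feats = self.pool(feats).flatten(1).view(b, t, -1)  # (B, T, 1280)
        color_logits = self.color_head(feats.mean(dim=1))   # color from averaged frames
        _, (h_n, _) = self.lstm(feats)                      # temporal flashing dynamics
        flash_logits = self.flash_head(h_n[-1])
        return color_logits, flash_logits


if __name__ == "__main__":
    model = NavLightNet()
    clip = torch.randn(2, 16, 3, 224, 224)  # 2 clips of 16 frames each
    color, flash = model(clip)
    print(color.shape, flash.shape)  # torch.Size([2, 4]) torch.Size([2, 5])

Training the two heads with independent losses (for example, one cross-entropy term per branch) reflects the binary-relevance approach the paper found most accurate, as opposed to a label-powerset model with a single head over all color-pattern combinations.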
