Multilabel Video Classification Model of Navigation Mark’s Lights Based on Deep Learning
Author(s) - Xu Han, Mingyang Pan, Haipeng Ge, Shaoxi Li, Jingfeng Hu, Lining Zhao, Yu Li
Publication year - 2021
Publication title - Computational Intelligence and Neuroscience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.605
H-Index - 52
eISSN - 1687-5273
pISSN - 1687-5265
DOI - 10.1155/2021/6794202
Subject(s) - computer science , artificial intelligence , flashing , set (abstract data type) , relevance (law) , computer vision , brightness , frame (networking) , pattern recognition (psychology) , telecommunications , materials science , physics , optics , political science , law , metallurgy , programming language
At night, buoys and other navigation marks disappear from view, replaced by fixed or flashing lights. Navigation marks are then seen as a set of lights in various colors rather than by their familiar outlines. Deciphering the meaning of these lights is a burden to navigators, and it is also a new and challenging research direction in intelligent sensing of the navigation environment. This study is the first to investigate the intelligent recognition of navigation marks' lights at night using multilabel video classification methods. To effectively capture the characteristics of navigation marks' lights, including both color and flashing phase, three multilabel classification models, based on binary relevance, label powerset, and an adapted algorithm, were investigated and compared. In experiments on a data set of 8,000 minutes of video, the model based on binary relevance, named NMLNet, achieved the highest accuracy, about 99.23%, in classifying 9 types of navigation marks' lights; it also had the fastest computation speed with the fewest network parameters. NMLNet has two branches, one for color classification and one for flashing classification. In the flashing branch, an improved MobileNet-v2 captures the brightness characteristics of the lights in each video frame, and an LSTM captures their temporal dynamics. Because the model is intended to run on mobile devices aboard vessels, MobileNet-v2 was used as the backbone; with the improvement of a spatial attention mechanism, it achieved accuracy near that of ResNet-50 while retaining its high speed.
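
A minimal sketch of such a two-branch architecture is given below, assuming PyTorch and torchvision. The class counts, feature dimensions, attention design, and head layouts are illustrative assumptions, not the paper's exact configuration; the abstract specifies only a MobileNet-v2 backbone improved with spatial attention, an LSTM for the temporal dynamics of the flashing, and separate color and flashing branches in the binary-relevance spirit.

# Illustrative sketch of a two-branch classifier in the spirit of NMLNet.
# Assumptions: PyTorch + torchvision; 4 color classes and 5 flashing-pattern
# classes are placeholders, as are the attention module and hidden sizes.
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2


class SpatialAttention(nn.Module):
    """Simple spatial attention: reweight each location by a learned mask."""
    def __init__(self):
        super().__init__()
        # 2-channel input: per-location mean and max over the channel axis
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        mask = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * mask


class NavLightNet(nn.Module):
    """Hypothetical two-branch model: one head predicts light color from
    per-frame features, the other feeds the frame sequence through an
    LSTM to classify the flashing pattern."""
    def __init__(self, num_colors=4, num_flash_patterns=5, hidden=256):
        super().__init__()
        self.backbone = mobilenet_v2(weights=None).features  # -> (B*T, 1280, H', W')
        self.attention = SpatialAttention()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.color_head = nn.Linear(1280, num_colors)
        self.lstm = nn.LSTM(1280, hidden, batch_first=True)
        self.flash_head = nn.Linear(hidden, num_flash_patterns)

    def forward(self, clip):
        # clip: (B, T, 3, H, W) video frames
        b, t, c, h, w = clip.shape
        feats = self.backbone(clip.view(b * t, c, h, w))
        feats = self.attention(feats)
        feats = self.pool(feats).flatten(1).view(b, t, -1)  # (B, T, 1280)
        color_logits = self.color_head(feats.mean(dim=1))   # color from averaged frames
        _, (h_n, _) = self.lstm(feats)                      # temporal flashing dynamics
        flash_logits = self.flash_head(h_n[-1])
        return color_logits, flash_logits


if __name__ == "__main__":
    model = NavLightNet()
    clip = torch.randn(2, 16, 3, 224, 224)  # 2 clips of 16 frames each
    color, flash = model(clip)
    print(color.shape, flash.shape)  # torch.Size([2, 4]) torch.Size([2, 5])

Training the two heads with independent losses (for example, one cross-entropy term per branch) reflects the binary-relevance approach the paper found most accurate, as opposed to a label-powerset model with a single head over all color-pattern combinations.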
