
TSFnet: a new two-stream fusion framework for scene text detection
Author(s) -
Ziheng Zhou,
Xuezhuan Zhao,
Lishen Pei,
Li Lao,
Jiahao Pan
Publication year - 2020
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1651/1/012178
Subject(s) - fuse (electrical), robustness (evolution), computer science, segmentation, artificial intelligence, merge (version control), pyramid (geometry), pattern recognition (psychology), fusion, data stream, regression, data mining, mathematics, statistics, telecommunications, biochemistry, chemistry, linguistics, geometry, philosophy, information retrieval, electrical engineering, gene, engineering
Scale changes and imbalanced class distributions reduce the robustness of scene text detection algorithms. To address this, we propose TSFnet, a new two-stream fusion framework. It consists of a Detection Stream, a Judge Stream, and a Merge output algorithm. In the Detection Stream, we propose a loss balance factor (LBF), which is used to optimize the region proposal network. The Regression-net and the Segmentation-net then predict the global text segmentation map and the corresponding coordinates and probability scores. In the Judge Stream, we use a feature pyramid network to extract the Judge map; during this process, the LBF is calculated to support the Detection Stream. Finally, we design a novel algorithm that fuses the outputs of the two streams to localize the precise position of the text. Extensive experiments are conducted on the ICDAR 2015 and ICDAR 2017 MLT standard datasets. The results demonstrate that the performance of the framework is comparable to that of state-of-the-art approaches.
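
The abstract does not specify how the Merge output algorithm combines the two streams, so the following is only an illustrative sketch of the general idea of two-stream fusion, not the authors' method. It assumes, hypothetically, that the Detection Stream yields scored boxes, that the Judge Stream yields a per-pixel map with values in [0, 1], and that the two are fused by multiplying each box score by the mean Judge map value inside the box. The names `mean_in_box`, `fuse`, and `threshold` are invented for this example.

```python
from typing import List, Tuple

# (x0, y0, x1, y1) in pixel coordinates, end-exclusive. Hypothetical format.
Box = Tuple[int, int, int, int]


def mean_in_box(judge_map: List[List[float]], box: Box) -> float:
    """Average the judge-map values over the pixels covered by the box."""
    x0, y0, x1, y1 = box
    values = [judge_map[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(values) / len(values) if values else 0.0


def fuse(
    detections: List[Tuple[Box, float]],
    judge_map: List[List[float]],
    threshold: float = 0.5,
) -> List[Tuple[Box, float]]:
    """Keep detections whose fused score reaches the threshold.

    Assumed fusion rule (not taken from the paper):
    fused score = detection score * mean judge-map value inside the box.
    """
    kept = []
    for box, score in detections:
        fused = score * mean_in_box(judge_map, box)
        if fused >= threshold:
            kept.append((box, fused))
    return kept


# Toy 4x4 judge map: the left half is marked as text, the right half is not.
judge_map = [[1.0, 1.0, 0.0, 0.0] for _ in range(4)]
detections = [((0, 0, 2, 2), 0.9), ((2, 0, 4, 2), 0.9)]
kept = fuse(detections, judge_map)
```

In this toy case the second box has a high detection score but lies in a region that the judge map rejects, so only the first box survives. A real fusion step would also need to handle overlapping boxes and non-rectangular text regions.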