
Text Detection in Trademark Images under Semi-Supervision
Author(s) -
Xiaoling Shi,
Shaozhi Wang,
Xiao Tan
Publication year - 2021
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1873/1/012014
Subject(s) - trademark, computer science, annotation, character (mathematics), construct (python library), artificial intelligence, detector, economic shortage, process (computing), image (mathematics), pattern recognition (psychology), natural language processing, linguistics, mathematics, telecommunications, government (linguistics), programming language, operating system, philosophy, geometry
Text in trademark images usually consists of individual characters, text lines, and other components. A detector that locates individual characters is therefore better suited to trademark images than one that detects text lines directly. However, training a character detector requires a large number of characters with position annotations, and constructing such a dataset is time-consuming. To overcome both the limitations of detectors trained on word-level annotations and the shortage of datasets with character-level annotations, we propose a text detection method that detects single characters in trademark images. The model is first pretrained on a synthetic image dataset with character-level annotations. The pretrained model is then run on an unannotated trademark image dataset to obtain additional annotations, which are used to retrain it. We also construct a trademark text dataset containing 4500 images. Experimental results on our dataset show that our method achieves state-of-the-art performance compared with other methods and that the detector can detect complicated text in trademark images.
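The pretrain, detect, retrain cycle described in the abstract can be illustrated with a minimal pseudo-labeling sketch. This is not the authors' implementation: the `Detection` type, the `Detector` callable, the `toy_detector` stub, and the `min_score` confidence threshold are all hypothetical, introduced only to show how detections on unannotated images might be filtered and merged with synthetic annotations before retraining. The abstract does not say how the additional annotations are selected, so confidence filtering is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# One character box as (x_min, y_min, x_max, y_max); an assumed format.
Box = Tuple[int, int, int, int]


@dataclass
class Detection:
    box: Box
    score: float  # detector confidence in [0, 1]


# Hypothetical detector interface: image id -> character detections.
Detector = Callable[[str], List[Detection]]


def select_pseudo_labels(
    detector: Detector,
    unlabeled_images: List[str],
    min_score: float = 0.9,  # assumed threshold, not taken from the paper
) -> Dict[str, List[Box]]:
    """Run a pretrained detector on unannotated images and keep confident boxes."""
    pseudo: Dict[str, List[Box]] = {}
    for image_id in unlabeled_images:
        boxes = [d.box for d in detector(image_id) if d.score >= min_score]
        if boxes:  # skip images with no confident character detections
            pseudo[image_id] = boxes
    return pseudo


def build_retraining_set(
    synthetic: Dict[str, List[Box]],
    pseudo: Dict[str, List[Box]],
) -> Dict[str, List[Box]]:
    """Merge synthetic annotations with pseudo-labels for the retraining step."""
    merged = dict(synthetic)
    merged.update(pseudo)
    return merged


def toy_detector(image_id: str) -> List[Detection]:
    """Stand-in for a detector pretrained on synthetic data (made-up outputs)."""
    fake_outputs = {
        "tm_001": [
            Detection((0, 0, 10, 10), 0.95),
            Detection((12, 0, 22, 10), 0.40),
        ],
        "tm_002": [Detection((5, 5, 15, 15), 0.30)],
    }
    return fake_outputs.get(image_id, [])


pseudo_labels = select_pseudo_labels(toy_detector, ["tm_001", "tm_002"])
retraining_set = build_retraining_set({"syn_001": [(1, 1, 9, 9)]}, pseudo_labels)
```

In this toy run only the confident box from `tm_001` is kept as a pseudo-label, and `tm_002` contributes nothing. In a real pipeline the merged set would be passed to whatever training routine the detector uses, and the cycle could be repeated.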