Automatic building of a large and complete dataset for image-based table structure recognition

Open Access

Author(s) - Trần Quang Vinh, AUTHOR_ID, Nguyen Thi Ngọc Diep, AUTHOR_ID

Publication year - 2021

Publication title - VNU Journal of Science: Computer Science and Communication Engineering (Tạp chí Khoa học Đại học Quốc gia Hà Nội: Công nghệ Thông tin - Truyền thông)

Language(s) - English

Resource type - Journals

eISSN - 2615-9260

pISSN - 2588-1086

DOI - 10.25073/2588-1086/vnucsce.293

Subject(s) - table (database), annotation, computer science, minimum bounding box, bounding overwatch, artificial intelligence, image (mathematics), correctness, sequence (biology), pattern recognition (psychology), representation (politics), automatic image annotation, information retrieval, data mining, image processing, algorithm, genetics, biology, politics, political science, law

Tables are one of the most common ways to represent structured data in documents. Existing research on image-based table structure recognition often relies on limited datasets, the largest of which, the ICDAR 19 Track B dataset, contains 3,789 human-labeled tables. The recent TableBank dataset for table structures contains 145K tables; however, its tables are labeled in an HTML tag sequence format, which impedes the development of image-based recognition methods. In this paper, we propose several processing methods that automatically convert an HTML tag sequence annotation into bounding box annotations for the table cells in a table image. By ensembling these methods, we converted 42,028 tables with high correctness, yielding a dataset 11 times larger than the largest existing one (ICDAR 19). We then demonstrate that with these bounding box annotations, a straightforward representation of objects in images, off-the-shelf deep learning models alone achieve much higher F1-scores for table structure recognition at many high IoU thresholds: an F1-score of 0.66 compared with the state-of-the-art 0.44 on the ICDAR 19 dataset. A further experiment shows that explicit bounding box annotation yields higher accuracy (70.6%) for image-based table structure recognition than implicit text sequence annotation (only 33.8%). The experiments show the effectiveness of our dataset, the largest to date, and open up opportunities to generalize to real-world applications. Our dataset and experimental models are publicly available at shorturl.at/hwHY3
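To make the phrase "F1-score at an IoU threshold" concrete, the following is a minimal Python sketch of how predicted cell boxes might be scored against ground-truth cell boxes. It is an illustrative simplification, not the evaluation protocol used in the paper or in the ICDAR 19 competition: the function names `iou` and `f1_at_iou`, the `(x_min, y_min, x_max, y_max)` box layout, and the greedy one-to-one matching are all assumptions made for this example.

```python
from typing import List, Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # boxes do not overlap
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def f1_at_iou(pred: List[Box], truth: List[Box], threshold: float) -> float:
    """F1-score of predicted cells under greedy one-to-one matching.

    A prediction counts as a true positive when its best still-unmatched
    ground-truth box overlaps it with IoU >= threshold (hypothetical rule
    chosen for illustration).
    """
    matched = set()
    true_positives = 0
    for p in pred:
        best_j, best_iou = -1, 0.0
        for j, t in enumerate(truth):
            if j in matched:
                continue
            value = iou(p, t)
            if value > best_iou:
                best_j, best_iou = j, value
        if best_j >= 0 and best_iou >= threshold:
            matched.add(best_j)
            true_positives += 1
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


# Two ground-truth cells; three predictions, one slightly off and one spurious.
truth_cells = [(0, 0, 10, 10), (10, 0, 20, 10)]
pred_cells = [(0, 0, 10, 10), (11, 0, 20, 10), (30, 30, 40, 40)]
print(f1_at_iou(pred_cells, truth_cells, 0.8))
print(f1_at_iou(pred_cells, truth_cells, 0.95))
```

Raising the threshold turns the slightly misaligned prediction from a hit into a miss, which is why scores reported at high IoU thresholds reward precisely localized cell boxes.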
