Digital Handling Procedure for Digitalizing and Indexing Ancient Manuscripts
Author(s) -
Setiawan Hadi,
Undang Ahmad Darsa,
Erick Paulus,
Mira Suryani,
Jean-Christophe Burie
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3621551
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
This paper presents a comprehensive digital handling procedure for the digitization and indexing of ancient Sundanese palm-leaf manuscripts. The proposed workflow integrates high-resolution image acquisition, glyph-level segmentation, expert-driven annotation, and TEI-compliant XML indexing to preserve the linguistic and cultural information embedded in these historical documents. A hybrid segmentation approach, combining manual and projection-based techniques, was employed, with expert-guided annotation ensuring label accuracy. To evaluate the framework, we constructed a benchmark dataset comprising 3,527 manually annotated glyphs from 10 manuscript pages. Experimental results demonstrate strong performance: segmentation accuracy of 90.23% with precision of 93.51% and recall of 96.27%, and inter-annotator agreement of 0.87 (Cohen’s Kappa), confirming annotation consistency. Furthermore, a baseline syllable recognition test using MobileNetV2 achieved 88.5% Top-1 accuracy, 94.2% Top-3 accuracy, and a macro F1-score of 0.87. Compared to conventional bounding-box segmentation, our polygon-based approach better handles irregular glyph contours, leading to more reliable OCR performance. These findings validate the technical feasibility and robustness of the proposed method, particularly for low-resource scripts with complex writing systems. Overall, this study contributes a reproducible workflow that not only supports cultural heritage preservation but also lays the groundwork for large-scale digital archives and machine-readable manuscript corpora.
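The abstract names a projection-based technique as one half of the hybrid segmentation but does not describe it. As a rough illustration only, the sketch below shows a generic vertical projection profile on a binarized text line: ink pixels are counted per column, and runs of columns containing ink are treated as candidate glyph spans. The function names, the `min_ink` threshold, and the toy image are assumptions made for this example and are not taken from the paper, whose actual pipeline also relies on manual, polygon-based segmentation.

```python
from typing import List, Tuple


def vertical_projection(binary: List[List[int]]) -> List[int]:
    """Count ink pixels (value 1) in each column of a binary image."""
    if not binary:
        return []
    width = len(binary[0])
    return [sum(row[x] for row in binary) for x in range(width)]


def split_on_gaps(profile: List[int], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, whose ink count is >= min_ink."""
    segments: List[Tuple[int, int]] = []
    start = None
    for x, count in enumerate(profile):
        if count >= min_ink and start is None:
            start = x  # a run of ink columns begins
        elif count < min_ink and start is not None:
            segments.append((start, x))  # the run ends at the first empty column
            start = None
    if start is not None:
        segments.append((start, len(profile)))  # run reaches the right edge
    return segments


# Hypothetical 3x10 binarized text line: 1 = ink, 0 = background.
line = [
    [0, 1, 1, 0, 0, 1, 0, 0, 1, 1],
    [0, 1, 0, 0, 0, 1, 0, 0, 0, 1],
    [0, 1, 1, 0, 0, 1, 0, 0, 1, 1],
]

profile = vertical_projection(line)
segments = split_on_gaps(profile)
print(profile)   # [0, 3, 2, 0, 0, 3, 0, 0, 2, 3]
print(segments)  # [(1, 3), (5, 6), (8, 10)]
```

A profile of this kind only separates glyphs that are divided by empty columns. Touching or overlapping glyphs, which are common on palm-leaf manuscripts, would need manual correction, which is consistent with the abstract's description of a hybrid manual and projection-based approach.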