Open Access
Digital Handling Procedure for Digitalizing and Indexing Ancient Manuscripts
Author(s) -
Setiawan Hadi,
Undang Ahmad Darsa,
Erick Paulus,
Mira Suryani,
Jean-Christophe Burie
Publication year - 2025
Publication title - IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3621551
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
This paper presents a comprehensive digital handling procedure for the digitization and indexing of ancient Sundanese palm-leaf manuscripts. The proposed workflow integrates high-resolution image acquisition, glyph-level segmentation, expert-driven annotation, and TEI-compliant XML indexing to preserve the linguistic and cultural information embedded in these historical documents. A hybrid segmentation approach, combining manual and projection-based techniques, was employed, with expert-guided annotation ensuring label accuracy. To evaluate the framework, we constructed a benchmark dataset comprising 3,527 manually annotated glyphs from 10 manuscript pages. Experimental results demonstrate strong performance: segmentation accuracy of 90.23% with precision of 93.51% and recall of 96.27%, and inter-annotator agreement of 0.87 (Cohen’s Kappa), confirming annotation consistency. Furthermore, a baseline syllable recognition test using MobileNetV2 achieved 88.5% Top-1 accuracy, 94.2% Top-3 accuracy, and a macro F1-score of 0.87. Compared to conventional bounding-box segmentation, our polygon-based approach better handles irregular glyph contours, leading to more reliable OCR performance. These findings validate the technical feasibility and robustness of the proposed method, particularly for low-resource scripts with complex writing systems. Overall, this study contributes a reproducible workflow that not only supports cultural heritage preservation but also lays the groundwork for large-scale digital archives and machine-readable manuscript corpora.
