Open Access
Turkish Optical Character Recognition Under the Lens: A Systematic Review of Language-Specific Challenges, Dataset Scarcity, and Open-Source Limitations
Author(s) -
Mirac Goksu Ozturk,
Durmus Ozkan Sahin,
Erdal Kilic
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3614147
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
This systematic literature review explores the progress, challenges, and opportunities in the field of Optical Character Recognition (OCR) for the Turkish language. Despite significant advancements, the development of robust Turkish OCR systems faces several obstacles, such as a lack of publicly available datasets, limited open-source solutions, and the underutilization of cutting-edge deep learning techniques. These challenges hinder the creation of OCR systems that can match the capabilities of those developed for languages like English. Focusing on 38 peer-reviewed studies published between 2019 and 2023, this paper provides the first systematic review of Turkish OCR research, offering a comprehensive analysis of the current methods, datasets, and evaluation metrics across both modern Turkish (Latin script) and Ottoman Turkish (Arabic script) contexts. Our findings highlight that while Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Convolutional Recurrent Neural Network (CRNN) architectures are frequently used, Transformer-based and end-to-end models remain underexplored in Turkish OCR. We also identify data scarcity and the lack of reproducible benchmark datasets as key barriers. By analyzing current research trends, pinpointing challenges, and emphasizing opportunities for future advancements, this review aims to be a valuable resource for researchers and practitioners in Turkish language text recognition. Our study contributes to the field by offering a structured overview of existing methods and proposing practical recommendations for improving dataset availability, encouraging open-source collaboration, and adopting more advanced model architectures.
