Open Access
Tabular Data Augmentation Using Artificial Intelligence: A Systematic Review and Taxonomic Framework
Author(s) - Mauro Henrique Lima De Boni, Iwens Gervasio Sene, Ronaldo Martins Da Costa
Publication year - 2025
Publication title - IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3593449
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
Context: Tabular data predominate in machine learning applications; however, data scarcity, class imbalance, and privacy-related constraints often impair model performance. AI-centric data-synthesis techniques have therefore been adopted to mitigate these challenges. Objective: To systematically map the state of the art in AI-driven tabular-data augmentation, identifying trends, methodological gaps, and best practices. Method: The review followed Kitchenham’s guidelines and covered the ACM Digital Library, Compendex, IEEE Xplore, ScienceDirect, and Scopus for the period 2020–2024. After deduplication and application of the inclusion and exclusion criteria, 55 primary studies were selected and analyzed with respect to eight research questions. Results: From the 55 studies, 210 quantitative results were extracted: 70.95% employed utility metrics, and only 7.14% assessed privacy. To organize this heterogeneous landscape, we propose a taxonomic framework (solution type × evaluation × metric) that highlights gaps and redundancies. Conclusions: Although research has progressed from conventional synthesizers to hybrid and novel architectures, metric standardization and systematic privacy assessments remain limited. Future work should address these gaps and apply fidelity and privacy metrics to underrepresented tasks such as regression problems and ultra-rare-class datasets.
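As a rough check on the proportions reported in the Results, the short Python sketch below converts the two percentages back into approximate counts out of the 210 extracted results. The counts themselves are not stated in the abstract; they are inferred here under the assumption that each percentage is a share of those 210 results rounded to two decimals. The helper name `implied_count` is hypothetical and used only for illustration.

```python
TOTAL_RESULTS = 210  # quantitative results extracted, as stated in the abstract


def implied_count(total: int, percent: float) -> int:
    """Nearest whole number of results matching a reported percentage.

    Assumption: the percentage is a share of `total`, rounded for reporting.
    """
    return round(total * percent / 100)


# Percentages as reported in the abstract.
utility_count = implied_count(TOTAL_RESULTS, 70.95)
privacy_count = implied_count(TOTAL_RESULTS, 7.14)

# Convert the inferred counts back to percentages to confirm they
# reproduce the reported figures at two-decimal precision.
utility_pct = round(utility_count / TOTAL_RESULTS * 100, 2)
privacy_pct = round(privacy_count / TOTAL_RESULTS * 100, 2)

print(utility_count, utility_pct)
print(privacy_count, privacy_pct)
```

Under that assumption, the inferred counts round back to the reported 70.95% and 7.14%, which suggests the percentages are internally consistent with a base of 210 results.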
