
Optimizing Machine Learning-Based Ovarian Cancer Prediction Through Normalization Strategies
Author(s) -
Roopashri Shetty,
Siddhant Gupta,
Vansh Mediratta,
Shwetha Rai,
M Geetha
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3590871
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
Ovarian cancer is one of the most challenging cancers to detect early, which often leads to poor survival rates. This study explores supervised and unsupervised machine learning and deep learning approaches to improve predictive performance using clinical and biomarker-based data scaled with two widely used techniques: Min-Max scaling and Z-Score normalization. The research begins by carefully preprocessing the dataset, including feature selection, to ensure high-quality inputs. Baseline classifiers, including K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), and Logistic Regression (LR), are evaluated on both scaled datasets. To further boost performance, ensemble methods such as Stacking, Bagging, and Gradient Boosting are incorporated. Additionally, the unsupervised clustering methods K-Means and DBSCAN are applied to explore subgroups within the ovarian cancer dataset. The effects of different feature selection techniques and the impact of standardization versus normalization are compared on both datasets. Min-Max normalization outperformed Z-Score normalization. The Stacking classifier achieved the highest accuracy, 100%, followed by SVM, Logistic Regression, and Bagging, each with an accuracy of 97%. Among the clustering techniques, DBSCAN outperformed K-Means with a Silhouette Score of 0.7245, and clustering performed better with Min-Max scaling than with Z-Score normalization. The findings highlight that a well-optimized combination of feature selection, ensemble learning, and clustering significantly enhances ovarian cancer prediction, providing a valuable foundation for early diagnosis and clinical decision support.
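To make the two scaling techniques named in the abstract concrete, the following is a minimal sketch using only the Python standard library. It is not the authors' code: the function names and the sample readings are hypothetical, and the use of the population standard deviation for Z-Score normalization is an assumption, since the abstract does not specify it. Min-Max scaling maps each value to (x - min) / (max - min), while Z-Score normalization maps it to (x - mean) / standard deviation.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Centre values on mean 0 and divide by the population standard deviation.

    Assumption: population (not sample) standard deviation is used here.
    """
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # A constant feature has zero deviation; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical biomarker readings (illustrative numbers, not from the study).
readings = [10.0, 20.0, 30.0, 40.0, 50.0]
scaled = min_max_scale(readings)      # bounded to the [0, 1] range
standardized = z_score(readings)      # mean 0, unit standard deviation
```

Min-Max output is bounded, which can suit distance-based methods such as KNN and the clustering algorithms mentioned above, whereas Z-Score output is unbounded but centred. Which one works better is an empirical question of the kind the study investigates.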