Open Access

Data Mining and Principal Component Analysis on Coimbra Breast Cancer Dataset

Author(s) - Anupam Sen

Publication year - 2021

Publication title - AIJR Proceedings

Language(s) - English

Resource type - Conference proceedings

ISSN - 2582-3922

DOI - 10.21467/proceedings.115.5

Subject(s) - boosting (machine learning), principal component analysis, artificial intelligence, computer science, mean squared error, statistic, machine learning, feature selection, feature extraction, cohen's kappa, gradient boosting, cross validation, pattern recognition (psychology), data mining, mathematics, statistics, random forest

Machine learning (ML) techniques play an important role in the medical field, where early diagnosis is required to improve the treatment of carcinoma. In this analysis, the Breast Cancer Coimbra Dataset (BCCD), with ten predictors, is analyzed to classify carcinoma. Feature selection methods and machine learning algorithms are applied to the dataset, obtained from the UCI repository, using the WEKA (“Waikato Environment for Knowledge Analysis”) tool. Principal Component Analysis (PCA) is used for feature extraction. Several classification algorithms are applied through WEKA, namely Glmnet, Gbm, ada Boosting, Adabag Boosting, C50, Cforest, DcSVM, fnn, Ksvm, and Node Harvest, and are compared on accuracy as well as on the Kappa statistic, Mean Absolute Error (MAE), and Root Mean Square Error (RMSE). The 10-fold cross-validation method is used for training, testing, and validation.
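
The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's WEKA workflow: the function names, the toy labels, and the probabilities are all hypothetical, and it assumes a binary class with MAE and RMSE computed on the predicted probability of the positive class.

```python
import math
import random
from collections import Counter


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k test folds
    whose sizes differ by at most one."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal label frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return 0.0  # kappa is undefined here; this sketch reports 0.0
    return (p_o - p_e) / (1.0 - p_e)


def mae(y_true, prob_pos):
    """Mean absolute error between 0/1 labels and P(class = 1)."""
    return sum(abs(t - p) for t, p in zip(y_true, prob_pos)) / len(y_true)


def rmse(y_true, prob_pos):
    """Root mean square error between 0/1 labels and P(class = 1)."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, prob_pos)) / len(y_true)
    )


if __name__ == "__main__":
    # Hypothetical toy predictions, not results from the paper.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
    prob_pos = [0.9, 0.8, 0.4, 0.7, 0.2, 0.6, 0.1, 0.3, 0.6, 0.4]
    y_pred = [1 if p >= 0.5 else 0 for p in prob_pos]

    print("accuracy:", accuracy(y_true, y_pred))
    print("kappa:   ", cohen_kappa(y_true, y_pred))
    print("MAE:     ", mae(y_true, prob_pos))
    print("RMSE:    ", rmse(y_true, prob_pos))

    # Each fold serves once as the test set; the rest form the training set.
    for test_idx in k_fold_indices(len(y_true), k=5):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y_true)) if i not in held_out]
        print("train:", sorted(train_idx), "test:", sorted(test_idx))
```

In a full 10-fold run, a classifier would be fitted on each training split and these measures computed on the pooled held-out predictions. The sketch leaves out model fitting and the PCA step.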
