Learning from Data: Cleft Lip and Palate Patients in the West Coast of Sabah

Zaturrawiah Ali Omar, Su Na Chin, Norhafiza Hamzah, Fouziah Md Yassin

Open Access

Publication year - 2019

Publication title - Journal of Physics: Conference Series

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.21

H-Index - 85

eISSN - 1742-6596

pISSN - 1742-6588

DOI - 10.1088/1742-6596/1358/1/012063

Subject(s) - cluster analysis, resampling, computer science, random forest, artificial intelligence, euclidean distance, sample (material), class (philosophy), feature selection, pattern recognition (psychology), machine learning, data mining, chromatography, chemistry

Analysing data can be challenging because of the nature of the data and the wide range of methods and techniques that can be applied to it. In this study, a six-year dataset on Cleft Lip and Palate patients’ conditions was gathered in a quest to identify the factors that contribute to a successful pre-graft orthodontic treatment. The challenges faced were the small size of the dataset and the imbalanced sample classes. This study therefore took a step back and approached the dataset with a combination of unsupervised and supervised learning methods, incorporating clustering to create testing records and resampling to balance the sample classes. We also examined whether the automatically created testing records could replace the manually selected testing records by comparing the performance of the classification models. Based on the feature that was selected, k-Means and PAM were implemented as the clustering algorithms, with Euclidean distance as the distance measure. Resampling was done using SMOTE, and Random Forest was used as the classification model. When the models were compared, those fed with resampled training records showed an increase in AUC values and a decrease in OOB error. Comparable results were also achieved between the training records produced by PAM and those produced by manual selection, as both models were classified as excellent classification models based on their AUC values.
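
As a rough illustration of two of the steps named in the abstract, the sketch below shows a PAM-style (k-medoids) clustering with Euclidean distance and a SMOTE-style oversampling of a minority class. It is a minimal standard-library Python sketch under stated assumptions, not the authors' implementation: the function names `euclidean`, `pam` and `smote`, the parameters `k`, `n_new` and `seed`, and the toy data are all hypothetical, and the Random Forest step is left out.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pam(points, k, seed=0):
    """Greedy swap-based k-medoids (PAM-style).

    Returns (medoid indices, cluster label per point). Hypothetical helper.
    """
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)

    def cost(meds):
        # Total distance from every point to its nearest medoid.
        return sum(min(euclidean(p, points[m]) for m in meds) for p in points)

    best = cost(medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                # Try replacing medoid i with a non-medoid point.
                trial = medoids[:i] + [cand] + medoids[i + 1:]
                c = cost(trial)
                if c < best - 1e-12:
                    medoids, best, improved = trial, c, True
    labels = [min(range(k), key=lambda j: euclidean(p, points[medoids[j]]))
              for p in points]
    return medoids, labels


def smote(minority, n_new, k=3, seed=0):
    """SMOTE-style oversampling (needs at least two minority samples).

    Each synthetic sample lies on the segment between a minority sample
    and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: euclidean(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


if __name__ == "__main__":
    # Toy data only; not taken from the study.
    records = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
               [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]]
    medoid_idx, cluster_labels = pam(records, k=2)
    print("medoid indices:", medoid_idx)
    print("cluster labels:", cluster_labels)
    print("synthetic minority samples:", smote(records[:3], n_new=2, k=2))
```

Unlike k-Means centroids, medoids are actual records from the data. How the clusters were turned into testing records in the study is not stated in the abstract, so the sketch stops at the cluster assignment.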
