Open Access
Finding Biomarkers from a High-Dimensional Imbalanced Dataset Using the Hybrid Method of Random Undersampling and Lasso
Author(s) - Masithoh Yessi Rochayani, Umu Sa’adah, Ani Budi Astuti
Publication year - 2020
Publication title - comtech/comtech
Language(s) - English
Resource type - Journals
eISSN - 2476-907X
pISSN - 2087-1244
DOI - 10.21512/comtech.v11i2.6452
Subject(s) - undersampling, computer science, lasso (programming language), selection (genetic algorithm), gene selection, class (philosophy), high dimensional, artificial intelligence, machine learning, gene, biology, gene expression, genetics, microarray analysis techniques, world wide web
The research conducted undersampling and gene selection as a starting point for cancer classification in gene expression datasets that are high-dimensional and class-imbalanced. It investigated whether applying undersampling before gene selection gave better results than gene selection alone. The undersampling method used was Random Undersampling (RUS), and the gene selection method was Lasso. The selected genes were then validated against theory. To explore the effectiveness of applying RUS before gene selection, the researchers used two gene expression datasets. Both datasets consisted of two classes, 1,545 observations, and 10,935 genes, but they had different imbalance ratios. The results show that the proposed gene selection methods, namely Lasso and RUS + Lasso, can identify several important biomarkers, and the resulting models have high accuracy. However, the models are complicated because they involve too many genes. The research also finds that undersampling has no effect when it is applied to a less imbalanced dataset. Meanwhile, when the dataset is highly imbalanced, undersampling can remove a lot of information from the majority class. Overall, the effectiveness of undersampling remains unclear. Simulation studies can be carried out in future research to investigate when undersampling should be applied.
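The RUS + Lasso idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not state which software or settings were used, so the synthetic data, the function names (`random_undersample`, `lasso_logistic`), the penalty `lam`, the step size, and the choice of an L1-penalized logistic model fitted by proximal gradient descent are all assumptions made for illustration. In practice, an established statistics library would likely be used instead of this hand-written solver.

```python
import math
import random

def random_undersample(X, y, rng):
    """Randomly drop majority-class rows until both classes have equal size."""
    idx0 = [i for i, label in enumerate(y) if label == 0]
    idx1 = [i for i, label in enumerate(y) if label == 1]
    minority, majority = sorted((idx0, idx1), key=len)
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [X[i] for i in kept], [y[i] for i in kept]

def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def lasso_logistic(X, y, lam, step=0.1, n_iter=300):
    """L1-penalized logistic regression via proximal gradient descent (assumed solver)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b  # the intercept is not penalized
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b

# Hypothetical synthetic data: 200 majority and 20 minority samples, 10 "genes".
# Only genes 0 and 1 differ between the classes.
rng = random.Random(0)
n_genes = 10
X, y = [], []
for label, count in ((0, 200), (1, 20)):
    for _ in range(count):
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        if label == 1:
            row[0] += 2.0
            row[1] -= 2.0
        X.append(row)
        y.append(label)

# Lasso alone, on the imbalanced data.
w_full, _ = lasso_logistic(X, y, lam=0.1)
selected_full = [j for j, wj in enumerate(w_full) if wj != 0.0]

# RUS + Lasso: balance the classes first, then select genes.
X_bal, y_bal = random_undersample(X, y, rng)
w_rus, _ = lasso_logistic(X_bal, y_bal, lam=0.1)
selected_rus = [j for j, wj in enumerate(w_rus) if wj != 0.0]

print("Lasso selected genes:      ", selected_full)
print("RUS + Lasso selected genes:", selected_rus)
```

Genes whose coefficients are shrunk exactly to zero are discarded, and the remaining indices are the candidate biomarkers. Note that the undersampled fit uses only 40 of the 220 samples, which mirrors the information loss the abstract warns about for highly imbalanced data.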