Improving data splitting for classification applications in spectrochemical analyses employing a random-mutation Kennard-Stone algorithm approach


Open Access


Author(s) - Camilo L. M. Morais, Marfran C. D. Santos, Kássio M. G. Lima, Francis L. Martin

Publication year - 2019

Publication title - Bioinformatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.599

H-Index - 390

eISSN - 1367-4811

pISSN - 1367-4803

DOI - 10.1093/bioinformatics/btz421

Subject(s) - algorithm, principal component analysis, linear discriminant analysis, computer science, set (abstract data type), euclidean distance, data set, sample (material), data mining, artificial intelligence, pattern recognition (psychology), mathematics, chemistry, chromatography, programming language

Data splitting is a fundamental step in building classification models with spectral data, especially in biomedical applications. It is performed after pre-processing and before model construction, and consists of dividing the samples into at least a training set and a test set: the training set is used for model construction and the test set for model validation. Some of the most widely used methodologies for data splitting are random selection (RS) and the Kennard-Stone (KS) algorithm; the former splits the samples at random, whereas the latter is based on the Euclidean distances between the samples. We propose the Morais-Lima-Martin (MLM) algorithm as an alternative method to improve data splitting in classification models. MLM modifies the KS algorithm by adding a random-mutation factor.
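To make the idea concrete, the following is a minimal Python sketch of a Kennard-Stone split followed by a random-mutation step. It is an illustration under stated assumptions, not the authors' implementation: the function names `kennard_stone` and `random_mutation_split` are hypothetical, and the abstract does not say how the mutation is applied, so the sketch assumes that a fraction `mutation` of the training samples is swapped at random with test samples.

```python
import numpy as np


def kennard_stone(X, n_train):
    """Select n_train row indices of X with the Kennard-Stone procedure.

    Returns (train_idx, test_idx) as integer arrays.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and the number of samples")

    # Pairwise Euclidean distances between all samples.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # Start with the two samples that are farthest apart.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    if i == j:  # all samples identical: fall back to any distinct pair
        j = (i + 1) % n
    selected = [int(i), int(j)]
    remaining = [k for k in range(n) if k not in selected]

    while len(selected) < n_train:
        # Distance from each remaining sample to its nearest selected sample.
        nearest = dist[np.ix_(remaining, selected)].min(axis=1)
        # Add the remaining sample that is farthest from the selected set.
        pick = remaining[int(np.argmax(nearest))]
        selected.append(pick)
        remaining.remove(pick)

    return np.array(selected, dtype=int), np.array(remaining, dtype=int)


def random_mutation_split(X, n_train, mutation=0.1, seed=None):
    """Kennard-Stone split followed by a random swap between the two sets.

    ASSUMPTION: the mutation is modelled as swapping a fraction `mutation`
    of the training samples with randomly chosen test samples. The source
    abstract does not specify this mechanism.
    """
    train, test = kennard_stone(X, n_train)
    rng = np.random.default_rng(seed)
    n_swap = min(int(round(mutation * len(train))), len(test))
    if n_swap > 0:
        ti = rng.choice(len(train), size=n_swap, replace=False)
        si = rng.choice(len(test), size=n_swap, replace=False)
        train[ti], test[si] = test[si].copy(), train[ti].copy()
    return train, test
```

For classification data, one option is to call `random_mutation_split` separately on the samples of each class, so that every class is represented in both the training and the test set; this too is an assumption rather than something the abstract states.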
