Open Access

Reducing false positives in molecular pattern recognition.

Author(s) - Xijin Ge, Shuichi Tsutsumi, Hiroyuki Aburatani, Shuichi Iwata

Publication year - 2003

Publication title - Genome Informatics. International Conference on Genome Informatics

Language(s) - English

DOI - 10.11234/gi1990.14.34

In the search for new cancer subtypes by gene expression profiling, it is essential to avoid misclassifying samples of unknown subtypes as known ones. In this paper, we evaluated the false positive error rates of several classification algorithms through a 'null test', in which classifiers were presented with a large collection of independent samples that do not belong to any of the tumor types in the training dataset. The benchmark dataset is available at www2.genome.rcast.u-tokyo.ac.jp/pm/. We found that k-nearest neighbor (KNN) and support vector machine (SVM) classifiers have very high false positive error rates when only a small number of genes (<100) is used in prediction. Including more genes reduces the error rate only partially. In contrast, the prototype matching (PM) method has a much lower false positive error rate, and this robustness can be achieved without loss of sensitivity by introducing suitable measures of prediction confidence. We also proposed a cluster-and-select technique to select genes for classification. The nonparametric Kruskal-Wallis H test is first used to select genes that are differentially expressed across multiple tumor types. To reduce redundancy, these genes are then divided into clusters with similar expression patterns, and a given number of genes is selected from each cluster. The reliability of the new algorithm was tested on three public datasets.
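
The two ideas in the abstract, cluster-and-select gene selection and the 'null test', can be illustrated with a minimal Python sketch. The abstract does not specify the clustering procedure, the similarity measure used by prototype matching, or the confidence measure, so everything below is an assumption made for illustration: genes are grouped greedily by Pearson correlation with a seed gene, a sample is matched to class prototypes by Pearson correlation, and the confidence rule is a simple correlation threshold. All function and parameter names (`cluster_and_select`, `min_corr`, `per_cluster`, and so on) are hypothetical. Only the Kruskal-Wallis H statistic follows its standard definition (here without tie correction).

```python
import math
from statistics import mean

def average_ranks(values):
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_h(values, labels):
    """Kruskal-Wallis H statistic for one gene (no tie correction)."""
    ranks = average_ranks(values)
    n = len(values)
    sums, counts = {}, {}
    for r, lab in zip(ranks, labels):
        sums[lab] = sums.get(lab, 0.0) + r
        counts[lab] = counts.get(lab, 0) + 1
    total = sum(sums[lab] ** 2 / counts[lab] for lab in sums)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0

def cluster_and_select(expr, labels, n_candidates, min_corr, per_cluster):
    """expr maps gene name -> expression values (one per training sample).

    Assumed procedure: keep the n_candidates genes with the largest H,
    greedily group genes whose profile correlates with a cluster's seed
    gene, then keep the top per_cluster genes of each cluster.
    """
    ranked = sorted(expr, key=lambda g: kruskal_h(expr[g], labels), reverse=True)
    clusters = []  # each cluster is a list of genes; the first one is the seed
    for g in ranked[:n_candidates]:
        for c in clusters:
            if pearson(expr[g], expr[c[0]]) >= min_corr:
                c.append(g)
                break
        else:
            clusters.append([g])
    return [g for c in clusters for g in c[:per_cluster]]

def prototype_predict(sample, prototypes, min_corr):
    """Return the best-matching class, or None if confidence is too low."""
    best, score = max(
        ((c, pearson(sample, p)) for c, p in prototypes.items()),
        key=lambda t: t[1],
    )
    return best if score >= min_corr else None

def false_positive_rate(null_samples, prototypes, min_corr):
    """Null test: fraction of out-of-class samples assigned to any class."""
    hits = sum(
        prototype_predict(s, prototypes, min_corr) is not None
        for s in null_samples
    )
    return hits / len(null_samples)
```

In this sketch, setting `min_corr` to -1.0 forces every sample into some class, which mimics a classifier with no rejection option; raising the threshold lets the classifier decline to label samples that resemble none of the prototypes, which is the behavior the null test is meant to measure.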
