Enriched random forests

Open Access

Author(s) - Dhammika Amaratunga, Javier Cabrera, Yung-Seop Lee

Publication year - 2008

Publication title - Bioinformatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.599

H-Index - 390

eISSN - 1367-4811

pISSN - 1367-4803

DOI - 10.1093/bioinformatics/btn356

Subject(s) - random forest, computer science, simple random sample, sampling (signal processing), simple (philosophy), data mining, node (physics), systematic sampling, artificial intelligence, pattern recognition (psychology), machine learning, statistics, mathematics, population, philosophy, demography, structural engineering, filter (signal processing), epistemology, sociology, engineering, computer vision

Although the random forest classification procedure works well in datasets with many features, when the number of features is huge and the percentage of truly informative features is small, such as with DNA microarray data, its performance tends to decline significantly. In such instances, the procedure can be improved by reducing the contribution of trees whose nodes are populated by non-informative features. To some extent, this can be achieved by prefiltering, but we propose a novel, yet simple, adjustment that has demonstrably superior performance: choose the eligible subsets at each node by weighted random sampling instead of simple random sampling, with the weights tilted in favor of the informative features. This results in an 'enriched random forest'. We illustrate the superior performance of this procedure in several actual microarray datasets.
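The core adjustment described above, drawing the eligible feature subset at each node by weighted rather than simple random sampling, can be illustrated with a minimal Python sketch. The abstract does not say how the weights are computed, so the weights below are hypothetical placeholders, and the names `sample_eligible_features` and `mtry` are assumptions made for illustration rather than the authors' implementation.

```python
import random


def sample_eligible_features(weights, mtry, rng=None):
    """Draw `mtry` distinct feature indices with probability proportional
    to `weights` (weighted sampling without replacement).

    Passing equal weights reduces this to simple random sampling, which is
    what a standard random forest uses at each node.
    """
    if rng is None:
        rng = random.Random()
    # Features with zero weight are never eligible.
    remaining = [i for i, w in enumerate(weights) if w > 0]
    if mtry > len(remaining):
        raise ValueError("mtry exceeds the number of features with positive weight")
    remaining_w = [weights[i] for i in remaining]

    chosen = []
    for _ in range(mtry):
        # Pick one position among the remaining features, then remove it
        # so the same feature cannot be drawn twice for this node.
        pos = rng.choices(range(len(remaining)), weights=remaining_w, k=1)[0]
        chosen.append(remaining.pop(pos))
        remaining_w.pop(pos)
    return chosen


if __name__ == "__main__":
    rng = random.Random(0)
    n_features, n_informative, mtry = 1000, 10, 30
    # Hypothetical weights: the first 10 features are treated as informative
    # and given a larger weight. How to derive real weights is not covered here.
    enriched = [50.0] * n_informative + [1.0] * (n_features - n_informative)
    uniform = [1.0] * n_features

    for label, w in (("simple", uniform), ("weighted", enriched)):
        subset = sample_eligible_features(w, mtry, rng)
        hits = sum(1 for i in subset if i < n_informative)
        print(label, "sampling: informative features in subset =", hits)
```

Under these assumed weights, the weighted draw tends to place far more of the informative features in each node's eligible subset than the uniform draw does, which is the mechanism the abstract credits for reducing the contribution of trees built on non-informative features.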
