
The self‐assessment trap: can we all be better than average?
Author(s) -
Raquel Norel,
John Jeremy Rice,
Gustavo Stolovitzky
Publication year - 2011
Publication title -
Molecular Systems Biology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 8.523
H-Index - 148
ISSN - 1744-4292
DOI - 10.1038/msb.2011.70
Subject(s) - biology, computational biology
Mol Syst Biol. 7: 537

Computational systems biology seems to be caught in what we call the ‘self‐assessment trap’, in which researchers wishing to publish their analytical methods are required by referees or by editorial policy (e.g., Bioinformatics, BMC Bioinformatics, Nucleic Acids Research) to compare the performance of their own algorithms against other methodologies, thus being forced to be judge, jury and executioner. The result is that the authors’ method tends to be the best in an unreasonable majority of cases (Table I). In many instances, this bias is the result of selective reporting of performance in the niche in which the method is superior. Evidence of this is that most papers reporting best performance choose only one or two metrics of performance, but when the number of performance metrics is larger than two, most methods fail to be the best in all categories assessed (Table I). Choosing many metrics can dramatically change the determination of best performance (Supplementary Table S1). Selective reporting can be inadvertent, but in some cases the biases are more disingenuous, involving hiding information or quietly cutting corners in the performance evaluation (similar problems have been discussed in assessments of the performance of supercomputers, e.g., Bailey (1991)).

Table I. Breakdown of the 57 surveyed papers in which the authors assess their own methods

Even assuming that there is no selective reporting, we would like to argue that papers reporting good‐yet‐not‐the‐best methods (of which we found none in our literature survey of self‐assessed papers …
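The point that the apparent ‘best’ method can flip once more performance metrics are considered can be illustrated with a small, purely hypothetical sketch. The method names, metric names and scores below are invented for illustration only; they are not data from the surveyed papers, and average-rank aggregation is just one of many reasonable ways to combine metrics.

```python
# Purely hypothetical illustration: which method looks "best" depends on
# which performance metrics are reported. All scores are invented.

# Hypothetical scores for three methods on four common metrics
# (higher is better for every metric in this toy example).
scores = {
    "MethodA": {"AUROC": 0.91, "AUPR": 0.42, "F1": 0.55, "MCC": 0.38},
    "MethodB": {"AUROC": 0.88, "AUPR": 0.51, "F1": 0.60, "MCC": 0.47},
    "MethodC": {"AUROC": 0.89, "AUPR": 0.48, "F1": 0.58, "MCC": 0.44},
}

def best_by(metrics):
    """Return the method with the best (lowest) average rank across the given metrics."""
    rank_sums = {method: 0 for method in scores}
    for metric in metrics:
        ordered = sorted(scores, key=lambda m: scores[m][metric], reverse=True)
        for position, method in enumerate(ordered, start=1):
            rank_sums[method] += position
    return min(rank_sums, key=rank_sums.get)

# Reporting a single favourable metric makes MethodA look best...
print(best_by(["AUROC"]))                       # -> MethodA
# ...but aggregating over several metrics picks a different winner.
print(best_by(["AUROC", "AUPR", "F1", "MCC"]))  # -> MethodB
```

In this toy setting, the method that tops the single reported metric is no longer best once three additional metrics enter the comparison, which is the kind of reversal the commentary attributes to selective reporting.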