Open Access

Don’t split your data

Author(s) - Henrik Källberg, Lars Alfredsson, Maria Feychting, Anders Ahlbom

Publication year - 2010

Publication title - European Journal of Epidemiology

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.825

H-Index - 111

eISSN - 1573-7284

pISSN - 0393-2990

DOI - 10.1007/s10654-010-9447-3

Subject(s) - medicine, data set, bayesian probability, perspective (graphical), sample (material), set (abstract data type), statistics, data mining, artificial intelligence, mathematics, computer science, chemistry, chromatography, programming language

False positive findings are a common problem in whole-genome association studies. In this commentary we show that nothing is gained by randomly splitting a data sample into two equal-sized subsets, where the first subset is used for exploratory purposes and the second is used to confirm the findings from the first. We compare the random splitting procedure with analysis of the full data sample, using a Bayesian perspective that takes into account the prior probability of a false positive finding.
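The comparison described in the abstract can be illustrated with a small numerical sketch. This is not the authors' calculation. It assumes a one-sided z-test on a standardized effect, independent halves, and illustrative values for the effect size, sample size, per-stage significance level, and prior probability of the null. All function and parameter names are hypothetical. The split design requires significance in both halves, so its overall type I error is the square of the per-stage level; the full-sample test is run at that same overall level so the two designs are compared fairly.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def power_one_sided(effect, n, alpha):
    """Power of a one-sided z-test for a standardized effect with n observations."""
    z_crit = _STD.inv_cdf(1.0 - alpha)
    return 1.0 - _STD.cdf(z_crit - effect * n ** 0.5)


def false_positive_probability(alpha, power, prior_null):
    """P(null is true | significant result), by Bayes' theorem."""
    false_pos = alpha * prior_null
    true_pos = power * (1.0 - prior_null)
    return false_pos / (false_pos + true_pos)


def compare(effect=0.1, n=2000, alpha_stage=0.01, prior_null=0.999):
    """Compare a split (explore, then confirm) design with a full-sample test.

    All default values are assumptions chosen for illustration only.
    """
    # Split design: the finding must be significant in both independent halves.
    power_half = power_one_sided(effect, n // 2, alpha_stage)
    alpha_split = alpha_stage ** 2
    power_split = power_half ** 2

    # Full design: one test on all n observations at the same overall alpha.
    power_full = power_one_sided(effect, n, alpha_split)

    return {
        "power_split": power_split,
        "power_full": power_full,
        "split": false_positive_probability(alpha_split, power_split, prior_null),
        "full": false_positive_probability(alpha_split, power_full, prior_null),
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name}: {value:.4g}")
```

Under these assumptions the full-sample test has higher power at the same overall type I error, so a significant result from it is less likely to be a false positive than one that passed both stages of the split design.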
