Open Access
Don’t split your data
Author(s) -
Henrik Källberg,
Lars Alfredsson,
Maria Feychting,
Anders Ahlbom
Publication year - 2010
Publication title -
European Journal of Epidemiology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.825
H-Index - 111
eISSN - 1573-7284
pISSN - 0393-2990
DOI - 10.1007/s10654-010-9447-3
Subject(s) - medicine , data set , bayesian probability , perspective (graphical) , sample (material) , set (abstract data type) , statistics , data mining , artificial intelligence , mathematics , computer science , chemistry , chromatography , programming language
False positive findings are a common problem in whole-genome association studies. In this commentary we show that nothing is gained by randomly splitting a data sample into two equal-sized subsets, where the first subset is used for exploratory purposes and the second is used to confirm the findings from the first. We compare the random splitting procedure with analysis of the full data sample from a Bayesian perspective, taking into account the prior probability of a false positive finding.
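
The comparison described in the abstract can be illustrated with a small numerical sketch. This is not the authors' calculation, since the abstract gives no formulas. It assumes a one-sided z-test, a hypothetical standardized effect size, a hypothetical sample size, and a hypothetical prior probability that a tested association is null. The split procedure here requires significance in both halves, and the full-sample test is run at the same overall type I error so that the two approaches are compared on equal terms. All function and parameter names are illustrative.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def power_one_sided(effect, n, alpha):
    """Power of a one-sided z-test for a standardized effect with n observations."""
    z_crit = _STD.inv_cdf(1 - alpha)
    return 1 - _STD.cdf(z_crit - effect * n ** 0.5)


def false_positive_prob(alpha, power, prior_null):
    """Posterior probability that a significant finding is a false positive."""
    false_pos = alpha * prior_null          # null is true and test rejects
    true_pos = power * (1 - prior_null)     # effect is real and test rejects
    return false_pos / (false_pos + true_pos)


def compare(effect=0.1, n=1000, alpha_half=0.01, prior_null=0.999):
    """Compare split-sample and full-sample analysis (all defaults are assumed values)."""
    # Split sample: the finding must be significant in both independent halves.
    alpha_split = alpha_half ** 2
    power_split = power_one_sided(effect, n // 2, alpha_half) ** 2

    # Full sample: one test at the same overall type I error.
    alpha_full = alpha_split
    power_full = power_one_sided(effect, n, alpha_full)

    return {
        "power_split": power_split,
        "power_full": power_full,
        "fp_split": false_positive_prob(alpha_split, power_split, prior_null),
        "fp_full": false_positive_prob(alpha_full, power_full, prior_null),
    }


result = compare()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these assumed inputs the full-sample test has higher power at the same type I error, so a significant result from it is less likely to be a false positive than one confirmed across two halves. Other effect sizes, thresholds, or priors change the numbers, so the sketch shows the form of the argument rather than a general result.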
