On the relationship between cumulative correlation coefficients and the quality of crystallographic data sets
Author(s) - Wang Jimin, Brudvig Gary W., Batista Victor S., Moore Peter B.
Publication year - 2017
Publication title - Protein Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.353
H-Index - 175
eISSN - 1469-896X
pISSN - 0961-8368
DOI - 10.1002/pro.3314
Subject(s) - correlation, correlation coefficient, data set, resolution (logic), measure (data warehouse), pearson product moment correlation coefficient, statistics, mathematics, set (abstract data type), quality (philosophy), data quality, r value (soils), data mining, physics, computer science, artificial intelligence, geometry, metric (unit), operations management, quantum mechanics, subgrade, economics, programming language, structural engineering, engineering
In 2012, Karplus and Diederichs demonstrated that the Pearson correlation coefficient CC1/2 is a far better indicator of the quality and resolution of crystallographic data sets than more traditional measures like merging R-factor or signal-to-noise ratio. More specifically, they proposed that CC1/2 be computed for data sets in thin shells of increasing resolution so that the resolution dependence of that quantity can be examined. Recently, however, the CC1/2 values of entire data sets, i.e., cumulative correlation coefficients, have been used as a measure of data quality. Here, we show that the difference in cumulative CC1/2 value between a data set that has been accurately measured and a data set that has not is likely to be small. Furthermore, structures obtained by molecular replacement from poorly measured data sets are likely to suffer from extreme model bias.
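The distinction between shell-wise and cumulative CC1/2 can be illustrated with a toy simulation. The sketch below is not the authors' procedure and uses no real diffraction data. It assumes intensities that are exponentially distributed with a mean that falls off as exp(-falloff * s2), where s2 = 1/d^2, plus a constant Gaussian measurement error added independently to each half data set. The function names, the shell binning (equal widths in s2), and every numerical parameter are hypothetical choices made only for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_half_sets(n_refl=2000, s2_max=0.25, falloff=30.0, sigma=5.0, seed=0):
    """Return (s2, I_half1, I_half2) tuples for synthetic reflections.

    Assumed model (illustrative only): the true intensity is exponentially
    distributed with mean 1000 * exp(-falloff * s2), and each half-set
    measurement adds independent Gaussian noise of standard deviation sigma.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_refl):
        s2 = rng.uniform(0.0, s2_max)
        mean_i = 1000.0 * math.exp(-falloff * s2)
        true_i = rng.expovariate(1.0 / mean_i)
        rows.append((s2, true_i + rng.gauss(0.0, sigma), true_i + rng.gauss(0.0, sigma)))
    return rows


def cc_half_by_shell(rows, n_shells=10, s2_max=0.25):
    """CC1/2 computed separately in shells of equal width in s2."""
    shells = [([], []) for _ in range(n_shells)]
    for s2, i1, i2 in rows:
        k = min(int(s2 / s2_max * n_shells), n_shells - 1)
        shells[k][0].append(i1)
        shells[k][1].append(i2)
    return [pearson(a, b) for a, b in shells]


def cc_half_cumulative(rows):
    """CC1/2 computed once over all reflections, ignoring resolution."""
    return pearson([r[1] for r in rows], [r[2] for r in rows])


if __name__ == "__main__":
    rows = simulate_half_sets()
    for k, cc in enumerate(cc_half_by_shell(rows), start=1):
        print(f"shell {k:2d}: CC1/2 = {cc:6.3f}")
    print(f"cumulative: CC1/2 = {cc_half_cumulative(rows):6.3f}")
```

Under these assumptions the cumulative value is dominated by the strong low-resolution reflections, so it stays high even when the outermost shells carry almost no signal. This is one way to see why a single cumulative number may discriminate poorly between well and poorly measured data sets.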