Estimating Genotypic Correlations and Their Standard Errors Using Multivariate Restricted Maximum Likelihood Estimation with SAS Proc MIXED
Author(s) - Holland, James B.
Publication year - 2006
Publication title - Crop Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.76
H-Index - 147
eISSN - 1435-0653
pISSN - 0011-183X
DOI - 10.2135/cropsci2005.0191
Subject(s) - restricted maximum likelihood, statistics, missing data, estimator, sample size determination, mathematics, multivariate statistics, confidence interval, multivariate analysis of variance, covariance, Type I and Type II errors, parametric statistics, estimation theory
Plant breeders have traditionally estimated genotypic and phenotypic correlations between traits using the method of moments on the basis of a multivariate analysis of variance (MANOVA). Drawbacks of using the method of moments to estimate variance and covariance components include the possibility of obtaining estimates outside of parameter bounds, reduced estimation efficiency, and ignorance of the estimators' distributional properties when data are missing. An alternative approach that does not suffer from these problems, but depends on the assumption of normally distributed random effects and large sample sizes, is restricted maximum likelihood (REML). This paper illustrates the use of Proc MIXED of the SAS system to implement REML estimation of genotypic and phenotypic correlations. Additionally, a method to obtain approximate parametric estimates of the sampling variances of the correlation estimates is presented. MANOVA and REML methods were compared using a real data set and simulated data. The simulation study examined the effects of different correlation parameter values, genotypic and environmental sample sizes, and proportion of missing data on Type I and Type II error rates and on the accuracy of confidence intervals. The two methods provided similar results when data were balanced or only 5% of the data were missing. However, when 15 or 25% of the data were missing, the REML method generally performed better, resulting in higher power to detect correlations and more accurate 95% confidence intervals. Samples of at least 75 genotypes and two environments are recommended to obtain accurate confidence intervals using the proposed method.
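To make the quantities in the abstract concrete, the following is a minimal Python sketch, not the paper's SAS code. It assumes the genotypic variance and covariance component estimates and their asymptotic covariance matrix have already been obtained from a REML fit, and it assumes a first-order delta-method approximation for the sampling variance, which is one common way to get an approximate standard error and may differ in detail from the paper's method. The function names, the ordering of the components, and the normal-approximation interval are illustrative assumptions.

```python
import math


def genotypic_correlation(var_x, var_y, cov_xy):
    """Correlation from two genotypic variances and their covariance."""
    if var_x <= 0 or var_y <= 0:
        raise ValueError("variance component estimates must be positive")
    return cov_xy / math.sqrt(var_x * var_y)


def delta_method_se(var_x, var_y, cov_xy, acov):
    """Approximate standard error of the correlation (delta method).

    acov is the 3x3 asymptotic covariance matrix of the estimates,
    assumed here to be ordered as (var_x, var_y, cov_xy).
    """
    r = genotypic_correlation(var_x, var_y, cov_xy)
    # Partial derivatives of r with respect to var_x, var_y, cov_xy.
    grad = [
        -r / (2.0 * var_x),
        -r / (2.0 * var_y),
        1.0 / math.sqrt(var_x * var_y),
    ]
    # Quadratic form grad' * acov * grad.
    variance = sum(
        grad[i] * acov[i][j] * grad[j] for i in range(3) for j in range(3)
    )
    return math.sqrt(variance)


def normal_confidence_interval(estimate, se, z=1.96):
    """Normal-approximation interval; z=1.96 gives roughly 95% coverage."""
    return estimate - z * se, estimate + z * se


# Hypothetical component estimates, chosen only for illustration.
r = genotypic_correlation(4.0, 9.0, 3.0)
se = delta_method_se(
    4.0, 9.0, 3.0,
    [[1.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 0.5]],
)
lower, upper = normal_confidence_interval(r, se)
```

An interval built this way relies on large-sample normality of the estimates, which is consistent with the abstract's caution that accurate confidence intervals require adequate numbers of genotypes and environments.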