Estimation of Missing Values Affects Important Aspects of GGE Biplot Analysis
Author(s) -
Woyann Leomar Guilherme,
Benin Giovani,
Storck Lindolfo,
Trevizan Diego Maciel,
Meneguzzi Cátia,
Marchioro Volmir Sergio,
Tonatto Matheus,
Madureira Alana
Publication year - 2017
Publication title - Crop Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.76
H-Index - 147
eISSN - 1435-0653
pISSN - 0011-183X
DOI - 10.2135/cropsci2016.02.0100
Subject(s) - biplot, missing data, ammi, statistics, gene–environment interaction, mathematics, main effect, principal component analysis, interaction, estimation, variation (astronomy), biology, genotype, engineering, genetics, physics, systems engineering, gene, astrophysics
Multi‐environment trials often yield unbalanced datasets, thus necessitating the estimation of missing values. It is unknown whether this estimation affects the graphic characteristics of genotype plus genotype‐by‐environment interaction (GGE) biplots. Therefore, our objectives were to investigate the effects of different percentages of missing values on the number of significant principal components (PCs) and on mega environments, “winner” (highest‐performing) genotypes, and the amount of variation explained by the PCs. Two complete sets of two‐way data from wheat (Triticum aestivum L.) were used. The first set consisted of the original data (Data1), from which we created scenarios with 0, 30, and 60% missing data. For the second dataset (Data2), we removed 50% of the data from the original dataset, estimated the missing values to obtain a new complete dataset, and created scenarios like those for Data1. Missing values were estimated via the expectation‐maximization–GGE (EM–GGE) and EM–additive main effects and multiplicative interaction (EM–AMMI) methods. The percentage of variation explained by the PCs was affected by the percentage of missing data: a large percentage of missing values considerably increased the amount of variation explained by PC 1 and PC 2 and reduced the apparent complexity of the genotype‐by‐environment interaction, because two PCs accounted for more than 80% of the variation, instead of the three PCs that were required to explain the variation in the original dataset. The EM–GGE estimation method was able to maintain the original conformation of the ‘which‐won‐where’ biplot when ≤30% of the data were estimated. The EM–GGE method was superior to the EM–AMMI method for estimating missing data. The estimation of more than 30% of the data should be avoided because it can lead to significant changes in mega environment conformation and in the identification of “winner” genotypes.
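To make the idea of EM-based estimation concrete, the following is a minimal Python sketch of a generic EM-style imputation on environment-centered data, which is the general shape of an EM–GGE procedure. It is not the authors' implementation: the function names `em_gge_impute` and `explained_variation`, the choice of two retained PCs, the starting values (environment means), and the convergence tolerance are all assumptions made for illustration.

```python
import numpy as np


def em_gge_impute(Y, n_components=2, max_iter=500, tol=1e-8):
    """Fill NaN cells of a genotype-by-environment table (hypothetical helper).

    Rows are genotypes, columns are environments. Assumes every
    environment (column) has at least one observed value.
    """
    Y = np.asarray(Y, dtype=float)
    missing = np.isnan(Y)

    # Initial guess: replace each missing cell with its environment mean.
    filled = np.where(missing, np.nanmean(Y, axis=0), Y)

    for _ in range(max_iter):
        # GGE-style centering: remove the environment main effect only.
        env_means = filled.mean(axis=0)
        centered = filled - env_means

        # Rank-k approximation of the centered table via SVD.
        U, s, Vt = np.linalg.svd(centered, full_matrices=False)
        k = n_components
        low_rank = (U[:, :k] * s[:k]) @ Vt[:k] + env_means

        # Update only the missing cells; observed cells stay fixed.
        updated = np.where(missing, low_rank, Y)
        change = np.max(np.abs(updated - filled)) if missing.any() else 0.0
        filled = updated
        if change < tol:
            break

    return filled


def explained_variation(Y_complete):
    """Proportion of G + GE variation captured by each PC (hypothetical helper)."""
    centered = Y_complete - Y_complete.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    return s**2 / np.sum(s**2)
```

Because the imputed cells are drawn from a low-rank fit, they carry no residual noise. Under the assumptions of this sketch, that is one plausible reason why `explained_variation` for the first PCs tends to rise as the share of estimated cells grows, in line with the pattern the abstract reports.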
