minotaur: A platform for the analysis and visualization of multivariate results from genome scans with R Shiny
Author(s) -
Verity Robert,
Collins Caitlin,
Card Daren C.,
Schaal Sara M.,
Wang Liuyang,
Lotterhos Katie E.
Publication year - 2017
Publication title - Molecular Ecology Resources
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.96
H-Index - 136
eISSN - 1755-0998
pISSN - 1755-098X
DOI - 10.1111/1755-0998.12579
Subject(s) - outlier, mahalanobis distance, multivariate statistics, biology, visualization, genome, computer science, data mining, data visualization, parametric statistics, computational biology, artificial intelligence, machine learning, statistics, mathematics, genetics, gene
Genome scans are widely used to identify ‘outliers’ in genomic data: loci with different patterns compared with the rest of the genome due to the action of selection or other nonadaptive forces of evolution. These genomic data sets are often high dimensional, with complex correlation structures among variables, making it a challenge to identify outliers in a robust way. The Mahalanobis distance has been widely used, but has the major limitation of assuming that data follow a simple parametric distribution. Here, we develop three new metrics that can be used to identify outliers in multivariate space, while making no strong assumptions about the distribution of the data. These metrics are implemented in the R package minotaur, which also includes an interactive web‐based application for visualizing outliers in high‐dimensional data sets. We illustrate how these metrics can be used to identify outliers from simulated genetic data and discuss some of the limitations they may face in application.
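
For readers unfamiliar with the baseline metric mentioned above, the following is a minimal sketch of the classical Mahalanobis distance for two variables, written in plain Python. It is not code from the minotaur package and does not reproduce the three new metrics. The function name `mahalanobis_2d` and the toy `scores` values are hypothetical and chosen only for illustration.

```python
import math

def mahalanobis_2d(points):
    """Return the Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Entries of the sample covariance matrix (denominator n - 1).
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] S^{-1} [dx dy]^T, using the closed-form 2x2 inverse.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(math.sqrt(d2))
    return distances

# Hypothetical toy data: five loci whose two test statistics are strongly
# correlated, plus one locus (the last) that breaks the correlation.
scores = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1), (3.0, 0.0)]
dists = mahalanobis_2d(scores)
outlier_index = max(range(len(dists)), key=dists.__getitem__)
print(outlier_index, [round(d, 3) for d in dists])
```

In this toy example the last point has an unremarkable value on the first variable, yet it receives the largest distance because it departs from the correlation between the two variables. The distance depends only on the sample mean and covariance matrix, which is the sense in which it assumes a simple parametric shape for the data.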