MACGT: multi-dimensional automated clustering genotyping tool for analysis of microarray-based mini-sequencing data
Author(s) - David C. Walley, Ben Tripp, Young C. Song, Keith R. Walley, Scott J. Tebbutt
Publication year - 2006
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btl080
Subject(s) - genotyping , cluster analysis , snp , snp genotyping , computer science , dna microarray , single nucleotide polymorphism , computational biology , genotype , snp array , data mining , biology , artificial intelligence , genetics , gene expression , gene
Multi-dimensional Automated Clustering Genotyping Tool (MACGT) is a Java application that clusters complex multi-dimensional vector data derived from single nucleotide polymorphism (SNP) genotyping experiments that use mini-sequencing-based microarray chemistries such as arrayed primer extension (APEX). Spot intensity output files from microarray experiments across multiple samples are imported into MACGT. The datasets can include four channels of intensity data for each spot, replicate spots for each SNP probe and multiple probe types (APEX and allele-specific APEX probes) on both DNA strands for each SNP. MACGT automatically clusters these multi-dimensional datasets for each SNP across multiple samples. When additional array datasets from known samples with previously validated SNP genotype calls are incorporated, unknown samples are automatically assigned a genotype based on the clustering, along with numerical measures of confidence for each genotype call. The calling accuracy of MACGT exceeds 98% when applied to genotyping data from APEX microarrays, and can be increased to >99.5% by applying thresholds to the confidence measures.
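The abstract does not specify the clustering algorithm or the form of the confidence measures, so the following is only an illustrative sketch of the general idea: reference samples with validated genotypes define one cluster per genotype, an unknown sample is assigned to the nearest cluster, and a confidence threshold turns ambiguous assignments into no-calls. The nearest-centroid rule, the distance-ratio confidence score, the `NO_CALL` label and all class and method names are assumptions made for this example and are not taken from MACGT.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not MACGT's actual implementation.
class GenotypeSketch {

    /** A genotype call plus a confidence score in [0, 1] (assumed definition). */
    static final class Call {
        final String genotype;
        final double confidence;

        Call(String genotype, double confidence) {
            this.genotype = genotype;
            this.confidence = confidence;
        }
    }

    /** Mean vector of the reference samples that share one validated genotype. */
    static double[] centroid(double[][] vectors) {
        double[] c = new double[vectors[0].length];
        for (double[] v : vectors) {
            for (int i = 0; i < c.length; i++) {
                c[i] += v[i] / vectors.length;
            }
        }
        return c;
    }

    /** Euclidean distance between two intensity vectors of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Assigns the genotype of the nearest centroid. Confidence is
     * 1 - (nearest distance / second-nearest distance): 1 when the sample
     * sits on a centroid, 0 when it is equidistant from the two closest ones.
     * Calls below minConfidence are reported as "NO_CALL".
     */
    static Call call(Map<String, double[]> centroids, double[] sample, double minConfidence) {
        String best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        double secondDist = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[]> e : centroids.entrySet()) {
            double d = distance(e.getValue(), sample);
            if (d < bestDist) {
                secondDist = bestDist;
                bestDist = d;
                best = e.getKey();
            } else if (d < secondDist) {
                secondDist = d;
            }
        }
        double confidence = secondDist == 0.0 ? 0.0 : 1.0 - bestDist / secondDist;
        return new Call(confidence >= minConfidence ? best : "NO_CALL", confidence);
    }

    public static void main(String[] args) {
        // Made-up four-channel (A, C, G, T) intensity fractions for one A/G SNP.
        Map<String, double[]> centroids = new LinkedHashMap<>();
        centroids.put("AA", centroid(new double[][] {{0.92, 0.02, 0.04, 0.02}, {0.90, 0.03, 0.05, 0.02}}));
        centroids.put("AG", centroid(new double[][] {{0.48, 0.02, 0.48, 0.02}, {0.50, 0.03, 0.45, 0.02}}));
        centroids.put("GG", centroid(new double[][] {{0.05, 0.02, 0.91, 0.02}, {0.04, 0.03, 0.90, 0.03}}));

        Call c = call(centroids, new double[] {0.88, 0.03, 0.07, 0.02}, 0.5);
        System.out.println(c.genotype + " (confidence " + c.confidence + ")");
    }
}
```

Raising `minConfidence` trades call rate for accuracy, which mirrors the abstract's statement that thresholding the confidence measures increases calling accuracy. A real tool would also need to handle replicate spots, multiple probe types and both strands, which this sketch ignores.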