Nucleic Acids Research
Author(s) - Michael McClelland
Publication year - 2010
Publication title - Nucleic Acids Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 9.008
H-Index - 537
eISSN - 1362-4954
pISSN - 0305-1048
DOI - 10.1093/nar/gkq631
Subject(s) - biology, nucleic acid, computational biology, genetics
ArrayOme is a new program that calculates the size of genomes represented by microarray-based probes and facilitates recognition of key bacterial strains carrying large numbers of novel genes. Protein-coding sequences (CDS) that are contiguous on annotated reference templates and classified as ‘Present’ in the test strain by hybridization to microarrays are merged into indicator contigs (ICs). These ICs are then extended to account for flanking intergenic sequences. Finally, the lengths of all extended ICs are summed to yield the ‘microarray-visualized genome (MVG)’ size. We tested and validated ArrayOme using both experimental and in silico-generated genomic hybridization data. MVG sizing of five sequenced Escherichia coli and Shigella strains resulted in an accuracy of 97–99%, as compared to true genome sizes, when the comprehensive ShE.coli meta-array gene sequences (6239 CDS) were used for in silico hybridization analysis. However, the E.coli CFT073 genome size was underestimated by 14% as this meta-array lacked probes for many CFT073 CDS. ArrayOme permits rapid recognition of discordances between PFGE-measured genome and MVG sizes, thereby enabling high-throughput identification of strains rich in novel genes. Gene discovery studies focused on these strains will greatly facilitate characterization of the global gene pool accessible to individual bacterial species.

INTRODUCTION

To date, the entire genomic sequence of more than 180 bacterial strains has been determined. Based on comparative analysis of multiple genomes of the same species, it is increasingly apparent that some bacteria possess an extremely plastic genome (1–4). Foreign DNA segments, acquired via horizontal gene transfer, result in a genomic mosaic that reflects the lifestyle of the bacterium, pathogenic traits, adaptation to particular ecological niches and evolutionary history (5).
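The MVG sizing steps summarized in the abstract (merge contiguous ‘Present’ CDS into ICs, extend each IC into its flanking intergenic sequence, sum the extended lengths) can be illustrated with a minimal Python sketch. It is not ArrayOme's actual implementation: the `CDS` record, the function names and the `max_flank` extension rule (extend by up to a fixed number of bases, stopping at the neighbouring ‘Absent’ CDS or the template end) are assumptions made for illustration only, since the excerpt does not state how far ICs are extended. The coordinates in the example are invented.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CDS:
    """Hypothetical record for one annotated CDS on a reference template."""
    start: int      # 1-based inclusive start coordinate
    end: int        # inclusive end coordinate
    present: bool   # True if called 'Present' by hybridization


def extended_ics(cds_list: List[CDS], template_len: int,
                 max_flank: int = 100) -> List[Tuple[int, int]]:
    """Merge runs of adjacent 'Present' CDS and extend each run into its flanks.

    Assumed rule: extend by at most max_flank bases on each side, never
    past the neighbouring 'Absent' CDS or the ends of the template.
    """
    ordered = sorted(cds_list, key=lambda c: c.start)
    contigs = []
    i, n = 0, len(ordered)
    while i < n:
        if not ordered[i].present:
            i += 1
            continue
        # Grow the run while the next CDS on the template is also 'Present'.
        j = i
        while j + 1 < n and ordered[j + 1].present:
            j += 1
        start = ordered[i].start
        end = max(c.end for c in ordered[i:j + 1])
        # Neighbouring CDS (necessarily 'Absent') bound the extension.
        left_limit = ordered[i - 1].end + 1 if i > 0 else 1
        right_limit = ordered[j + 1].start - 1 if j + 1 < n else template_len
        ext_start = min(start, max(left_limit, start - max_flank))
        ext_end = max(end, min(right_limit, end + max_flank))
        contigs.append((ext_start, ext_end))
        i = j + 1
    return contigs


def mvg_size(cds_list: List[CDS], template_len: int, max_flank: int = 100) -> int:
    """Sum the lengths of all extended ICs to obtain the MVG size."""
    return sum(e - s + 1 for s, e in extended_ics(cds_list, template_len, max_flank))


def size_discordance(measured_size: int, mvg: int) -> float:
    """Fraction of a measured (e.g. PFGE) genome size not covered by the MVG."""
    return (measured_size - mvg) / measured_size


# Invented toy template: two adjacent 'Present' CDS, one 'Absent', one 'Present'.
example = [
    CDS(101, 400, True),
    CDS(451, 900, True),
    CDS(1001, 1500, False),
    CDS(1601, 2000, True),
]
print(mvg_size(example, template_len=2100))
```

A large positive `size_discordance` between a PFGE-measured genome size and the MVG size would flag a strain as likely to carry sequence not represented on the array, which is the use case the abstract describes.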
This ‘optional’ genomic repertoire, which we refer to as the ‘mobilome’ (mobile genome), includes episomal plasmids, transposons, integrons, prophages and a growing list of genomic islands (GIs) (6,7). Pathogenicity islands, the virulence-associated subset of GIs, have now been identified in many bacterial species and are widely recognized as major players in the moulding of pathogenic traits. The high cost of genome sequencing has been a barrier to high-throughput prospecting of the mobilome (8). Even costly mega-scale metagenomics projects do not facilitate this process, as the derived data are largely skewed towards abundant DNA sequences and low-prevalence mobilome sequences would rarely fit within a wider genomic context (9). A rapid and more cost-effective approach to discovering strain-specific DNA sporadically dispersed among hundreds of members of the same species remains a major challenge (10). Since DNA microarrays were first used to compare the genomes of Mycobacterium bovis BCG strains with that of Mycobacterium tuberculosis strain H37Rv to reveal several strain-specific deletions (11), comparative genomic hybridization technology has been extensively applied to investigate genome diversity among distinct isolates of many bacterial species including Bacillus anthracis (12), Brucella spp. (13),

*To whom correspondence should be addressed at Department of Infection, Immunity and Inflammation, Leicester Medical School, University of Leicester, Maurice Shock Building, University Road, PO Box 138, Leicester LE1 9HN, UK. Tel: +44 0 116 2231498; Fax: +44 0 116 2525030; Email: kr46@le.ac.uk

The online version of this article has been published under an open access model.
Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use permissions, please contact journals.permissions@oupjournals.org.