RNA sequencing, de novo assembly, and functional annotation of an endangered Nymphalid butterfly, Fabriciana nerippe Felder, 1862
Author(s) -
Hwang HeeJu,
Patnaik Bharat Bhusan,
Kang Se Won,
Park So Young,
Wang Tae Hun,
Park Eun Bi,
Chung Jong Min,
Song Dae Kwon,
Patnaik Hongray Howrelia,
Kim Changmu,
Kim Soonok,
Lee Jae Bong,
Jeong Heon Cheon,
Park Hong Seog,
Han Yeon Soo,
Lee Yong Seok
Publication year - 2016
Publication title -
Entomological Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.421
H-Index - 20
eISSN - 1748-5967
pISSN - 1738-2297
DOI - 10.1111/1748-5967.12164
Subject(s) - biology, unigene, sequence assembly, transcriptome, genetics, microsatellite, endangered species, evolutionary biology, gene, computational biology, ecology, habitat, gene expression, allele
Grassland butterflies are considered representative indicators of biodiversity and ecosystem health. Their dramatic decline, caused by habitat destruction, intensifying agriculture and global warming, has prompted concern for their conservation. While ecological indices of butterflies have been documented, genomic resources are lacking relative to the biological importance of this group. Here, we report the first whole‐transcriptome resource for Fabriciana nerippe, a brush‐footed butterfly that is endangered in Korea. Approximately 241.3 million clean reads were obtained from paired‐end Illumina sequencing of adult whole‐body RNA. The de novo assembly resulted in 114 405 unigenes with lengths ranging from 133 to 33 218 bp. We found 41 868 assembled unigenes homologous to sequences in the locally curated PANM‐DB (Protostome DB). We assigned gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology terms to 18 085 and 5779 unigenes, respectively. InterProScan predicted 6691 protein domains for the assembled unigene sequences, with reverse transcriptase and zinc finger domains being the most predominant. A total of 315 282 simple sequence repeats (SSRs) were identified from the assembled unigenes when considering dinucleotide to octanucleotide repeats with a minimum of three repeat units. AT/AT, AAT/ATT, and AAAT/ATTT represented the most prominent SSR repeat types in the transcriptome. The unigene annotation profile and microsatellites generated in this study can serve as a reference resource for closely related species and for gene functional analysis studies. Our data can be utilized to study ecological drift and loss of the species, consequently leading to better protection of the population.
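The SSR criteria stated in the abstract (dinucleotide to octanucleotide motifs, at least three repeat units) can be illustrated with a minimal Python sketch. The abstract does not name the tool the authors used, so this is not their method: the function name `find_ssrs` and its parameters are hypothetical, and the sketch only detects perfect tandem repeats on the given strand, without merging a motif with its reverse complement (as in the AT/AT or AAT/ATT groupings).

```python
import re


def find_ssrs(seq, min_motif=2, max_motif=8, min_repeats=3):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Hypothetical helper: motif lengths 2..8 and a minimum of three
    repeat units mirror the criteria quoted in the abstract.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit
            # (e.g. "ATAT" is really "AT", "AA" is a mononucleotide run).
            if any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits
```

For example, `find_ssrs("GGATATATCC")` reports one dinucleotide SSR with motif `AT` repeated three times, starting at 0-based position 2. A threshold of three repeat units is permissive, which is consistent with the large SSR count reported for the assembled unigenes.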