Complete sequencing of expanded SAMD12 repeats by long-read sequencing and Cas9-mediated enrichment
Author(s) -
Takeshi Mizuguchi,
Tomoko Toyota,
Satoko Miyatake,
Satomi Mitsuhashi,
Hiroshi Doi,
Yosuke Kudo,
Hitaru Kishida,
Noriko Hayashi,
Rie Tsuburaya,
Masako Kinoshita,
Tetsuhiro Fukuyama,
Hiromi Fukuda,
Eriko Koshimizu,
Naomi Tsuchida,
Yuri Uchiyama,
Atsushi Fujita,
Atsushi Takata,
Noriko Miyake,
Mitsuhiro Kato,
Fumiaki Tanaka,
Hiroaki Adachi,
Naomichi Matsumoto
Publication year - 2021
Publication title - Brain
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 5.142
H-Index - 336
eISSN - 1460-2156
pISSN - 0006-8950
DOI - 10.1093/brain/awab021
Subject(s) - genetics , biology , dna sequencing , cas9 , direct repeat , sequence (biology) , deep sequencing , genotype , computational biology , crispr , gene , genome
A pentanucleotide TTTCA repeat insertion into a polymorphic TTTTA repeat element in SAMD12 causes benign adult familial myoclonic epilepsy. Although the precise determination of the entire SAMD12 repeat sequence is important for molecular diagnosis and research, obtaining this sequence remains challenging when using conventional genomic/genetic methods, and even short-read and long-read next-generation sequencing technologies have been insufficient. Incomplete information regarding expanded repeat sequences may hamper our understanding of the pathogenic roles played by varying numbers of repeat units, genotype–phenotype correlations, and mutational mechanisms. Here, we report a new approach for the precise determination of the entire expanded repeat sequence and present a workflow designed to improve the diagnostic rates in various repeat expansion diseases. We examined 34 clinically diagnosed benign adult familial myoclonic epilepsy patients from 29 families using repeat-primed PCR, Southern blot, and long-read sequencing with Cas9-mediated enrichment. Two cases with questionable results from repeat-primed PCR and/or Southern blot were confirmed as pathogenic using long-read sequencing with Cas9-mediated enrichment, resulting in the identification of pathogenic SAMD12 repeat expansions in 76% of examined families (22/29). Importantly, long-read sequencing with Cas9-mediated enrichment was able to provide detailed information regarding the sizes, configurations, and compositions of the expanded repeats. The inserted TTTCA repeat size and the proportion of TTTCA sequences among the overall repeat sequences were highly variable, and a novel repeat configuration was identified. A genotype–phenotype correlation study suggested that the insertion of even short (TTTCA)14 repeats contributed to the development of benign adult familial myoclonic epilepsy.
However, the sizes of the overall TTTTA and TTTCA repeat units are also likely to be involved in the pathology of benign adult familial myoclonic epilepsy. Seven unsolved SAMD12-negative cases were investigated using whole-genome long-read sequencing, and infrequent, disease-associated, repeat expansions were identified in two cases. The strategic workflow resolved two questionable SAMD12-positive cases and two previously SAMD12-negative cases, increasing the diagnostic yield from 69% (20/29 families) to 83% (24/29 families). This study indicates the significant utility of long-read sequencing technologies to explore the pathogenic contributions made by various repeat units in complex repeat expansions and to improve the overall diagnostic rate.
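To make the terms "size", "configuration", and "composition" of a repeat concrete, the following minimal Python sketch tiles a repeat sequence into pentanucleotide units, run-length encodes them, and computes the proportion of TTTCA units. It is an illustration only, not the analysis pipeline used in the study: it assumes an error-free sequence that starts in frame with the repeat, whereas real long reads contain sequencing errors and would need alignment-based or error-tolerant methods. The function names and the toy sequence are hypothetical; only the (TTTCA)14 length echoes the abstract, and the flanking TTTTA counts are invented.

```python
from itertools import groupby


def summarize_repeat(seq, unit=5):
    """Tile seq into consecutive unit-length motifs and run-length encode them.

    Assumes seq is error-free and begins in frame with the repeat;
    any trailing partial unit is ignored.
    """
    seq = seq.upper()
    units = [seq[i:i + unit] for i in range(0, len(seq) - unit + 1, unit)]
    # groupby merges adjacent identical motifs into (motif, count) runs.
    return [(motif, sum(1 for _ in group)) for motif, group in groupby(units)]


def motif_fraction(runs, motif="TTTCA"):
    """Return the fraction of all repeat units that match the given motif."""
    total = sum(count for _, count in runs)
    matching = sum(count for m, count in runs if m == motif)
    return matching / total if total else 0.0


# Hypothetical toy allele: (TTTTA)6 (TTTCA)14 (TTTTA)3
toy_allele = "TTTTA" * 6 + "TTTCA" * 14 + "TTTTA" * 3
runs = summarize_repeat(toy_allele)
fraction = motif_fraction(runs)

for motif, count in runs:
    print(f"({motif}){count}")
print(f"TTTCA proportion: {fraction:.2f}")
```

In this representation, the list of runs is the repeat configuration, the sum of the counts is the overall repeat size in units, and the TTTCA fraction is one simple measure of composition.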