Analysis of mutational spectra: locating hotspots and clusters of mutations using recursive segmentation
Author(s) - Fijal Bonnie A., Idury Ramana M., Witte John S.
Publication year - 2002
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.1145
Subject(s) - mutation, gene, segmentation, genetics, biology, computational biology, gene mutation, disease, computer science, medicine, artificial intelligence, pathology
Mutations within different regions of disease‐causing genes can vary in their impact on disease initiation and progression. Determining how individual mutations within such genes affect disease risk and progression can improve the accuracy of prognoses and help guide treatment selection. Estimates of mutation‐specific risks can be poor, however, when genes have a large number of distinct mutations and data for any given mutation are sparse. To address this problem, we present here a method of analysing the spectrum of mutations observed across a gene that pools together mutations that appear to have similar effects on disease. One of the assumptions underlying the analysis of mutational spectra created in this manner is that the frequency of a mutation in the sample reflects the degree of its effect on disease development. Additionally, mutations that disrupt the same functionally important region of the gene are expected to have a similar impact on disease development, and such mutations tend to form a cluster within the spectrum. We therefore developed an algorithm that segments a spectrum into regions containing sites with similar mutational frequencies, and derived, by simulation, equations that allow one to evaluate whether segmentation is needed. We used this approach to investigate the spectrum of mutations observed in the p53 tumour suppressor gene in colorectal cancer tumours. Here, recursive segmentation identified the boundaries of apparent clusters better than did other methods, and the approach identified clusters of mutations that corresponded to biologically important regions of the p53 protein. Copyright © 2002 John Wiley & Sons, Ltd.
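The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea of recursive segmentation, not the authors' method. It assumes that per-site mutation counts are Poisson distributed with one rate per segment, and it splits a region wherever the log-likelihood gain exceeds a fixed cut-off. The names `segment` and `_loglik`, the Poisson model, and the `threshold` value are all illustrative assumptions; in particular, the fixed `threshold` is a placeholder for the simulation-derived equations mentioned in the abstract, which are not reproduced here.

```python
import math


def _loglik(counts):
    """Poisson log-likelihood of counts under one shared rate (MLE),
    omitting the factorial term, which cancels when segments are compared."""
    total = sum(counts)
    if total == 0:
        return 0.0
    rate = total / len(counts)
    return total * math.log(rate) - total


def segment(counts, start=0, end=None, threshold=5.0):
    """Recursively split counts[start:end] into runs of similar frequency.

    Returns a list of half-open (start, end) index pairs.
    `threshold` is an arbitrary illustrative cut-off on the
    log-likelihood gain, not a value taken from the paper.
    """
    if end is None:
        end = len(counts)
    whole = _loglik(counts[start:end])
    best_gain, best_cut = 0.0, None
    # Try every boundary and keep the one with the largest gain.
    for cut in range(start + 1, end):
        gain = _loglik(counts[start:cut]) + _loglik(counts[cut:end]) - whole
        if gain > best_gain:
            best_gain, best_cut = gain, cut
    # Stop when no split improves the fit enough.
    if best_cut is None or best_gain < threshold:
        return [(start, end)]
    # Otherwise recurse into both halves.
    return (segment(counts, start, best_cut, threshold)
            + segment(counts, best_cut, end, threshold))


# Made-up spectrum: a cluster of frequently mutated sites at positions 6 to 9.
spectrum = [0, 0, 1, 0, 0, 0, 9, 8, 10, 9, 0, 1, 0, 0, 0, 0]
print(segment(spectrum))  # [(0, 6), (6, 10), (10, 16)]
```

On this made-up spectrum the sketch isolates the high-frequency cluster as its own segment. A real analysis would replace the fixed cut-off with a properly calibrated stopping rule.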