
Likelihood Models of Somatic Mutation and Codon Substitution in Cancer Genes
Author(s) - Ziheng Yang, Simon Weonsang Ro, Bruce Rannala
Publication year - 2003
Publication title - Genetics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.792
H-Index - 246
eISSN - 1943-2631
pISSN - 0016-6731
DOI - 10.1093/genetics/165.2.695
Subject(s) - biology, genetics, missense mutation, mutation rate, nonsense mutation, mutation, germline mutation, silent mutation, gene, transversion, synonymous substitution, codon usage bias, genome
The role of somatic mutation in cancer is well established, and several genes have been identified as frequent targets. This has enabled large-scale screening studies of the spectrum of somatic mutations in cancers of particular organs. Cancer gene mutation databases compile the results of many studies and can provide insight into the importance of specific amino acid sequences and functional domains in cancer, as well as elucidate aspects of the mutation process. Past studies of the spectrum of cancer mutations (in particular genes) have examined overall frequencies of mutation (at specific nucleotides) and of missense, nonsense, and silent substitution (at specific codons), both in the sequence as a whole and in specific functional domains. Existing methods ignore features of the genetic code that allow some codons to mutate to missense or stop codons more readily than others (i.e., by one nucleotide change rather than two or three). A new codon-based method is presented to estimate the relative rate of substitution (fixation of a somatic mutation in a cancer cell lineage) of nonsense vs. missense mutations in different functional domains and in different tumor tissues. The models considered account for several potential influences on rates of somatic mutation and substitution in cancer progenitor cells: biased mutation rates at particular dinucleotide sequences (CGs and dipyrimidines), transition vs. transversion bias, and variable rates of silent substitution across functional domains (useful in detecting investigator sampling bias). Likelihood-ratio tests applied to cancer gene mutation data are used to choose among models. The method is applied to analyze published data on the spectrum of p53 mutations in cancers.
A novel finding is that the ratio of the probability of nonsense to missense substitution is much lower in the DNA-binding and transactivation domains (ratios near 1) than in structural domains such as the linker, tetramerization (oligomerization), and proline-rich domains (ratios exceeding 100 in some tissues), implying that the specific amino acid sequence may be less critical in structural domains (i.e., amino acid changes there less often lead to cancer). The transition vs. transversion bias and the effect of CpG dinucleotides on mutation rates in p53 varied greatly across cancers of different organs, likely reflecting the effects of different endogenous and exogenous factors influencing mutation in specific organs.
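The point that some codons reach missense or stop codons more readily than others can be illustrated with a short sketch. It is not the authors' likelihood model: it only counts, under the standard genetic code, how many of the nine single-nucleotide changes of a sense codon are silent, missense, or nonsense. The names `CODE` and `classify_neighbours` are hypothetical, and all nine changes are treated as equally likely, an assumption the paper's models relax (transition vs. transversion bias, CpG and dipyrimidine effects).

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
# "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_neighbours(codon):
    """Count the 9 single-nucleotide changes of a sense codon by outcome.

    Illustrative helper (not from the paper): every change is given equal
    weight, with no mutation-rate bias of any kind.
    """
    original = CODE[codon]
    if original == "*":
        raise ValueError("expected a sense codon, got a stop codon")
    counts = {"silent": 0, "missense": 0, "nonsense": 0}
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue  # not a change
            mutant = codon[:pos] + base + codon[pos + 1:]
            new = CODE[mutant]
            if new == "*":
                counts["nonsense"] += 1
            elif new == original:
                counts["silent"] += 1
            else:
                counts["missense"] += 1
    return counts


if __name__ == "__main__":
    for codon in ("TGG", "CGA", "CTG"):
        print(codon, CODE[codon], classify_neighbours(codon))
```

Under these assumptions TGG (Trp) can become a stop codon through two of its nine single changes, CGA (Arg) through one, and CTG (Leu) through none. Raw counts of nonsense vs. missense mutations therefore depend on the codon composition of a domain, which is the confounding effect a codon-based method is meant to remove.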