Differing O‐Glycan‐Forming Glycosyltransferase Expression Profiles in Cancer Cells Act as Signatures that Accurately Identify Cancer Types/Subtypes, Epithelial‐Mesenchymal Transforming Cells as well as Cancer Stem Cells
Author(s) -
Abuelela Ayman F,
Merzaban Jasmeen S
Publication year - 2017
Publication title -
The FASEB Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.709
H-Index - 277
eISSN - 1530-6860
pISSN - 0892-6638
DOI - 10.1096/fasebj.31.1_supplement.784.12
Subject(s) - cancer , cancer cell , metastasis , cancer stem cell , biology , cancer research , phenotype , glycosyltransferase , epithelial–mesenchymal transition , gene , genetics
Glycosyltransferases drive the formation of aberrant mucin O‐glycan structures on tumors. Altered glycosyltransferase expression profiles are associated with critical events in tumor formation, mesenchymal transformation, and metastasis of cancer cells. These profiles reflect the disparity among different cancer types as well as the heterogeneity within each type of cancer. This heterogeneity among cancer cells is an important factor to consider in cancer therapy, as a small subset of cells is believed to drive detrimental cancer outcomes such as relapse and metastasis. The aim of this study was to develop a classifier that uses the expression profile of O‐glycan‐forming glycosyltransferases (OGFGTs) to (1) discriminate between normal and cancer samples, (2) classify cancer types and (3) identify molecular subtypes and rare subpopulations such as cancer stem cells (CSCs) and cells undergoing epithelial‐mesenchymal transition (EMT). We found that the expression of 58 OGFGTs in The Cancer Genome Atlas (TCGA) RNA‐Seq data from 11015 cancer patient samples could classify 37 types of cancer with a balanced accuracy of 93.75% (95% CI: 92.86%, 94.55%) and a by‐class accuracy ranging from 90.6% to 100%. Co‐expression networks of the 11015 samples showed well‐separated, distinct clusters, from which we extracted the expression profile of a unique group of OGFGTs for each cancer type. Network analysis and expression profiles allowed the similarity and dissimilarity between cancer types to be quantified. OGFGTs were also able to contrast normal and tumor phenotypes in 1169 matched samples with an accuracy of 94%; on blind testing data, the logistic model gave an odds ratio of 196 (p‐value of Chi‐square association test < 0.0001). Using the Akaike information criterion, we determined the minimum set of genes (26 genes) required to distinguish normal cells from cancer cells.
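The accuracy, balanced accuracy, odds ratio and Chi‐square figures quoted in this abstract are all summaries of a two‐class confusion table. The abstract does not describe how the authors computed them, so the following is only a minimal sketch of the standard definitions. The function names and the example counts are hypothetical and are not taken from the study.

```python
import math


def confusion_metrics(tp, fn, fp, tn):
    """Summaries of a 2x2 table (tp/fn: true positives split by prediction,
    fp/tn: true negatives split by prediction)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    # Balanced accuracy averages the per-class recall, so it is not
    # inflated when one class is much larger than the other.
    balanced_accuracy = (sensitivity + specificity) / 2
    # Odds ratio of the 2x2 table; undefined when fp or fn is zero.
    odds_ratio = (tp * tn) / (fp * fn)
    return accuracy, balanced_accuracy, odds_ratio


def chi_square_2x2(tp, fn, fp, tn):
    """Pearson Chi-square statistic (no continuity correction) and its
    p-value for a 2x2 table, which has 1 degree of freedom."""
    n = tp + fn + fp + tn
    row1, row2 = tp + fn, fp + tn
    col1, col2 = tp + fp, fn + tn
    stat = n * (tp * tn - fp * fn) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper tail equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts for illustration only (not data from the study).
acc, bal_acc, odds = confusion_metrics(tp=90, fn=10, fp=5, tn=95)
stat, p = chi_square_2x2(tp=90, fn=10, fp=5, tn=95)
print(f"accuracy={acc:.3f} balanced={bal_acc:.3f} odds_ratio={odds:.1f}")
print(f"chi_square={stat:.2f} p_value={p:.3g}")
```

With heavily imbalanced classes, such as the EMT and CSC splits reported here, plain accuracy and balanced accuracy can differ noticeably, which is one reason to report both alongside the odds ratio.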
In addition, upon separating cancer samples into EMT pos (2317 samples) and EMT neg (6925 samples) based on the expression of N‐cadherin, E‐cadherin, ZEB1 and vimentin, OGFGTs were able to classify the samples with an accuracy of 92.5%. Thus our model identifies the OGFGTs associated with the EMT program with an odds ratio of 99.2 (p‐value of Chi‐square association test < 0.0001). Furthermore, using CD44 and CD133 expression to identify CSCs (2417 of 9242 total samples), OGFGTs showed a significant association with the CSC phenotype, with an odds ratio of 16.1 (p‐value of Chi‐square association test < 0.0001) and 86% accuracy. Overall, this supports the central role of glycosylation in a multitude of cancers. These OGFGTs classified cancer types as well as cancer subtypes with high accuracy, which, in a clinical setting, could be used to select an appropriate treatment regimen for a patient as well as to monitor the patient's response to specific cancer treatments.
Support or Funding Information
This work was supported by the King Abdullah University of Science and Technology (KAUST) Faculty Baseline Research Funding Program as well as a Competitive Research Grant to J.S.M.