A model for predicting prognosis in patients with esophageal squamous cell carcinoma based on joint representation learning
Author(s) - Jun Yu, Xiaoliu Wu, Min Lv, Yuanying Zhang, Xiaomei Zhang, Jintian Li, Ming Zhu, Jianfeng Huang, Qin Zhang
Publication year - 2020
Publication title - Oncology Letters
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.766
H-Index - 54
eISSN - 1792-1082
pISSN - 1792-1074
DOI - 10.3892/ol.2020.12250
Subject(s) - autoencoder, cluster analysis, proportional hazards model, univariate, dna methylation, oncology, artificial intelligence, biology, medicine, gene, computer science, machine learning, deep learning, multivariate statistics, gene expression, genetics
Esophageal squamous cell carcinoma (ESCC) is one of the deadliest cancer types, with a poor prognosis owing to the lack of symptoms in the early stages and delayed diagnosis. The present study aimed to identify the risk factors significantly associated with prognosis and to search for novel effective diagnostic modalities for patients with early-stage ESCC. mRNA and methylation data of patients with ESCC and the corresponding clinical information were downloaded from The Cancer Genome Atlas (TCGA) database, and the representation features were screened using a deep learning autoencoder. A univariate Cox regression model was used to select the prognosis-related features from the representation features. K-means clustering was used to cluster the TCGA samples. A support vector machine classifier was constructed based on the top 75 features most strongly associated with the risk subgroups obtained from K-means clustering. Two ArrayExpress datasets were used to verify the reliability of the obtained risk subgroups. The differentially expressed genes (DEGs) and differentially methylated genes (DMGs) between the risk subgroups were analyzed, and pathway enrichment analysis was performed. A total of 500 representation features were produced. Using K-means clustering, the TCGA samples were clustered into two risk subgroups with significantly different overall survival rates. The joint multimodal representation strategy, which showed good model fit (C-index = 0.760), outperformed the early-fusion autoencoder strategy. The joint representation learning-based classification model had good robustness. A total of 1,107 DEGs and 199 DMGs were screened out between the two risk subgroups. The DEGs were involved in 70 pathways, the majority of which were correlated with metastasis and proliferation of various cancer types, including cytokine-cytokine receptor interaction, cell adhesion molecules, PPAR signaling pathway, pathways in cancer, transcriptional misregulation in cancer and ECM-receptor interaction pathways. The two survival subgroups obtained via the joint representation learning-based model were robust and had prognostic significance for patients with ESCC.
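
The abstract reports model fit as a C-index (0.760) but does not say how it was computed. As an illustration only, the sketch below assumes Harrell's concordance index for right-censored survival data, in a simplified form that skips pairs with tied survival times. The function name `concordance_index` and its arguments are hypothetical and do not come from the study.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index (assumed variant, not confirmed by the source).

    times       -- observed follow-up time per patient
    events      -- 1/True if death was observed, 0/False if censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # pairs with tied times are skipped in this sketch
        # a is the patient with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk died earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Under this definition, a value of 0.5 corresponds to uninformative risk scores and 1.0 to a perfect ranking of patients by survival time, so the reported 0.760 would mean that roughly three out of four comparable patient pairs are ordered correctly.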
