
A User-Friendly, Web-Based Integrative Tool (ESurv) for Survival Analysis: Development and Validation Study
Author(s) -
Kyoungjune Pak,
SaeOck Oh,
Tae Sik Goh,
Hye Jin Heo,
MyoungEun Han,
Dae Cheon Jeong,
ChiSeung Lee,
Hokeun Sun,
Junmo Kang,
Suji Choi,
SooHwan Lee,
Eun Jung Kwon,
Ji Wan Kang,
Yun Hak Kim
Publication year - 2020
Publication title -
Journal of Medical Internet Research (JMIR)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.446
H-Index - 142
eISSN - 1439-4456
pISSN - 1438-8871
DOI - 10.2196/16084
Subject(s) - univariate, lasso (programming language), proportional hazards model, elastic net regularization, computer science, receiver operating characteristic, survival analysis, bivariate analysis, oncology, data mining, bioinformatics, medicine, machine learning, feature selection, multivariate statistics, biology, world wide web
Background Prognostic genes or gene signatures have been widely used to predict patient survival and to guide therapeutic decisions. Although some web-based survival analysis tools have been developed, they have several limitations.
Objective Taking these limitations into account, we developed ESurv (Easy, Effective, and Excellent Survival analysis tool), a web-based tool that can perform advanced survival analyses using user-derived data or data from The Cancer Genome Atlas (TCGA). Users can conduct univariate analyses and grouped variable selections using multiomics data from TCGA.
Methods We used R to code survival analyses based on multiomics data from TCGA. To perform these analyses, we excluded patients and genes that had insufficient information. Clinical variables were coded as 0 or 1 when they had two categories (for example, chemotherapy: no or yes), and dummy variables were used when a feature had 3 or more categories (for example, laterality: right, left, or bilateral).
Results Through univariate analyses, ESurv can assess the prognostic significance of single genes using survival curves (median or optimal cutoff), the area under the curve (AUC) with C statistics, and receiver operating characteristic (ROC) curves. Users can obtain prognostic variable signatures based on multiomics data from clinical variables or grouped variable selections (lasso, elastic net regularization, and network-regularized high-dimensional Cox regression) and select the same outputs as above. In addition, users can create custom gene signatures for specific cancers using various genes of interest. One of the most important functions of ESurv is that users can perform all survival analyses using their own data.
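The variable coding described under Methods and the median-cutoff survival curve mentioned under Results can be illustrated with a minimal sketch. This is not the authors' code: ESurv is written in R, whereas the example below uses plain Python, and every function name and data value in it is hypothetical. It assumes a Kaplan-Meier estimator for the survival curve, which the abstract does not name explicitly, and it does not attempt the optimal-cutoff option.

```python
from statistics import median


def encode_binary(values, positive):
    """Code a two-category variable as 0/1 (1 when the value equals `positive`)."""
    return [1 if v == positive else 0 for v in values]


def encode_dummies(values, reference):
    """Build one 0/1 indicator column per category other than `reference`."""
    levels = sorted(set(values) - {reference})
    return {f"is_{level}": [1 if v == level else 0 for v in values]
            for level in levels}


def split_by_median(expression):
    """Label a sample 'high' if its expression exceeds the median, else 'low'."""
    cutoff = median(expression)
    return ["high" if x > cutoff else "low" for x in expression]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one event.

    `events` uses 1 for an observed death and 0 for a censored observation.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical toy data for four patients (not taken from TCGA).
chemotherapy = ["no", "yes", "yes", "no"]
laterality = ["right", "left", "bilateral", "left"]
expression = [2.0, 5.0, 3.0, 8.0]
months = [5, 8, 12, 3]
died = [1, 0, 1, 1]

chemo_coded = encode_binary(chemotherapy, positive="yes")
laterality_coded = encode_dummies(laterality, reference="right")
groups = split_by_median(expression)

# One survival curve per expression group, as in a median-cutoff comparison.
curves = {}
for label in ("low", "high"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curves[label] = kaplan_meier([months[i] for i in idx],
                                 [died[i] for i in idx])
```

Using "right" as the reference level for laterality is an arbitrary choice made for the sketch; the abstract does not say which category ESurv treats as the reference.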
Conclusions Using advanced statistical techniques suitable for high-dimensional data, including genetic data, and integrated survival analysis, ESurv overcomes the limitations of previous web-based tools and will help biomedical researchers easily perform complex survival analyses.