AccScience Publishing / CP / Online First / DOI: 10.36922/CP025510090
ORIGINAL RESEARCH ARTICLE

Prognostic factors and machine learning models for metastatic prostate cancer survival: Insights from the SEER database

Xian-Yong Yan1, Jing-Zhi Huang1, Qing-Fan Wei1, Rong-Yuan Wei1, Jiang Qin1, Hai Lu1*, Jie Lan2*
1 Department of Urology, The First Affiliated Hospital of Guangxi Medical University Hechi Hospital, Hechi, Guangxi Zhuang Autonomous Region, China
2 Department of Surgery, Hechi Maternal and Child Health Hospital, Hechi, Guangxi Zhuang Autonomous Region, China
Received: 18 December 2025 | Revised: 24 January 2026 | Accepted: 25 March 2026 | Published online: 8 May 2026
© 2026 by the Author(s). This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).
Abstract

Metastatic prostate cancer remains a significant health concern, and accurate prognostic tools are needed to predict patient outcomes. This study aimed to develop a novel survival prediction model for patients with metastatic prostate cancer. Data on patients diagnosed with metastatic prostate cancer between 2010 and 2020 were obtained from the Surveillance, Epidemiology, and End Results (SEER) database. Independent prognostic features were identified using least absolute shrinkage and selection operator (LASSO)-Cox regression. Five survival models (XGBoost, RandomForestSRC, CoxBoost, SuperPC, and DeepSurv) were constructed, and their predictive accuracy for 1-, 3-, and 5-year overall survival (OS) was assessed using the area under the receiver operating characteristic curve (AUC). The optimal model was selected based on the highest AUC, followed by feature importance evaluation and Kaplan–Meier survival analysis. A total of 10,408 eligible patients were included, and LASSO-Cox regression identified 15 prognostic features. The DeepSurv model showed the best overall performance, with a concordance index of 0.6488 and AUC values of 0.696, 0.685, and 0.720 for 1-, 3-, and 5-year OS, respectively. Gleason score and age were confirmed as core prognostic features across multiple algorithms. In the survival analysis, high-risk patients (risk score > median) had significantly shorter OS than low-risk patients (risk score ≤ median; p < 0.001). Among the models evaluated, DeepSurv achieved the highest OS prediction accuracy for metastatic prostate cancer, and Gleason score and age are core prognostic factors for this population. The model supports individualized risk stratification, which may aid clinical decision-making and improve patient outcomes.
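As an illustration of two quantities reported in the abstract, the sketch below computes a concordance index and splits patients into high- and low-risk groups at the median risk score. It is a minimal sketch, not the authors' implementation: the abstract does not say how the concordance index was calculated, so Harrell's pairwise definition is assumed, pairs with tied survival times are simply skipped, and the function names and toy data are hypothetical.

```python
from itertools import combinations
from statistics import median


def concordance_index(times, events, risk_scores):
    """Harrell's C (assumed definition): share of comparable pairs in which
    the patient with the shorter observed survival has the higher risk score."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # Simplification: skip tied times; also skip pairs where the shorter
        # time is censored, because the true ordering is then unknown.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied risk scores count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def split_by_median(risk_scores):
    """Label each patient 'high' (score > median) or 'low' (score <= median)."""
    cutoff = median(risk_scores)
    return ["high" if r > cutoff else "low" for r in risk_scores]


# Hypothetical toy cohort: follow-up in months, event 1 = death, 0 = censored.
times = [5, 12, 20, 30, 41, 60]
events = [1, 1, 0, 1, 0, 1]
risk = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]

c_index = concordance_index(times, events, risk)
groups = split_by_median(risk)
print(f"C-index: {c_index:.3f}")
print(groups)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The median split mirrors the high-risk (> median) and low-risk (≤ median) grouping described in the abstract.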

Keywords
Metastatic prostate cancer
Machine learning
Deep learning
Survival model
SEER database
Funding
None.
Conflict of interest
The authors declare that they have no competing interests.
Cancer Plus, Electronic ISSN: 2661-3840 Print ISSN: 2661-3832, Published by AccScience Publishing