ORIGINAL RESEARCH ARTICLE

Predicting mortality outcomes in individual COVID-19 patients using machine learning algorithms

Nikolaos Kourmpanis1*, Joseph Liaskos1, Emmanouil Zoulias1, John Mantas1
1 Laboratory of Health Informatics, Department of Public Health, Faculty of Nursing, National and Kapodistrian University of Athens, Athens, Greece
AIH 2024, 1(3), 31–52; https://doi.org/10.36922/aih.2591
Submitted: 30 December 2023 | Accepted: 9 May 2024 | Published: 22 July 2024
© 2024 by the Author(s). This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).
Abstract

Coronavirus disease 2019 (COVID-19), caused by the SARS-CoV-2 virus, emerged in late 2019 and has since spread worldwide, becoming a global pandemic that has resulted in almost seven million deaths to date. Artificial intelligence has played a crucial role in addressing this crisis, particularly through predictive models built with machine learning (ML) algorithms, which have been applied successfully to problems across many scientific fields. The purpose of this paper is to identify the model, or models, that most accurately predict a COVID-19 patient’s mortality outcome by comparing their performance metrics. The ML methods used for model development were logistic regression, decision trees, random forest, eXtreme gradient boosting (XGBoost), multi-layer perceptrons, and k-nearest neighbors. The models were compared on accuracy, precision, recall, F1 score, area under the receiver operating characteristic curve (AUC-ROC), and runtime. The data comprised the clinical characteristics and histories of 12,425,179 individuals who attended health facilities in Mexico. Following a comprehensive evaluation, the XGBoost model achieved the highest overall score across all metrics: 93.76% precision, 95.47% recall, 91.13% F1 score, 97.86% AUC-ROC, and a runtime of 6.67306 s. XGBoost was therefore determined to be the preferred method for predicting the mortality outcome of COVID-19 patients.
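As an illustration of the comparison metrics named above, the following is a minimal sketch in plain Python using their standard textbook definitions. The function names (`binary_metrics`, `auc_roc`) and the toy labels and scores are hypothetical and are not taken from the study; the authors' actual evaluation code, class encoding, and averaging scheme are not shown on this page, and runtime is omitted.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return BinaryMetrics(accuracy, precision, recall, f1)


def auc_roc(y_true, scores, positive=1):
    """AUC-ROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count one half). Pairwise and O(n*m): fine for a toy example only."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one case of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = died, 0 = survived.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.1, 0.3, 0.6, 0.2, 0.7]

metrics = binary_metrics(y_true, y_pred)
auc = auc_roc(y_true, y_score)
print(metrics, auc)
```

Precision, recall and F1 depend on a chosen decision threshold, whereas AUC-ROC is computed from the ranking of predicted scores and summarizes performance across all thresholds.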

Keywords
COVID-19
Pandemic
Machine learning
Classification algorithm
Funding
None.
Conflict of interest
The authors declare that they have no competing interests.
Artificial Intelligence in Health, Electronic ISSN: 3029-2387 Print ISSN: 3041-0894, Published by AccScience Publishing