Comparative study of transformer-based models for cardiovascular disease risk stratification with tabular biomarker data
Cardiovascular disease (CVD) remains one of the leading causes of global mortality and disability. Advances in computational modeling and artificial intelligence have enhanced CVD risk prediction by integrating multivariate clinical and biochemical features. Recent developments in deep learning, especially transformer-based models for tabular data, have demonstrated superior capabilities in capturing nonlinear and high-dimensional biomarker interactions. This study proposes a predictive framework that uses statistical and clinical biomarker data to assess CVD risk. Traditional machine learning models (logistic regression, Gaussian Naïve Bayes, linear discriminant analysis, AdaBoost, and XGBoost) were compared with deep learning models (gated recurrent unit [GRU] and long short-term memory [LSTM]) and with three transformer-based models: self-attention and intersample attention transformer (SAINT), feature tokenizer (FT), and tab transformer. Experiments were conducted to evaluate the impact of data augmentation, analyze learning behavior through loss and accuracy curves, and assess a fusion approach combining the tab transformer with recurrent networks. Model performance was evaluated using accuracy and receiver operating characteristic (ROC) analysis. Transformer-based models consistently outperformed the conventional machine learning and deep learning methods. SAINT and FT achieved areas under the curve (AUC) of 0.8875 and 0.9489, respectively, and the tab transformer demonstrated the highest performance with an AUC of 0.9728. Fusing the tab transformer with GRU and LSTM further enhanced predictive performance, improving representation learning and generalization for CVD risk prediction. The proposed transformer-based framework offers a robust, scalable, and interpretable solution for accurate CVD risk assessment. Its predictive capability highlights the potential for integration into clinical decision-support systems for early diagnosis and patient management.
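The evaluation protocol described above (several classifiers scored by accuracy and ROC AUC on a held-out split) can be illustrated with a minimal sketch. It is not the study's pipeline. The data are synthetic, generated with scikit-learn's `make_classification`. The sample size, feature count, split ratio, and hyperparameters are assumptions chosen for illustration. Only four of the classical baselines are included, so that the example depends on scikit-learn alone.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for a tabular biomarker dataset (assumption: NOT the study's data).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# A subset of the classical baselines named in the abstract, with default-like settings.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "gaussian_nb": GaussianNB(),
    "lda": LinearDiscriminantAnalysis(),
    "adaboost": AdaBoostClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # ROC AUC is computed from the predicted probability of the positive class.
    positive_proba = model.predict_proba(X_test)[:, 1]
    results[name] = {
        "accuracy": accuracy_score(y_test, model.predict(X_test)),
        "auc": roc_auc_score(y_test, positive_proba),
    }

for name, metrics in results.items():
    print(f"{name:20s} accuracy={metrics['accuracy']:.3f} auc={metrics['auc']:.3f}")
```

The same loop would extend to any estimator exposing `fit`, `predict`, and `predict_proba`. The numbers it prints describe the synthetic data only and are not comparable to the AUC values reported in the abstract.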

