TY - JOUR
T1 - Identifying mortality risk factors amongst acute coronary syndrome patients admitted to Arabian Gulf hospitals using machine-learning methods
AU - Raza, Syed Asif
AU - Thalib, Lukman
AU - Al Suwaidi, Jassim
AU - Sulaiman, Kadhim
AU - Almahmeed, Wael
AU - Amin, Haitham
AU - AlHabib, Khalid F.
N1 - Publisher Copyright: © 2019 John Wiley & Sons, Ltd
PY - 2019
Y1 - 2019
N2 - Acute coronary syndrome (ACS) is a leading cause of mortality and morbidity in the Arabian Gulf. In this study, the in-hospital mortality amongst patients admitted with ACS to Arabian Gulf hospitals is predicted using a comprehensive modelling framework that combines powerful machine-learning methods such as support-vector machine (SVM), Naïve Bayes (NB), artificial neural networks (NN), and decision trees (DT). The performance of the machine-learning methods is compared with that of a commonly used statistical method, namely, logistic regression (LR). The study follows the current practice of computing mortality risk using risk scores such as the Global Registry of Acute Coronary Events (GRACE) score, which has not been validated for Arabian Gulf patients. Cardiac registry data of 7,000 patients from 65 hospitals located in Arabian Gulf countries are used for the study. This study is unique as it uses a contemporary data analytics framework. A k-fold (k = 10) cross-validation is utilized to generate training and validation samples from the GRACE dataset. Machine-learning-based predictive models often become biased when the training data are imbalanced. To mitigate the data imbalance due to scarce observations for in-hospital mortalities, we have utilized specialized methods such as random undersampling (RUS) and synthetic minority oversampling technique (SMOTE). A detailed simulation experiment is carried out to build models with each of the five predictive methods (LR, NN, NB, SVM, and DT) for each of the three sets of k-fold subsamples generated. The predictive models are developed under three schemes of the k-fold samples that include no data imbalance, RUS, and SMOTE.
We have implemented an information fusion method rooted in computing weighted impact scores, obtained for individual medical history attributes from each of the simulated predictive models, to produce a collective recommendation based on an impact score specific to each predictor. Finally, we grouped the predictors using the fuzzy c-means clustering method into three categories: high-, medium-, and low-risk factors for in-hospital mortality due to ACS. Our study revealed that patients with a medical history related to the presence of peripheral artery disease, congestive heart failure, cardiovascular transient ischemic attack, valvular disease, and coronary artery bypass grafting, amongst others, have the highest risk for in-hospital mortality.
AB - Acute coronary syndrome (ACS) is a leading cause of mortality and morbidity in the Arabian Gulf. In this study, the in-hospital mortality amongst patients admitted with ACS to Arabian Gulf hospitals is predicted using a comprehensive modelling framework that combines powerful machine-learning methods such as support-vector machine (SVM), Naïve Bayes (NB), artificial neural networks (NN), and decision trees (DT). The performance of the machine-learning methods is compared with that of a commonly used statistical method, namely, logistic regression (LR). The study follows the current practice of computing mortality risk using risk scores such as the Global Registry of Acute Coronary Events (GRACE) score, which has not been validated for Arabian Gulf patients. Cardiac registry data of 7,000 patients from 65 hospitals located in Arabian Gulf countries are used for the study. This study is unique as it uses a contemporary data analytics framework. A k-fold (k = 10) cross-validation is utilized to generate training and validation samples from the GRACE dataset. Machine-learning-based predictive models often become biased when the training data are imbalanced. To mitigate the data imbalance due to scarce observations for in-hospital mortalities, we have utilized specialized methods such as random undersampling (RUS) and synthetic minority oversampling technique (SMOTE). A detailed simulation experiment is carried out to build models with each of the five predictive methods (LR, NN, NB, SVM, and DT) for each of the three sets of k-fold subsamples generated. The predictive models are developed under three schemes of the k-fold samples that include no data imbalance, RUS, and SMOTE.
We have implemented an information fusion method rooted in computing weighted impact scores, obtained for individual medical history attributes from each of the simulated predictive models, to produce a collective recommendation based on an impact score specific to each predictor. Finally, we grouped the predictors using the fuzzy c-means clustering method into three categories: high-, medium-, and low-risk factors for in-hospital mortality due to ACS. Our study revealed that patients with a medical history related to the presence of peripheral artery disease, congestive heart failure, cardiovascular transient ischemic attack, valvular disease, and coronary artery bypass grafting, amongst others, have the highest risk for in-hospital mortality.
KW - Global Registry of Acute Coronary Events (GRACE) risk score
KW - Naïve Bayes
KW - acute coronary syndrome (ACS)
KW - decision tree
KW - fuzzy c-mean clustering
KW - imbalance data
KW - information fusion
KW - logistic regression
KW - machine learning
KW - mortality
KW - neural networks
KW - predictive analytics
KW - support-vector machine
UR - http://www.scopus.com/inward/record.url?scp=85065470281&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85065470281&partnerID=8YFLogxK
U2 - 10.1111/exsy.12413
DO - 10.1111/exsy.12413
M3 - Article
AN - SCOPUS:85065470281
SN - 0266-4720
VL - 36
JO - Expert Systems
JF - Expert Systems
IS - 4
M1 - e12413
ER -