Two-step machine learning to diagnose and predict involvement of lungs in COVID-19 and pneumonia using CT radiomics

Pegah Moradi Khaniabadi; Yassine Bouchareb; Humoud Al-Dhuhli; Isaac Shiri; Faiza Al-Kindi; Bita Moradi Khaniabadi; Habib Zaidi; Arman Rahmim

doi:10.1016/j.compbiomed.2022.106165

Pegah Moradi Khaniabadi^*, Yassine Bouchareb^*, Humoud Al-Dhuhli, Isaac Shiri, Faiza Al-Kindi, Bita Moradi Khaniabadi, Habib Zaidi, Arman Rahmim

^*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

21 Citations (Scopus)

Abstract

Objective: To develop a two-step machine learning (ML)-based model to diagnose and predict lung involvement
in COVID-19 and non-COVID-19 pneumonia patients using chest CT radiomic features.
Methods: Three hundred CT scans (3 classes: 100 COVID-19, 100 pneumonia, and 100 healthy subjects) were
included in this study. The diagnostic task was a 3-class classification. For the severity prediction task, lung
involvement in COVID-19 and pneumonia was graded as mild (0-25%), moderate (26-50%), or severe (>50%).
Whole lungs were segmented using deep learning-based segmentation. Altogether, 107 features, including shape,
first-order histogram, and second- and higher-order texture features, were extracted. A Pearson correlation
coefficient filter (PCC≥90%) followed by different feature selection algorithms was employed. Supervised ML
algorithms (Naïve Bayes, Support Vector Machine, Bagging, Random Forest, K-nearest neighbors, Decision Tree,
and Ensemble Meta voting) were used. The optimal model was selected based on precision, recall, and area under
the curve (AUC) by randomizing the training/validation split, followed by testing on the test set.
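The correlation filter and the severity grades described in the Methods can be sketched as follows. This is a minimal illustration in plain Python, not the authors' pipeline: the function names, the greedy keep-first pruning order, the handling of constant features, and the treatment of values between 25% and 26% are all assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant feature counts as uncorrelated
    return cov / sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.90):
    """Keep a feature only if |r| < threshold against every feature kept so far.

    `features` maps a (hypothetical) feature name to its per-scan values.
    The greedy, insertion-order strategy is an assumption of this sketch.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


def severity_grade(involvement_percent):
    """Map percentage lung involvement to mild / moderate / severe.

    Thresholds follow the abstract (0-25, 26-50, >50); values strictly
    between 25 and 26 fall into "moderate" here, which is an assumption.
    """
    if not 0 <= involvement_percent <= 100:
        raise ValueError("involvement must be between 0 and 100")
    if involvement_percent <= 25:
        return "mild"
    if involvement_percent <= 50:
        return "moderate"
    return "severe"
```

In this sketch, calling `drop_correlated` on a dictionary of feature columns returns the names that survive the filter, and those would then be passed to whichever feature selection algorithm is used next.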
Results: Nine pertinent features (2 shape, 1 first-order, and 6 second-order) were obtained after feature selection
for both phases. In the diagnostic task, 3-class classification using Random Forest achieved 0.909±0.026 precision,
0.907±0.056 recall, 0.902±0.044 F1-score, 0.939±0.031 accuracy, and 0.982±0.010 AUC. In the severity prediction
task, Random Forest achieved 0.868±0.123 precision, 0.865±0.121 recall, 0.853±0.139 F1-score, 0.934±0.024
accuracy, and 0.969±0.022 AUC.
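For readers unfamiliar with how precision, recall, F1-score, and accuracy are obtained in a 3-class setting, the sketch below computes them from a confusion matrix. The abstract does not state how per-class scores were averaged, so macro-averaging is an assumption here, and the function name is hypothetical.

```python
def macro_metrics(confusion):
    """Macro-averaged precision, recall, F1 and overall accuracy.

    confusion[i][j] counts samples of true class i predicted as class j.
    Macro-averaging (unweighted mean over classes) is an assumption.
    """
    n = len(confusion)
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    return {
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
        "accuracy": correct / total if total else 0.0,
    }
```

Running this over the test-set confusion matrix of each randomized split and then taking the mean and standard deviation of each metric would give figures in the mean±SD form reported above.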
Conclusion: The two-phase ML-based model accurately classified COVID-19 and pneumonia patients using CT
radiomics and adequately predicted the severity of lung involvement. This two-step model showed great potential
for assessing COVID-19 CT images towards improved patient management.

Original language	English
Article number	106165
Number of pages	1
Journal	Computers in Biology and Medicine
Volume	150
DOIs	https://doi.org/10.1016/j.compbiomed.2022.106165
Publication status	Published - Nov 1 2022
Externally published	Yes

Keywords

COVID-19
CT images
Diagnosis
Machine learning
Pneumonia
Prediction
Radiomics

ASJC Scopus subject areas

Health Informatics
Computer Science Applications

Access to Document

10.1016/j.compbiomed.2022.106165

Cite this

@article{c042d8551bdc4c39a76dfd5b2b6c781f,

title = "Two-step machine learning to diagnose and predict involvement of lungs in COVID-19 and pneumonia using CT radiomics",

abstract = "Objective: To develop a two-step machine learning (ML) based model to diagnose and predict involvement of lungs in COVID-19 and non COVID-19 pneumonia patients using CT chest radiomic features. Methods: Three hundred CT scans (3-classes: 100 COVID-19, 100 pneumonia, and 100 healthy subjects) were enrolled in this study. Diagnostic task included 3-class classification. Severity prediction score for COVID-19 and pneumonia was considered as mild (0-25%), moderate (26-50%), and severe (>50%). Whole lungs were segmented utilizing deep learning-based segmentation. Altogether, 107 features including shape, first-order histogram, second and high order texture features were extracted. Pearson correlation coefficient (PCC≥90%) followed by different features selection algorithms were employed. ML-based supervised algorithms (Na{\"i}ve Bayes, Support Vector Machine, Bagging, Random Forest, K-nearest neighbors, Decision Tree and Ensemble Meta voting) were utilized. The optimal model was selected based on precision, recall and area-under-curve (AUC) by randomizing the training/validation, followed by testing using the test set. Results: Nine pertinent features (2 shape, 1 first-order, and 6 second-order) were obtained after features selection for both phases. In diagnostic task, the performance of 3-class classification using Random Forest was 0.909±0.026, 0.907±0.056, 0.902±0.044, 0.939±0.031, and 0.982±0.010 for precision, recall, F1-score, accuracy, and AUC, respectively. The severity prediction task using Random Forest achieved 0.868±0.123 precision, 0.865±0.121 recall, 0.853±0.139 F1-score, 0.934±0.024 accuracy, and 0.969±0.022 AUC. Conclusion: The two-phase ML-based model accurately classified COVID-19 and pneumonia patients using CT radiomics, and adequately predicted severity of lungs involvement. This 2-steps model showed great potential in assessing COVID-19 CT images towards improved management of patients.",

keywords = "COVID-19, CT images, Diagnosis, Machine learning, Pneumonia, Prediction, Radiomics",

author = "{Moradi Khaniabadi}, Pegah and Yassine Bouchareb and Humoud Al-Dhuhli and Isaac Shiri and Faiza Al-Kindi and {Moradi Khaniabadi}, Bita and Habib Zaidi and Arman Rahmim",

note = "Funding Information: This work was supported by the Omani Research Council, Oman Grant, grant number RC/COVID-MED/RADI/20/01 . Copyright {\textcopyright} 2022 Elsevier Ltd. All rights reserved. DBLP License: DBLP's bibliographic metadata records provided through http://dblp.org/ are distributed under a Creative Commons CC0 1.0 Universal Public Domain Dedication. Although the bibliographic metadata records are provided consistent with CC0 1.0 Dedication, the content described by the metadata records is not. Content may be subject to copyright, rights of privacy, rights of publicity and other restrictions.",

year = "2022",

month = nov,

day = "1",

doi = "10.1016/j.compbiomed.2022.106165",

language = "English",

volume = "150",

journal = "Computers in Biology and Medicine",

issn = "0010-4825",

publisher = "Elsevier Limited",

}

TY - JOUR

T1 - Two-step machine learning to diagnose and predict involvement of lungs in COVID-19 and pneumonia using CT radiomics

AU - Moradi Khaniabadi, Pegah

AU - Bouchareb, Yassine

AU - Al-Dhuhli, Humoud

AU - Shiri, Isaac

AU - Al-Kindi, Faiza

AU - Moradi Khaniabadi, Bita

AU - Zaidi, Habib

AU - Rahmim, Arman

N1 - Funding Information: This work was supported by the Omani Research Council, Oman Grant, grant number RC/COVID-MED/RADI/20/01 . Copyright © 2022 Elsevier Ltd. All rights reserved. DBLP License: DBLP's bibliographic metadata records provided through http://dblp.org/ are distributed under a Creative Commons CC0 1.0 Universal Public Domain Dedication. Although the bibliographic metadata records are provided consistent with CC0 1.0 Dedication, the content described by the metadata records is not. Content may be subject to copyright, rights of privacy, rights of publicity and other restrictions.

PY - 2022/11/1

Y1 - 2022/11/1

N2 - Objective: To develop a two-step machine learning (ML) based model to diagnose and predict involvement of lungs in COVID-19 and non COVID-19 pneumonia patients using CT chest radiomic features. Methods: Three hundred CT scans (3-classes: 100 COVID-19, 100 pneumonia, and 100 healthy subjects) were enrolled in this study. Diagnostic task included 3-class classification. Severity prediction score for COVID-19 and pneumonia was considered as mild (0-25%), moderate (26-50%), and severe (>50%). Whole lungs were segmented utilizing deep learning-based segmentation. Altogether, 107 features including shape, first-order histogram, second and high order texture features were extracted. Pearson correlation coefficient (PCC≥90%) followed by different features selection algorithms were employed. ML-based supervised algorithms (Naïve Bayes, Support Vector Machine, Bagging, Random Forest, K-nearest neighbors, Decision Tree and Ensemble Meta voting) were utilized. The optimal model was selected based on precision, recall and area-under-curve (AUC) by randomizing the training/validation, followed by testing using the test set. Results: Nine pertinent features (2 shape, 1 first-order, and 6 second-order) were obtained after features selection for both phases. In diagnostic task, the performance of 3-class classification using Random Forest was 0.909±0.026, 0.907±0.056, 0.902±0.044, 0.939±0.031, and 0.982±0.010 for precision, recall, F1-score, accuracy, and AUC, respectively. The severity prediction task using Random Forest achieved 0.868±0.123 precision, 0.865±0.121 recall, 0.853±0.139 F1-score, 0.934±0.024 accuracy, and 0.969±0.022 AUC. Conclusion: The two-phase ML-based model accurately classified COVID-19 and pneumonia patients using CT radiomics, and adequately predicted severity of lungs involvement. This 2-steps model showed great potential in assessing COVID-19 CT images towards improved management of patients.

AB - Objective: To develop a two-step machine learning (ML) based model to diagnose and predict involvement of lungs in COVID-19 and non COVID-19 pneumonia patients using CT chest radiomic features. Methods: Three hundred CT scans (3-classes: 100 COVID-19, 100 pneumonia, and 100 healthy subjects) were enrolled in this study. Diagnostic task included 3-class classification. Severity prediction score for COVID-19 and pneumonia was considered as mild (0-25%), moderate (26-50%), and severe (>50%). Whole lungs were segmented utilizing deep learning-based segmentation. Altogether, 107 features including shape, first-order histogram, second and high order texture features were extracted. Pearson correlation coefficient (PCC≥90%) followed by different features selection algorithms were employed. ML-based supervised algorithms (Naïve Bayes, Support Vector Machine, Bagging, Random Forest, K-nearest neighbors, Decision Tree and Ensemble Meta voting) were utilized. The optimal model was selected based on precision, recall and area-under-curve (AUC) by randomizing the training/validation, followed by testing using the test set. Results: Nine pertinent features (2 shape, 1 first-order, and 6 second-order) were obtained after features selection for both phases. In diagnostic task, the performance of 3-class classification using Random Forest was 0.909±0.026, 0.907±0.056, 0.902±0.044, 0.939±0.031, and 0.982±0.010 for precision, recall, F1-score, accuracy, and AUC, respectively. The severity prediction task using Random Forest achieved 0.868±0.123 precision, 0.865±0.121 recall, 0.853±0.139 F1-score, 0.934±0.024 accuracy, and 0.969±0.022 AUC. Conclusion: The two-phase ML-based model accurately classified COVID-19 and pneumonia patients using CT radiomics, and adequately predicted severity of lungs involvement. This 2-steps model showed great potential in assessing COVID-19 CT images towards improved management of patients.

KW - COVID-19

KW - CT images

KW - Diagnosis

KW - Machine learning

KW - Pneumonia

KW - Prediction

KW - Radiomics

UR - http://www.scopus.com/inward/record.url?scp=85139333741&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85139333741&partnerID=8YFLogxK

U2 - 10.1016/j.compbiomed.2022.106165

DO - 10.1016/j.compbiomed.2022.106165

M3 - Article

C2 - 36215849

AN - SCOPUS:85139333741

SN - 0010-4825

VL - 150

JO - Computers in Biology and Medicine

JF - Computers in Biology and Medicine

M1 - 106165

ER -
