Usefulness of machine learning in COVID-19 for the detection and prognosis of cardiovascular complications

Since January 2020, coronavirus disease 2019 (COVID-19) has rapidly become a global concern, and its cardiovascular manifestations have highlighted the need for fast, sensitive and specific tools for early identification and risk stratification. Machine learning is a software solution with the ability to analyze large amounts of data and make predictions without prior programming. When faced with new problems with unique challenges as evident in the COVID-19 pandemic, machine learning can offer solutions that are not apparent on the surface by sifting quickly through massive quantities of data and making associations that may have been missed. Artificial intelligence is a broad term that encompasses different tools, including various types of machine learning and deep learning. Here, we review several cardiovascular applications of machine learning and artificial intelligence and their potential applications to cardiovascular diagnosis


Introduction
As the coronavirus disease 2019 (COVID-19) pandemic quickly spread across the globe in the winter and spring of 2020, the cardiovascular (CV) manifestations induced by this viral infection became an important facet of this multisystem disease. Several publications have noted myocarditis, cardiomyopathy, systolic and diastolic dysfunction, pump failure, pericardial involvement, and cardiac arrhythmias including sudden cardiac death and pulseless electrical activity as possible cardiac complications in patients with COVID-19 (Bin et al., 2016; Ruan et al., 2020) (Table 1). Historically, it is well known that a higher incidence of cardiac events is also associated with seasonal influenza and other respiratory viral outbreaks, suggesting that acute viral infections can elicit similar cardiac effects via proinflammatory pathways and prothrombotic changes, although the intensity and extent of both seem to be higher in COVID-19 (Nguyen et al., 2016). In a study of CV complications of severe acute respiratory syndrome (SARS) in 121 patients, 50.4% developed hypertension in the hospital. Of these patients, 71.9% developed persistent tachycardia, with 40% exhibiting continued tachycardia at outpatient follow-up. Although tachycardic CV complications were common in SARS, they were usually self-limiting and not associated with increased mortality. There is a case report of Middle East respiratory syndrome-related coronavirus (MERS-CoV) acute myocarditis and acute onset heart failure, manifesting as acute myocardial edema and injury of the apical and lateral walls of the left ventricle, but this was an isolated case (Alhogbani, 2016).
Most individuals with COVID-19 present with fever, cough, respiratory symptoms, headache, diarrhea and loss of smell/taste. Nonetheless, accurately distinguishing COVID-19 from other infectious and noninfectious etiologies can be difficult at presentation. Here, diagnostic algorithms based on the constellation of symptoms and signs and on initial testing that incorporates biomarkers, serology, PCR, and chest imaging may increase diagnostic accuracy so that patients can be diagnosed correctly and triaged expeditiously. This would help unburden busy emergency rooms and reduce pressure on healthcare providers. Artificial intelligence (AI) and machine learning (ML) can potentially be of significant utility in this field. Chest computed tomography (CT) is widely used for the management of COVID-19 pneumonia because of its wide availability and short scan time. AI techniques utilizing CT imaging data have been applied to screen for and assess the severity of COVID-19 pneumonia as well as to distinguish it from other forms of pneumonia. For example, a deep learning (DL) based method built on analysis of thoracic CT images led to automated detection and monitoring of COVID-19 patients over time (Gozes et al., 2020). Another DL method was able to detect COVID-19 pneumonia and distinguish it from community acquired pneumonia using chest CT. When it comes to CV involvement, just as with pulmonary involvement, there is considerable clinical variability between individuals with similar baseline clinical risk features, and even between individuals within the same family. This may have to do with host factors, genetics, immune responsiveness, viral load and other yet unidentified disease modifiers. Currently, there are no predictive models that allow physicians to quantify CV risk in patients with COVID-19.
As the experience with the COVID-19 pandemic grows, the treatment pathway for cardiac presentations will likely evolve. More advanced approaches involving data collection that can be used to train diagnostic algorithms via ML and AI technologies are currently underway as healthcare professionals struggle with practical difficulties of working in high stress and

Overview of AI and ML tools
AI is a field of computer science that focuses on teaching computers to learn complex tasks and make predictions. ML is a subset of AI that can learn from data, identify patterns, and aid in decision making with minimal human intervention. DL is a form of ML that typically utilizes multi-layered neural networks to learn complex hierarchical representations of data that constitute multiple levels of abstraction (Johnson et al., 2018). ML strategies can broadly be split into supervised and unsupervised learning. In supervised learning, labelled data are available for the training process and serve as the ground truth; the task typically involves classifying an observation into one or more categories or outcomes. In unsupervised learning, the software analyzes large amounts of unlabeled data to identify hidden patterns or structure within the data, which greatly increases the amount of data that can be analyzed (Johnson et al., 2018). Supervised ML has found the widest applicability to date. ML algorithms frequently employed in practice include linear and logistic regression, support vector machines (SVM), tree-based methods, and artificial neural networks (ANN), including DL (Table 2) (Al'Aref et al., 2019). Datasets in ML projects are typically partitioned into training, validation, and test subsets. The training set is used for the development of the model, while the validation set is used to estimate overall model performance and fine tune its parameters. Thereafter, repeated training and cross validation are performed to reduce variance in the performance estimate. Finally, an external test set can be used to assess the generalizability of the optimized model. Ultimately, the challenge is to perform all these steps efficiently while avoiding over-fitting, which occurs when a model learns the detail and noise in the training data so closely that it fails to generalize to new data.
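The partitioning and cross validation steps described above can be illustrated with a minimal sketch. The function names (`partition`, `k_fold`), the 70/15/15 split, and the choice of five folds are assumptions made for illustration only; they are not taken from any of the studies reviewed here.

```python
import random


def partition(indices, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle sample indices and split them into training, validation,
    and test subsets. The fractions are illustrative assumptions."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains is held out for testing
    return train, val, test


def k_fold(indices, k=5):
    """Yield (train, held_out) pairs so that every sample is held out
    exactly once across the k folds."""
    items = list(indices)
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        held_out = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, held_out


# Hypothetical dataset of 100 samples, referred to by index.
train_idx, val_idx, test_idx = partition(range(100))
for fold_train, fold_held_out in k_fold(train_idx, k=5):
    # A model would be fitted on fold_train and scored on fold_held_out here.
    pass
```

Keeping the test subset untouched until the model has been tuned is what allows it to serve as an estimate of generalizability rather than of fit to the training data.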

Potential cardiovascular applications
AI has great potential to assist in pandemic modeling and in the diagnosis of COVID-19 clinical manifestations. AI is particularly suited for analyzing massive amounts of imaging data, which has been utilized for making the diagnosis of COVID-19 pneumonitis, but is also potentially suitable for cardiac applications (Neri et al., 2020). Currently, there are a few ongoing clinical trials investigating COVID-19 and its effects on the CV system using AI (Table 3). An overview of diagnostic testing in patients with COVID-19 and CV involvement is provided in Table 4.

Echocardiography
There are several features of COVID-19 apparent on echocardiography that are mainly related to the severity of the disease and its CV complications. In the early stages following the systemic inflammatory response, there can be evidence of hyperdynamic cardiac function manifested by an increase in cardiac output and left ventricular ejection fraction with or without a decrease in systemic vascular resistance. Other abnormal findings include acute stress-induced cardiomyopathy (Takotsubo) characterized by apical ballooning and left ventricular segmental contraction abnormalities, right ventricular enlargement with associated pulmonary hypertension, and diffuse myocardial inhibition (usually in the later stages resulting from severe hypoxia).
Echocardiography is a useful tool for quickly identifying a patient's hemodynamic status, differentiating between different etiologies of shock, and monitoring. It is especially informative and uniquely feasible in critically ill patients, many of whom are proned, on multiple intravenous drips, and difficult to transport to CT imaging suites. With the data that can be obtained from clinical echocardiography, there is ample opportunity for the development of echocardiography-based AI platforms. Innovations using AI have the potential to standardize practice, optimize accuracy, and ultimately improve clinical workflow. ML offers the potential to improve the accuracy and reliability of echocardiography by combining clinician interpretation with information derived from ML algorithms and can provide additional predictive information that may be too subtle to be detected by the human eye. High volume data generated from cardiac imaging can be integrated in a multiparametric approach for pattern recognition and imaging data-based disease phenotype characterization (Alsharqi et al., 2018).
The increase in accuracy, combined with the time-saving benefits, demonstrates the value of incorporating ML into the field of echocardiography during this pandemic as well as in general. Currently, there are two studies underway in the United States attempting to use AI techniques to generate predictive models that can forecast COVID-19 related CV complications using echocardiography. A multi-site study led by the Mayo Clinic is attempting to pioneer an AI platform with EchoGO Core AI software that will be able to automatically analyze images obtained via echocardiography in order to map how the virus attacks the heart (DAIC, 2020). This study will include 500 COVID-19 positive men and women between ages 18 and 89. The objective is the assessment of automated cardiac measurements, ejection fraction, and global longitudinal strain to classify cardiac outcomes in COVID-19 patients.
Another study at Johns Hopkins University is currently collecting data from 300 confirmed COVID-19 patients including cardiac specific laboratory markers, continuously obtained vital signs, and imaging data from echocardiography and CT scans to train an ML algorithm (Stempniak, 2020). The goal of the study is to develop a predictive risk score that will be able to predict cardiac events up to 24 hours ahead of time and perform baseline risk stratification on new patients. Given the vast amount and complexity of data associated with echocardiography, AI platforms will need to include many studies that encompass a wide array of clinical characteristics, pathologic features, image quality, and ultrasound vendors.

Algorithm: Description
Support vector machine (SVM): An SVM classifier is constructed by projecting data into a higher dimensional space using kernel functions and devising a boundary within the new space, which can be used for classification, regression, or other tasks such as outlier detection. A disadvantage of SVM is that the classification is dichotomous, and no probability of class membership is given.
k-Nearest neighbor (KNN): In KNN, each object is classified according to its k nearest neighbors, where k is the number of neighbors included in the estimate of class membership.
Decision trees: This algorithm repeatedly splits the data set according to a criterion that maximizes the separation of the data, a process known as recursive partitioning, resulting in a tree-like structure.
Artificial neural networks (ANNs): These networks consist of nodes called neurons arranged in a network layout with different layers (input, hidden, and output) that are connected to each other by weighted edges. A neuron in a hidden layer is activated when an input neuron passes a large enough value to trigger the next neuron. Activated neurons continue to pass value to the next layer of neurons until an output is generated upon reaching the final layer.
Regularized regression: Allows for the introduction of constraints when the number of variables in a system exceeds the number of observations and improves the generalizability of the model. Common forms of regularized regression include LASSO (least absolute shrinkage and selection operator) regression, ridge regression, and elastic net regression.
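To make the k-nearest neighbor description more concrete, the following is a minimal sketch of classification by majority vote among the k closest training points. The feature values, the "low"/"high" labels, and the function name `knn_predict` are hypothetical and are not drawn from any echocardiographic dataset; Euclidean distance is assumed as the similarity measure.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training
    points, using Euclidean distance (an assumption of this sketch)."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature observations with made-up risk labels.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
labels = ["low", "low", "low", "high", "high", "high"]

prediction = knn_predict(points, labels, query=(1.1, 1.0), k=3)
```

The choice of k trades off sensitivity to noise (small k) against blurring of class boundaries (large k), which is one reason it is usually tuned on a validation set.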

Genetics
One of the main aims within the field of genomics is to characterize gene function and identify connections between genotypes and phenotypes, which is crucial for the development of prediction models and of precision medicine. DL can be used to perform large-scale genomic association studies quickly and accurately (Ho et al., 2019).
Human angiotensin-converting enzyme 2 (ACE2) is an entry receptor for the SARS-CoV-2 spike glycoprotein (P. Zhou et al., 2020). Fang et al. hypothesized that ACE2 stimulating drugs could worsen clinical outcomes in COVID-19 infections by increasing the expression of ACE2. A recent single cell atlas of the human heart showed that pericytes exhibit high levels of ACE2, which implies that the local inflammation generated during COVID-19 infection of the pericytes may lead to microvascular dysfunction and, in turn, to myocardial infarction with nonobstructive coronary arteries. ACE2 genetic polymorphism has been shown to affect the binding ability of the virus, which could suggest a possible genetic predisposition to COVID-19 infection. Therefore, it is possible that ML approaches may be able to identify reliable features for risk assessment and classification by using biochemical and clinical data (e.g. ACE2 presence and expression level). ML could also be used to analyze genetic variants from patients with mild to severe COVID-19 infections to risk stratify patients based on their vulnerability or resistance to infection.
Interleukin-6 (IL-6) is another possible molecular target contributing to CV morbidity and mortality in COVID-19 patients. There is experimental evidence supporting its role in atherosclerosis as well as cardiac fibrosis and failure (Brauner et al., 2018; Kusters et al., 2018; van der Heijden et al., 2018; Wang et al., 2019). There is also evidence demonstrating a possible role in plaque formation and in the development and progression of abdominal aortic aneurysm (Huber et al., 1999; Nishihara et al., 2017). Additionally, genetic variants with increased circulating levels of IL-6 receptors (and therefore reduced levels of IL-6) have been shown to protect against coronary artery disease (CAD) (Sarwar et al., 2012). Other cytokine moieties involved in COVID-19 related cytokine release syndrome may also contribute to myocardial tissue damage and are potential targets for large scale data analysis. These include pathways mediated by granulocyte-macrophage colony-stimulating factor (GM-CSF), tumor necrosis factor α (TNF-α), interleukin-17 (IL-17), interleukin-18 (IL-18), and interferon γ (IFN-γ) (Guzik et al., 2020).

Table 3 (excerpt)
Study: Twerenbold and Pfister, 2020
Enrollment: 1500
Objectives: i) perform extensive clinical and biomarker phenotyping in COVID-19 suspects presenting to the emergency department (ED) and in COVID-19 patients with subsequent ICU admission, ii) compare clinical and biomarker profiles of COVID-19 patients with a control group, iii) derive and validate personalized risk prediction models for early clinical decision support, and iv) explore pathophysiological mechanisms including inflammatory and CV pathways.
Location: Basel, Switzerland
Status: Recruiting

Risk assessment
Severe COVID-19 is characterized by rapidly progressive systemic inflammation, sepsis, multiorgan failure, and death. There is often a delay between onset of symptoms and CV manifestations, which presents a challenge when it comes to characterizing the risk for myocardial damage and CV complications in patients with COVID-19. This requires a broad evaluation of numerous variables in order to identify patients at risk who would not be identified with standard statistical analysis. ML is a robust tool that can incorporate nontraditional and unknown risk factors that may be used in CV risk stratification. A few studies have demonstrated a potential role for ML in this regard. An important feature of ML is that it can aid cardiologists in making accurate predictions regarding CV risk in different settings. Kwon et al. (2018) developed a DL algorithm that was able to detect in-hospital cardiac arrest and death without attempted resuscitation. It performed better than standard methods, with higher sensitivity and lower false alarm rates. Another study by Alaa et al. (2019) created an ML automated tool based on a dataset of more than 400,000 people with over 450 different variables to risk stratify patients without a history of CV disease. This tool improved CV risk prediction compared to the standard Framingham risk score and discovered new CV risk factors and interactions between different features of an individual. Ongoing efforts have sought to develop novel diagnostic tools using ML algorithms that can identify patients infected with COVID-19. For example, ML based screening of COVID-19 assay designs using a CRISPR-based system for detecting the virus showed a higher sensitivity and speed (Metsky et al., 2020). Currently, there are no approaches available to predict CV outcomes in COVID-19 patients.
There is a prospective cohort study in process that is examining data to develop a predictive model in northern Italy, which is one of the regions in the world most affected by COVID-19 (Parati, 2020). This study is enrolling 5500 COVID-19 patients and will be using ML techniques to develop multivariate models for risk stratification. They will be collecting bio-humoral and imaging data to explore the prognostic and pathophysiological role of immunologic factors, activation of coagulation, endothelial dysfunction, inflammatory response, genetic, hormonal, and metabolic factors and acute cardiac damage. The study is expected to complete in September 2020.

Electrophysiology
The contribution of COVID-19 to the development of arrhythmias is well described but specific mechanisms are still being elucidated. However, most patients with COVID-19 or in whom COVID-19 is suspected will have a baseline electrocardiogram (ECG) performed at the time of entry into the healthcare system, particularly those with severe disease or those in whom QT prolonging medications may be used (Gandhi et al., 2020). AI-enhanced, cost-effectively acquired ECG may allow for opportunities to screen for disorders not typically associated with ECG (Lopez-Jimenez et al., 2020). One of the major aims for AI technologies applied to electrophysiologic data is to improve population health via interpretable insight. This is based on the idea that ECGs often contain subtleties that may not always be readily interpretable by humans (Attia et al., 2016). As an example, one study reported that an AI enabled ECG alone was able to identify low EF with a high degree of accuracy (Attia et al., 2019). It is possible that other conditions may be similarly identified by AI enabled ECGs, which would allow for improved risk stratification at a low cost. Additionally, with the use of consumer smartphone or smartwatch enabled ECG devices, we may be able to offer more cost-effective screening at a general population level, especially for individuals infected with SARS-CoV-2 but with milder disease severity not requiring hospitalization and inpatient telemetry monitoring. However, the expertise in reading ECGs is limited to healthcare settings. Preliminary data suggest that AI techniques may improve interpretation of these strips and facilitate appropriate triage of those who need to be seen by a physician (Hagiwara et al., 2018; Tison et al., 2018).
There is a prospective cohort trial currently taking place in France that is evaluating a new method for remote monitoring of corrected QT measurements using an AI-based solution and ECG data collected via smartwatches (AI-QTc) in patients being treated with hydroxychloroquine (Assistance Publique Hopitaux De Marseille, 2020). AI-QTc data will be compared with the standard manual QTc reviewed by a cardiologist.
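As a rough illustration of what such a comparison involves, the sketch below corrects a QT interval for heart rate and then summarizes agreement between two sets of paired QTc readings. Both choices are assumptions made for illustration: Bazett's formula (QTc = QT / sqrt(RR)) is one commonly used correction, and the mean difference with 95% limits of agreement is one common way to compare two measurement methods. The trial's actual correction formula and statistical analysis are not described here, and the readings shown are invented.

```python
import math
import statistics


def bazett_qtc(qt_ms, rr_s):
    """Bazett's correction: QTc = QT / sqrt(RR), with QT in milliseconds
    and the RR interval in seconds."""
    return qt_ms / math.sqrt(rr_s)


def agreement(method_a, method_b):
    """Return the mean difference (bias) and the approximate 95% limits
    of agreement between paired readings from two methods."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    spread = 1.96 * statistics.stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical paired QTc readings in milliseconds (not trial data).
automated_qtc = [450, 460, 440, 470]
manual_qtc = [448, 462, 441, 465]

bias, lower, upper = agreement(automated_qtc, manual_qtc)
```

A small bias with narrow limits of agreement would suggest that the automated measurement could stand in for the manual one; wide limits would argue for continued cardiologist review.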

Advanced imaging
There are studies examining the role of AI in detecting COVID-19 on chest CT, but there has been only one study to date looking at advanced cardiac imaging in patients with COVID-19 (L. Li et al., 2020; Puntmann et al., 2020). This study was a prospective cohort of unselected patients with recent COVID-19 infection identified from a local testing center who underwent voluntary evaluation for cardiac involvement with cardiac MRI (CMR). The investigators demonstrated cardiac involvement in 78% of patients and ongoing myocardial inflammation in 60% with recent COVID-19 disease, independent of pre-existing comorbid conditions, severity of acute illness, and time since original diagnosis. There was no significant trend toward reduction of imaging or serologic findings during the recovery period. These findings indicate a considerable inflammatory disease burden and urgently require confirmation in a larger cohort.
DL methods have shown promising performance for automated calculation of left ventricular function. One study used a convolutional neural network on a dataset of 596 CMR examinations acquired in different institutions and on scanners from different vendors to create a tool that outperformed manual segmentation, and its accuracy increased with the heterogeneity of the patients that were included (Tao et al., 2019). ML can also be used for automated segmentation of the heart, which matters because segmentation of the left ventricle from the epicardium and endocardium is central to the assessment of CV function. A study by Ngo et al. (2017) used ML for automated segmentation of the heart with a dataset of 45 cardiac cine magnetic resonance studies covering ischemic and nonischemic cardiomyopathy, left ventricular hypertrophy, and normal LV function. The accuracy was comparable to standard methods.
Evaluation of myocardial perfusion can be performed using single-photon emission computed tomography (SPECT). ML has been used in the past to improve diagnostic performance and combine imaging modalities to maximize discriminatory capability. In one instance, an SVM approach was utilized to localize the mitral valve during SPECT acquisition, which is important for the assessment of myocardial perfusion. A total of 392 SPECT scans were used for training and validation, and the ML model exhibited an AUC (area under the curve) of 0.82 for the detection of obstructive CAD in comparison to two experts (AUC of 0.79 and 0.81) and unadjusted mitral valve plane (AUC 0.63) (Betancur et al., 2017). Based on their study results, the authors suggested that AI could lead to integration of clinical and imaging data and create personalized major adverse cardiac event (MACE) risk calculations in patients undergoing SPECT myocardial perfusion imaging (MPI). Another large multi-center study that included 1,638 patients applied DL directly to nuclear cardiology polar map images (Betancur et al., 2018). In this study, the authors demonstrated that DL could outperform the clinical standard (total perfusion deficit) for the identification of obstructive CAD. An AI-driven algorithm has been incorporated into an FDA approved nuclear imaging software that uses a clinical decision support (CDS) tool and natural language for automated report generation. The system includes over 230 rules of perfusion, reversibility, function, and additional information such as prone vs. supine positioning (Garcia et al., 2018). No significant difference was found between the AI driven structured report and the experts' impressions of CAD or ischemia. ML has also been used to assess coronary artery dysfunction using positron emission tomography (PET) in combination with coronary computed tomography angiography (CCTA) with an algorithm that combined quantitative stenosis, plaque burden, and myocardial mass (Dey et al., 2015). This algorithm generated a risk score to predict impaired myocardial flow reserve quantified by PET and showed that the ability to discriminate coronary dysfunction was superior with the ML derived risk score, with an AUC of 0.83 compared to an AUC of 0.66 with quantitative stenosis.
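Because the comparisons above are reported as AUC values, it may help to recall what the statistic measures: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The following minimal sketch computes it directly from that definition. The scores and labels are invented for illustration, and the function name `auc` is hypothetical; none of the cited studies' data or code is reproduced here.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    positive/negative pairs in which the positive case has the higher
    score. Tied scores count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical risk scores; label 1 marks cases with the outcome of interest.
scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2]
labels = [1, 1, 1, 0, 0, 0]

value = auc(scores, labels)  # 7 of the 9 positive/negative pairs are ranked correctly
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation, which is the scale on which values such as 0.83 versus 0.66 should be read.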

Biohumoral exams
CV comorbidity seems to be correlated with poorer outcomes in COVID-19 patients. A direct causal relationship or pathophysiological mechanism and the clinical value of established and emerging biomarkers remain unknown. One study demonstrated that approximately 20% of patients diagnosed with COVID-19 showed signs of cardiac injury as measured by high-sensitivity troponin I (hs-TNI) levels. Additionally, these patients presented with higher levels of C-reactive protein (CRP), NT-proBNP, procalcitonin, leukocytes, and creatinine. Whether this occurs as a result of underlying CV disease that confers risk in the setting of hypoxia and viral sepsis vs. direct viral myocarditis is unknown. Furthermore, ethnic and racial variation will bring new challenges to the US in terms of diagnosis and management of patients with COVID-19. The Chinese experience with COVID-19 patients has shown high comorbidity with hypertension, diabetes, and cerebrovascular disease. Shi et al. noted common comorbidities associated with cardiac injury including hypertension, diabetes, CAD, heart failure, and cerebrovascular disease, all of which are more common in African American patients. Subclinical CV disease tends to occur earlier in African American patients, and they have higher circulating biomarkers of systemic inflammation (Hackler et al., 2019; Lefferts et al., 2017). As such, African Americans in the US may be a particularly vulnerable group to the effects of COVID-19. Pathophysiological mechanisms, differences in mild and severe variants of COVID-19, and clinical factors including ethnicity and underlying comorbidities can be studied extensively using ML techniques. Early and reliable personalized risk prediction represents a major unmet clinical need, as it could provide an evidence based clinical decision aid for effective resource allocation during the COVID-19 pandemic.
Currently, there is a prospective case-control study underway in Switzerland aiming to deliver a platform using ML techniques to develop personalized risk prediction models (Twerenbold and Pfister, 2020). The study will enroll 1500 patients and will be looking at patient clinical phenotyping (e.g. comorbidities, medications, symptoms, vitals, ECG, and imaging data) and extended laboratory analysis and blood sampling. The study will be complete in June 2021.

Conclusions
The applications of AI, including ML, have the capability to exploit data-rich platforms and transform approaches to diagnose, risk stratify, prevent, and treat CV disease. There have been advances in AI over the years within different areas of cardiology, and the extrapolation of this technology towards the evaluation of COVID-19 patients would also prove useful. Advancements with AI will require close collaboration amongst computer scientists, clinicians and investigators in order to identify the most appropriate and relevant problems to be solved and the best approaches to do so.