Curr Treat Options Cardio Med
`DOI 10.1007/s11936-020-00814-0
`
` (2020) 22:15
`
`State-of-the-Arts Informatics (C Stultz, Section Editor)
`
Deep Learning for Cardiovascular Risk Stratification
`Daphne E. Schlesinger1,2,3,4
`Collin M. Stultz, MD, PhD1,2,3,4,5,6,*
`
`Address
`1Harvard-MIT Division of Health Sciences and Technology, Cambridge, MA, 02139, USA
`2Institute for Medical Engineering and Science, MIT, Cambridge, MA, 02139, USA
`3Research Laboratory of Electronics, MIT, Cambridge, MA, 02139, USA
`4Computer Science & Artificial Intelligence Laboratory, MIT, Cambridge, MA,
`02139, USA
`5Department of Electrical Engineering and Computer Science, MIT, Cambridge,
`MA, 02139, USA
`*,6Division of Cardiology, Massachusetts General Hospital, Boston, MA, USA
`Email: cmstultz@mit.edu
`
© The Author(s) 2020
`
`This article is part of the Topical Collection on State-of-the-Arts Informatics
`
Keywords Risk stratification · Deep learning · Risk models
`
`Abstract
`
`Purpose of review Although deep learning represents an exciting platform for the devel-
`opment of risk stratification models, it is challenging to evaluate these models beyond
`simple statistical measures of success, which do not always provide insight into a model’s
`clinical utility. Here we propose a framework for evaluating deep learning models and
`discuss a number of interesting applications in light of these rubrics.
`Recent findings Data scientists and clinicians alike have applied a variety of deep learning
`techniques to both medical images and structured electronic medical record data. In many
`cases, these methods have resulted in risk stratification models that have improved
`discriminatory ability relative to more straightforward methods. Nevertheless, in many
`instances, it remains unclear how useful the resulting models are to practicing clinicians.
`Summary To be useful, deep learning models for cardiovascular risk stratification must not
`only be accurate but they must also provide insight into when they are likely to yield
`inaccurate results and be explainable in the sense that health care providers can under-
`stand why the model arrives at a particular result. These additional criteria help to ensure
`that the model can be faithfully applied to the demographic for which it is most accurate.
`
`APPLE 1075
`Apple v. AliveCor
`IPR2021-00970
`
`Introduction
`
`Accurate risk stratification remains a central theme in all
`stages of the care of patients with cardiovascular disease.
`Indeed, the likelihood that any patient will benefit from
`a given therapeutic intervention is a function, in part, of
`the risk associated with the intervention itself versus the
`risk that the patient will have an adverse event if no
`intervention is performed. Informed clinical decision
`making necessitates gauging patient risk using available
`clinical information.
`A number of societal guidelines recommend the use
`of validated risk scores in the initial evaluation of pa-
`tients with suspected coronary disease [1–3]. The use of
`accurate risk scores helps to ensure that patients who are
`at high risk of adverse outcomes are quickly identified
`and assigned a therapy that is appropriate for their level
`of risk. Nevertheless, risk stratification is far from a
`perfect science, and risk scores often fail to identify
`patients at high risk of inimical outcomes. This problem
`is made more apparent in light of the fact that a relative
`minority of patients with cardiovascular disease experi-
`ence the gravest adverse outcomes. Moreover, while the
`prevalence of adverse events in high-risk populations is,
`by definition, large, the absolute number of events is
`also large in patients who are predicted to be low risk
using traditional risk prediction metrics. This low risk-high number
dilemma is frequently encountered in many
`areas of cardiovascular clinical research [4]. As such,
`adequately identifying patient subgroups who are truly
`at high risk of adverse events remains a clear unmet
`clinical need. Novel methods are therefore needed to
`realize the full potential of clinical risk stratification
from existing clinical observations. Machine learning,
and deep learning in particular, holds the potential to
robustly identify high-risk patient subgroups, suggest
`personalized interventions that can reduce a given pa-
`tient’s risk, and help ensure that appropriate resources
`are allocated to those patients who are in the most need.
`In this review, we do not strive to review all of the
`relevant literature in the area of deep learning in cardio-
`vascular medicine. Indeed, this review is written for the
`practicing clinician and strives to provide intuitive ex-
`planations for how deep learning models actually work
`and where they are most applicable. As the use of these
`models becomes ubiquitous in the clinical arena, it will
`be important for health care providers to critically eval-
`uate them in order to determine the clinical usefulness
`of any given machine learning approach. Our goal is to
`provide a general framework for understanding what
`advantages these models hold and what considerations
`limit their broad applicability.
`
`Conventional approaches to risk stratification
`
The term machine learning is believed to have been coined in 1959 by Arthur Samuel,
an engineer and scientist who pioneered artificial intelligence [5]. He
described it as “programing computers to learn from experience.” There are diverse
`examples of machine learning in the clinical literature, including straightforward
`approaches like logistic regression and Cox proportional hazards modeling and
`more esoteric techniques like deep learning, which is described in the next section.
`Indeed, the former methods have actually been a part of the clinical literature for
`some time [6–8]. Therefore, while the term machine learning has only recently
entered the medical lexicon, a number of existing clinical risk scores were developed
and refined using approaches that fall under this umbrella term. The
list of such models is too long to review exhaustively here. Instead,
`we focus on some approaches that are commonly used to assess patient risk.
`One of the earliest models for quantifying the risk of adverse cardiovascular
`outcomes was developed by Killip et al. in 1967, where 250 patients were
`divided into four simple classes of increasing severity of illness, ranging from
`no clinical signs of heart failure to cardiogenic shock [9]. The primary goal of
`this study was to trial an improved workflow for cardiac intensive care, but the
`data collected over the course of study revealed patterns in patient survival
based on their class (now called the Killip class). The utility of these classes for
identifying high-risk patients has been borne out in a number of studies, and
`these classes remain a part of the clinical assessment of patients who present
`with an acute myocardial infarction.
`Over time, more sophisticated statistical techniques have been used to
`develop more sophisticated risk stratification models. Both the Framingham
`risk score—which quantifies the risk of adverse events (death from coronary
heart disease, nonfatal MI, angina, stroke, transient ischemic attack, intermittent
`claudication, and heart failure) in patients who had no prior history of cardiac
`disease—and the Global Registry of Acute Coronary Events (GRACE)
`score—which quantifies all-cause mortality in patients who present with an
`ACS—were developed using Cox proportional hazards regression [10, 11].
`Another class of risk scores, developed from and named for the Thrombolysis
`in Myocardial Infarction (TIMI) study groups, was developed specifically for
`patients who present with symptoms consistent with an acute coronary syn-
`drome. Here, features that were discriminatory with respect to the combined
`outcome of all-cause mortality, new or recurrent MI, or severe recurrent ischemia
`in their cohort were selected using logistic regression. Seven features were selected
`in the final model. To use the risk score itself, the physician simply counts the
`number of features that are present to estimate the short-term risk of either
`mortality after a myocardial infarction post ST segment elevation MI or a com-
`bined outcome of all-cause mortality, new or recurrent MI, or severe recurrent
`ischemia requiring revascularization post non-ST segment elevation ACS [12, 13].
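The counting rule described above can be sketched in a few lines. This is an illustrative sketch only: the item names are hypothetical identifiers that paraphrase the seven items commonly listed for the TIMI score for unstable angina/NSTEMI, they are not taken from this text, and the mapping from a count to an event rate is deliberately left out.

```python
# Hypothetical item names paraphrasing the seven commonly cited items of the
# TIMI risk score for unstable angina/NSTEMI; they are assumptions, not part
# of the original article.
TIMI_UA_NSTEMI_ITEMS = (
    "age_65_or_older",
    "three_or_more_cad_risk_factors",
    "known_coronary_stenosis_50_percent_or_more",
    "aspirin_use_in_prior_7_days",
    "two_or_more_anginal_events_in_prior_24_hours",
    "st_segment_deviation",
    "elevated_cardiac_markers",
)


def count_score(patient: dict) -> int:
    """Return the number of risk items that are present (0 to 7)."""
    # Items missing from the dictionary are treated as absent.
    return sum(1 for item in TIMI_UA_NSTEMI_ITEMS if patient.get(item, False))


# A made-up patient with three of the seven items present.
example_patient = {
    "age_65_or_older": True,
    "st_segment_deviation": True,
    "elevated_cardiac_markers": True,
}
print(count_score(example_patient))
```

A higher count corresponds to a higher short-term risk; the appeal of such a score is that it can be computed at the bedside without any software.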
`Regression modeling has found a role for quantifying patient risk in other
`disorders apart from ischemic heart disease. Pocock et al., for example, performed
`a meta-analysis of heart failure patients from 30 different studies, amounting to
`39,372 patients. They used multivariable piecewise Poisson regression methods
`to identify features that are predictive of mortality at 3 years. These features were
`then converted into an integer risk calculator, called the Meta-analysis Global
`Group in Chronic Heart Failure (MAGGIC) score, with higher values correspond-
`ing to greater risk [14]. Similarly, the Seattle Heart Failure Model was developed
`on a cohort of 1125 patients, using a multivariate Cox proportional hazards
`model. This model provides estimates for 1-, 2-, and 3-year mortalities [15, 16].
`Logistic regression and proportional hazard models are advantageous because
`they are easy to interpret: each clinical feature in the model has an associated weight
`that corresponds to how important that feature is for the model arriving at a
`particular result. However, such models are relatively simple and cannot necessarily
`capture complex mechanisms relating observations and outcomes of interest.
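To make the interpretability argument concrete, the sketch below shows how the weights of a logistic regression model translate directly into odds ratios. The intercept, the feature names, and the coefficients are hypothetical values chosen for illustration; they do not come from any published risk score.

```python
import math

# Hypothetical coefficients, for illustration only.
INTERCEPT = -4.0
WEIGHTS = {
    "age_decades": 0.5,
    "diabetes": 0.7,
    "systolic_bp_per_10mmHg": -0.1,
}


def predicted_risk(features: dict) -> float:
    """Logistic regression: the log-odds are a weighted sum of the features."""
    log_odds = INTERCEPT + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-log_odds))


def odds_ratios() -> dict:
    """exp(weight) is the odds ratio for a one-unit increase in that feature."""
    return {name: math.exp(weight) for name, weight in WEIGHTS.items()}


patient = {"age_decades": 6.0, "diabetes": 1.0, "systolic_bp_per_10mmHg": 13.0}
print(round(predicted_risk(patient), 3))
print({name: round(value, 2) for name, value in odds_ratios().items()})
```

Because each weight has a fixed meaning regardless of the other inputs, a clinician can read the model feature by feature, which is exactly what becomes difficult with the deep models discussed next.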
`
`What is deep learning?
`
`The diverse, nonuniform terminology in the medical literature unfortunately
`tends to obfuscate the meaning of the term “deep learning.” Deep learning is a
`subfield of machine learning that strives to find powerful abstract representa-
`tions of data using complex artificial neural networks (ANNs) that are then used
`to accomplish some prespecified task. While these abstract data representations
`are powerful ways to describe clinical data, they are difficult to comprehend and
`explain; that is why they are, indeed, “abstract.”
`ANNs correspond to a class of machine learning algorithms whose algorith-
`mic structure is inspired by structure of the human brain and how it is believed
`that humans compute [17, 18]. A neural network consists of interconnected
`artificial neurons that pass information between one another. A typical ANN
`contains an input layer, which contains several artificial neurons that take
`clinically meaningful data as input. The input layer then passes the clinical data
`to other inner, or “hidden,” layers, each of which performs a series of relatively
`simple computations. At each layer, more abstract representations of the input
`data are obtained. Eventually, the information is passed to an output layer that
`yields a clinically meaningful quantity (Fig. 1).
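The layered computation described above can be sketched as a forward pass through a small fully connected network. This is a minimal sketch under stated assumptions: the layer sizes, the sigmoid activation, and the random (untrained) weights are illustrative choices, and the three input values stand in for arbitrary standardized clinical measurements.

```python
import math
import random


def sigmoid(z):
    """Strictly increasing function that maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases):
    """Each node takes all outputs of the previous layer (fully connected)."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the data through each layer in turn and return the final output."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations


def random_layer(n_in, n_out, rng):
    """Untrained layer with random parameters (training would tune these)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases


rng = random.Random(0)
# Input layer of 3 features, two hidden layers of 4 nodes, one output node.
layers = [random_layer(3, 4, rng), random_layer(4, 4, rng), random_layer(4, 1, rng)]
risk = forward([0.2, -1.3, 0.7], layers)[0]
print(risk)
```

With untrained weights the output is meaningless; the point is only that every layer applies the same simple operation, and the apparent complexity comes from stacking many of them.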
Deep learning models, in practice, correspond to neural networks that
contain several hidden layers. These models, originally referred to as multilayer
perceptrons, were popularized in the early 1980s for applications such as image
and speech recognition, then receded in popularity in favor of simpler, easier to
train, and perhaps more explainable models [19, 20]. In recent years, however,
deep neural network (DNN) learning has resurged dramatically both because of
the availability of so-called “big data” and the development of computational
methods that facilitate the training of large neural networks. In many of today’s
applications, these networks can be quite large, having on the order of $10^5$–$10^6$
artificial neurons and millions of modifiable parameters. Parenthetically, as the
size of clinical datasets is typically much smaller, care must be taken when
implementing these models to ensure that they are not overtrained.

Fig. 1. In our applications, a neural network acts as a function that takes some observations as input and produces some prediction
of outcomes as the output (a). This function is generated by adding many simple functions (represented by circular nodes that
process information), each of which takes all the outputs of the previous layer as its input, which renders a network “fully
connected” (b). These simple functions are strictly increasing and include parameters ($\vec{w}^{(i)}$, $b^{(i)}$ for each node), which are
chosen by training the network (c). Each layer can be thought of as an abstraction of the data, which is eventually
separable in the last layer if the model works well. The output of the last layer is the probability of an adverse
event, which a clinician may use to inform her clinical decisions (d).

While the structure of ANNs, and DNNs in particular, is inspired by the
`structure of neurons in the human brain, these models are best thought of as
`universal function approximators. Indeed, it has been mathematically proven
`that any continuous function on compact spaces can be represented by a neural
`network, under certain constraints [21, 22]. These models therefore form an
`efficient platform for generating functions that model complex relationships
`between patient characteristics/features and outcomes. This highlights an im-
`portant difference between DNNs and simpler methods like logistic regression,
which models the relationship between outcomes (i.e., the logarithm of the
odds) and patient features as a linear function. By contrast, a DNN
`corresponds to a complex, highly nonlinear function that takes patient infor-
`mation as input (including medical images) and outputs the corresponding
`outcome. An additional advantage of DNNs is that they can use input data in
`“raw” form, with little preprocessing.
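The difference between a linear log-odds model and a nonlinear network can be illustrated with a toy case in which risk is elevated when exactly one of two binary findings is present. The scenario and the hand-set weights below are hypothetical; they are chosen only so that the arithmetic is easy to follow.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def tiny_network(x1, x2):
    """Two hidden nodes and one output node with hand-set (not trained) weights."""
    h_or = sigmoid(20.0 * (x1 + x2) - 10.0)   # close to 1 if at least one finding
    h_and = sigmoid(20.0 * (x1 + x2) - 30.0)  # close to 1 only if both findings
    # High output when "at least one" holds but "both" does not.
    return sigmoid(20.0 * h_or - 20.0 * h_and - 10.0)


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(tiny_network(x1, x2), 3))
```

No choice of weights lets a model of the form w0 + w1*x1 + w2*x2 reproduce this pattern: it would need w0 < 0, w0 + w1 > 0, and w0 + w2 > 0, which together force w0 + w1 + w2 > 0, contradicting the requirement that risk be low when both findings are present. A single hidden layer removes the limitation.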
Deep learning models can, in principle, capture complex, nonlinear relationships
between patient features and outcomes and can therefore meet the first criterion
for a useful risk model, namely accuracy. However, because these models generate
abstract representations of the input data, it can be very difficult to understand what the model has
`learned and consequently why the model arrives at a particular result. More-
`over, understanding when the model will fail—i.e., which patients are most
`likely to be associated with an incorrect prediction—can be just as challenging.
`
`Evaluating deep learning risk models
`
`Standard performance metrics, such as the area under the receiver operating
`characteristic curve (AUC), accuracy, and the sensitivity/specificity, provide
`useful information for gauging how a risk model will perform, on average.
`Nevertheless, these metrics do not by themselves offer any interpretative in-
`sights, nor do they help the user understand how the model will perform on any
individual patient. The upshot is that conventional statistical metrics of
`success are not always sufficient to determine the clinical utility of a deep
`learning model.
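For readers who want to see what these summary metrics compute, the sketch below evaluates the AUC as the fraction of (event, non-event) patient pairs that the model ranks correctly, together with sensitivity and specificity at a chosen threshold. The four labels and scores are made-up values, not study data.

```python
def auc(labels, scores):
    """Probability that a randomly chosen event outranks a randomly chosen non-event."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one event and one non-event")
    # Ties count as half a correctly ranked pair.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(labels, scores, threshold):
    """Classify as high risk when score >= threshold, then tabulate the results."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))
print(sensitivity_specificity(labels, scores, threshold=0.5))
```

Both quantities are averages over the whole cohort, which is why they say nothing about whether the prediction for one particular patient can be trusted.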
`When evaluating applications of machine learning to medical problems,
`there are particular criteria that must be considered given our current under-
`standing of human physiology and the reality of medical practice (Fig. 2). In
`addition to having a level of performance that ensures that it will perform well,
`on average, on the population of interest, ideally a good algorithmic solution
`should also:
1. Provide information about potential failure modes; i.e., indicate when it is
likely to yield a false result;
2. Be explainable in the sense that clinicians can understand why the model
arrives at a particular result.

Fig. 2. Issues that hinder the clinical acceptance of deep learning models.
`
`Although determining when a model will fail is challenging, it is an essential
`task. Formally, this can be understood as finding, a priori, patient characteristics
or subgroups that are associated with incorrect predictions. The development of
methods that identify such “failure modes” is a nascent area of research
within the machine learning community, with most of the published research
appearing in specialized machine learning conferences or non-peer-reviewed
online preprint archives, and little associated work appearing in the clinical
literature. Nevertheless, insights into when a model will fail can often be
garnered if the model itself is explainable; i.e., understanding how/why a model
arrives at a particular result often provides clues as to how the model can yield an
incorrect result.
`Recently, a new method was described for identifying when a given clinical
risk score will yield unreliable results [23••]. The approach identifies, a priori,
patient cohorts associated with reduced model accuracy, reduced discriminatory ability,
and poor calibration. Application to the GRACE risk model correctly identifies
`patient cohorts where the GRACE score has reduced performance. Advantages
`of the method are that it is straightforward to implement and that it can be
`applied to any risk model, regardless of how the risk model was
`developed—thereby making the approach appropriate for deep learning
`models. General methods along these lines will likely play an increasingly
`important role in determining when complex risk models are expected to yield
`useful predictions.
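The general idea of looking for subgroups with degraded performance can be illustrated with a simple descriptive check. To be clear, this is not the method of reference 23, which is not described in enough detail here to reproduce; it is a much simpler retrospective tabulation, and the subgroup names, risks, outcomes, and tolerance are all made-up assumptions.

```python
from collections import defaultdict


def subgroup_report(records, threshold=0.5):
    """Summarize calibration and accuracy separately for each subgroup.

    Each record is (subgroup_name, predicted_risk, outcome) with outcome 0 or 1.
    """
    groups = defaultdict(list)
    for subgroup, risk, outcome in records:
        groups[subgroup].append((risk, outcome))
    report = {}
    for subgroup, rows in groups.items():
        n = len(rows)
        report[subgroup] = {
            "n": n,
            "mean_predicted_risk": sum(r for r, _ in rows) / n,
            "observed_event_rate": sum(o for _, o in rows) / n,
            "accuracy": sum(1 for r, o in rows if (r >= threshold) == bool(o)) / n,
        }
    return report


def flag_unreliable(report, max_calibration_gap=0.1):
    """Return subgroups whose predicted and observed event rates disagree."""
    return sorted(
        name
        for name, stats in report.items()
        if abs(stats["mean_predicted_risk"] - stats["observed_event_rate"])
        > max_calibration_gap
    )


# Made-up records for two hypothetical subgroups.
records = [
    ("age<75", 0.1, 0), ("age<75", 0.2, 0), ("age<75", 0.9, 1), ("age<75", 0.1, 0),
    ("age>=75", 0.2, 1), ("age>=75", 0.3, 1), ("age>=75", 0.2, 0), ("age>=75", 0.7, 1),
]
report = subgroup_report(records)
print(flag_unreliable(report))
```

A check of this kind requires known outcomes and predefined subgroups, so it cannot by itself identify failure modes a priori; it only shows what "reduced performance in a cohort" means operationally.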
`In addition to deciphering when a given model is likely to fail, developing
`methods that “explain” what a model has learned is an important part of any
`comprehensive strategy that strives to maximize clinical acceptance. Neverthe-
`less, conceptions of explainability or interpretability of machine learning
`models are diverse, and it is difficult to determine exactly what this term means
`in the context of machine learning models. In his article, “The Mythos of Model
`Interpretability,” Zachary Lipton identifies five types of interpretability for
`machine learning models: trust, causality, transferability, informativeness, and
`fairness [24]. Of particular interest for medical algorithms are causality and
informativeness. Causality describes whether the relationships discovered by the
model are truly causal or merely correlative. While causality in machine learning
is an active area of research, it is always very difficult to tease out causal
relationships from a retrospective analysis of any dataset [25]. An informative
`deep learning model provides some intuition to support how it arrives at a
`given result. In order to impart useful intuitions, however, one needs to trans-
`late the abstract representations learned by a deep learning model into language
`that is easily understood by the health care practitioner. In short, in the medical
`context, we ideally need models that yield insights that are translatable into the
`language of physiology (Fig. 2).
`There are a limited number of tools that have been used to provide
interpretations/explanations of what a deep neural network has learned. Shapley
values, Gradient-weighted Class Activation Mapping (Grad-CAM) methods,
`and saliency maps represent a class of methods that can provide insight into
`what input features are most responsible for the risk model making a prediction
`[26–28]. Grad-CAM and saliency maps, in particular, are typically used with
`convolutional neural networks (described below) and provide insight into the
`relative importance of different parts of an image for a specific prediction [29].
`For example, consider a model trained to distinguish between different objects,
`such as dogs and humans. A saliency map may reveal that pixels corresponding
`to the legs (four for a dog and two for a human) are most dispositive. Hence, for
`such a simple task as differentiating humans from dogs, saliency maps provide
`easily understood “explanations.” However, for more complex classification
`tasks, saliency maps may not yield such readily interpretable insights. Indeed,
`these methods generally do not provide information about how the data in
`these regions were used to arrive at a particular decision, nor do they necessarily
`provide any causal insights. More generally, it has been argued that the attempts
`to explain deep models are inherently flawed because such post hoc explana-
`tions can never have true fidelity with respect to the original complex model
[30•]. In this vein, interpretable models have an advantage in that
they are designed to yield explanations that can be understood by domain
experts. Nevertheless, it is not clear that commonly used interpretable models
can capture the complex nonlinear relationships described above in a manner
`that yields clear explanations. A compromise may be to build models that
`combine both mechanistic/physiologic models and deep learning models to
`enhance both model explainability and predictive performance. This is an active
`area of research.
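A saliency map, in its simplest form, is the magnitude of the change in the model output per unit change in each input pixel. The sketch below approximates that sensitivity with finite differences on a four-pixel "image". The toy model and its weights are hypothetical stand-ins for a trained network, and practical implementations typically obtain the gradients by backpropagation rather than by perturbing pixels one at a time.

```python
import math


def toy_model(pixels):
    """Hypothetical stand-in for a trained network: only pixels 1 and 2 matter."""
    weights = [0.0, 2.0, -1.0, 0.0]
    z = sum(w * p for w, p in zip(weights, pixels))
    return 1.0 / (1.0 + math.exp(-z))


def saliency(model, pixels, eps=1e-4):
    """Absolute sensitivity of the output to each pixel (central differences)."""
    result = []
    for i in range(len(pixels)):
        up = list(pixels)
        down = list(pixels)
        up[i] += eps
        down[i] -= eps
        result.append(abs(model(up) - model(down)) / (2.0 * eps))
    return result


image = [0.5, 0.5, 0.5, 0.5]
sal = saliency(toy_model, image)
print([round(value, 3) for value in sal])
```

The map identifies which pixels the output is sensitive to, but it discards the sign of the effect and says nothing about how those pixels were combined, which mirrors the limitation noted above.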
`It has been argued that clinicians should embrace black box models rather
`than strive to develop explanations that provide insight into how the model
`arrives at a particular result [31]. Proponents of this thesis argue that clinical
`decision-making is frequently rooted in an incomplete understanding of the
`disease process in question and how the potential intervention actually works.
Hence, requiring deep learning models to be explainable holds them to a higher
standard than other methods used to inform clinical decision making and
may further stymie innovation in this space.
`While there is merit to this argument, there is little doubt that clinical
`decisions are grounded in some understanding of the disease process. Indeed,
`it is precisely this, albeit imperfect, understanding that guides our therapeutic
choices. By contrast, deep learning models represent an unprecedented level of
opaqueness with respect to clinical understanding. In the setting of black box
models and only statistical measures of the model’s overall performance,
additional information is needed to determine when a model prediction is
appropriate for a specific patient. While the identification of model failure
`modes and explainability are distinct concepts, they are related. Failure mode
`analyses strive to identify patient subgroups where the model has reduced
`performance, and a comprehensive understanding of how a complex model
`arrives at a particular result provides further assurance that the model is appro-
`priate for a given patient, who has a given set of clinical characteristics. Expla-
`nations that are inconsistent, for example, with our understanding of the
`underlying pathophysiology should not be trusted.
`In sum, it is our view that deep learning models for any clinical application
`should be evaluated using these metrics, in addition to standard statistical
`measures of performance. In what follows we discuss several recent applications
`of deep learning methods for cardiovascular risk stratification and evaluate
`them relative to the metrics discussed above.
`
`Deep learning for risk prediction
`
`Deep learning for image classification has a relatively extensive literature.
Indeed, the ImageNet challenge, a worldwide competition for classifying millions
of curated images, has led to the development of many sophisticated
`algorithms for image classification [32]. In a number of applications, these
`image classification algorithms have been modified and fruitfully applied to
`clinical images to quantify patient risk. However, these methods have mainly
`been used for automatic disease diagnosis from pathology slides and radiolog-
`ical scans [33–38]. These algorithms are usually implemented using a class of
`DNNs called convolutional neural networks (CNNs). CNNs are inspired by the
`structure of the mammalian visual cortex, where each neuron “sees” a small
`region of the visual field, called the receptive field of that neuron [39]. In a
`CNN, the information contained in adjacent groups of pixels of an image,
`analogous to the receptive field, is summarized, using a mathematical opera-
`tion called a convolution to create an abstraction of the information in the
`image [40].
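The convolution operation described above can be sketched as a small window sliding across an image, with each output value summarizing one neighborhood of pixels. The image and kernel below are illustrative assumptions. As is conventional in CNN layers, the kernel is applied without flipping (strictly speaking a cross-correlation), and no padding is used.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the elementwise products."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0.0
            # The pixels covered here play the role of the receptive field.
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Toy 3x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-set kernel that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, vertical_edge)
print(feature_map)
```

In a trained CNN the kernel values are learned rather than hand-set, and many kernels are applied in parallel and stacked in layers to build progressively more abstract summaries of the image.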
`In cardiology, deep learning work has been focused on the automatic
`interpretation of cardiac images, with few applications to the development of
`models that directly quantify patient risk [41]. Recent studies have highlighted
`the ability of CNNs to identify echocardiographic windows using the images
`alone [42, 43], correctly segment the left ventricle in both cardiac CT images and
`cardiac MRIs [44, 45], and accurately detect cardiac MR motion artifacts [46].
`The use of CNNs to garner insights into the risk of future adverse outcomes,
`however, is still a nascent area of investigation.
A recent study that purports to use medical image data for assessing cardiovascular
risk was published by Poplin et al. [47••]. In that work, the authors
`used a CNN to predict age, gender, smoking status, systolic blood pressure,
`
`diastolic blood pressure, and, most importantly, major adverse cardiovascular
`events (MACE) within 5 years from the time that retinal fundus images were
`acquired. The dataset used to develop and validate the model was obtained
`from the UK Biobank and EyePACS (a retinal image database consisting of
`images obtained during routine diabetic screening in clinics in the USA). They
report an AUC of 0.70 for predicting MACE within 5 years using their deep learning
algorithm. This performance exceeds that of predictions made based on single
risk factors such as age and systolic blood pressure. However, it does not
outperform an existing, simpler proportional hazards model, SCORE (Systematic
COronary Risk Evaluation), proposed by Conroy et al. in 2003 [48]. In
`addition to predicting risk, they utilized saliency maps, described above, to
`attempt to explain their algorithm. Saliency maps highlight portions of the
`retinal images that contributed significantly to the predictions their models
`produced. However, the usefulness of these saliency maps is limited because
`they give us no information about the mechanism by which certain features of
`the retina relate to cardiovascular risk and if the deep learning model has
`recapitulated that mechanism.
`Recently, there have been attempts to extend classification algorithms,
`which were originally designed to analyze medical images, to different
`types of data in the Electronic Medical Record (EMR). The EMR can be
`divided into two types of data: structured data and unstructured data.
Structured medical data refers to what can be found in the pre-existing
fields within the electronic medical record; e.g., lab results, vital signs, and
`demographic information. Unstructured data refers to what appears in
`medical notes written by health care practitioners. In a recent study,
`Mayampurath et al. assembled structured data from the electronic health
`record into a visual format that could then be used to train a CNN to
predict in-hospital outcomes [49]. Essentially, the EMR is converted to a
two-dimensional medical image, which enables the use of standard
machine learning techniques appropriate for medical image processing.
The image itself maps time on one axis and 156 clinical variables
(including vital signs, laboratory results, medications, diagnostic tests,
and nurse examinations), recorded over the first 48 h of admission, on
the other axis. Overall, the discriminatory ability of the best performing
CNN (the authors considered more than one) was 0.91, suggesting that
the method holds considerable promise.
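One way to picture this conversion is as filling a grid with one row per clinical variable and one column per time step. The sketch below is not the authors' pipeline: the hourly time bins, the three variable names, the example values, and the choice to fill missing cells with zero are all assumptions made for illustration.

```python
def build_grid(observations, variables, n_hours, missing=0.0):
    """Arrange (hour, variable, value) observations into a variables x hours grid."""
    row_of = {name: row for row, name in enumerate(variables)}
    grid = [[missing] * n_hours for _ in variables]
    for hour, name, value in observations:
        # Ignore variables that are not tracked and times outside the window.
        if name in row_of and 0 <= hour < n_hours:
            grid[row_of[name]][hour] = value
    return grid


# Hypothetical variables and made-up measurements.
variables = ["heart_rate", "systolic_bp", "lactate"]
observations = [
    (0, "heart_rate", 88.0),
    (0, "systolic_bp", 121.0),
    (5, "lactate", 2.4),
    (47, "heart_rate", 104.0),
    (50, "heart_rate", 99.0),  # outside the 48 h window, so it is dropped
]
grid = build_grid(observations, variables, n_hours=48)
print(len(grid), len(grid[0]))
```

The resulting grid can be passed to image-oriented tools such as a CNN. The ordering of rows is arbitrary, which is one reason different layouts of the same record may carry different prognostic information.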
`A significant advantage here is that they can leverage methods used
`to “interpret” what CNNs have learned about images to help explain
`why their deep learning model arrives at a particular result. In their
`work, the authors used a standard method—Gradient-weighted Class
`Activation Mapping or Grad-CAM—to understand what clinical features
`are most important for discriminating between patients who die in-
`hospital and those who do not [28]. Not surprisingly, the method finds
that vital signs, interventions (e.g., mechanical ventilation), and administered
medications were important for distinguishing between those
who would have an in-hospital event and those who would not. Of
interest, the model does suggest that simple nursing examinations, represented
by Morse and Braden scores, may be important for predicting
in-hospital mortality. Moreover, it is noteworthy that there are many
`different ways to organize data arising from the EMR into two-
`dimensional representations and not all visual representations will have
`the same prognostic information. The authors of this study only exper-
`iment with three different ways to organize the data.
`While these results are encouraging, the problem of predicting in-
`hospital mortality using 48 h of admission data may be, relatively
`speaking, not that difficult. For example, one would likely do fairly well
`predicting in-hospital mortality using a simplified set of input features
`that includes where the patient is admitted (ICU vs. hospital floor), vital
`signs trajectories during the first 48 h (higher death rates are expected in
`patients who become hypotensive soon after admission), and whether
`the patient requires mechanical ventilation or inotropic support soon
after admission. As the authors do not compare their method to what
would be obtained using a simple method, such as a logistic regression
model with a rich set of clinical features, it is not clear whether a CNN
is truly necessary for this task.
One very popular data source for machine learning is the electrocardiogram
because it is routinely measured, cheap to administer, and
apparently rich in information, some of which may not be easily discernible
by humans. In addition, a variety of deep learning methods
exist that can effectively deal with time series data, much like that
arising from single-lead and multiple-lead ECGs. Many of these
approaches have already been applied to the interpretation and classification
of electroencephalographic signals [50].
`Attia et al. also mined the ECG for new information by attempting to
`predict left ventricular systolic dysfunction from the 12-lead ECG and
`transthoracic echocardiogram (TTE) using a convolutional neural net-
`work [51]. As LV dysfunction itself is a powerful predictor of subsequent
`heart failure, the resulting network indirectly identifies patients at ele-
`vated risk of adverse events [52]. By traditional statistical metrics (e.g.
`AUC) their classifier performed extremely well, with some exceptions
`(positive predictive value). The low positive predictive value (PPV) tells
`us that the model has many false positives, but, crucially, this does not
`help us predict when the model will fail; i.e., for which type of patients.
The work also does not provide insights into the details of the relationship
between the ECG and asymptomatic left ventricular dysfunction (ALVD). For example, some determination
`about what segments of the ECG contribute to the prediction would
`be highly informative and of scientific interest.
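The reason a classifier with good sensitivity and specificity can still have a low PPV follows from simple arithmetic on prevalence. The numbers in the sketch below are hypothetical and are not the figures reported in the study discussed above.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of positive test results that are true positives."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical operating point: 85% sensitivity and 85% specificity.
for prevalence in (0.05, 0.20, 0.50):
    ppv = positive_predictive_value(0.85, 0.85, prevalence)
    print(prevalence, round(ppv, 3))
```

When the condition is uncommon, false positives from the large unaffected group outnumber true positives even for a well-performing classifier, so the PPV depends as much on the screened population as on the model itself.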
`Myers et al. applied a recurrent neural network (RNN)—a structure
`used to analyze time-series data—to continuous ECG data, along with a
`set of patient features, to predict the risk of death 1 year after non-ST
`segment elevation myocardial infarction (NSTEMI) [53]. For these stud-
`ies, samples from the ST segments of each beat were identified and
extracted in an automated fashion and then used as input to the
RNN. The resulting neural network, which incorporates information
`from approximately 1 min of continuous ECG data, had improved
`predictive and discriminatory ability relative to a logistic regression
`model that used the same patient features and summary information
`from the admission 12-lead ECG. Nevertheless, the complexity of the
model makes it difficult to understand precisely how and why the
`model arrives at a particular result. Consequently, while the model itself
`
`has improved performance relative to existing methods, the ultimate
`clinical utility of the method remains to be determined.
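The defining feature of a recurrent network is a hidden state that is updated at every time step, so that the final prediction depends on the order of the samples as well as their values. The sketch below is a generic, minimal recurrent cell and not the architecture used in the study above; the hidden size, the weights, and the input sequence are hypothetical.

```python
import math


def rnn_risk(sequence, w_in, w_rec, b_hidden, w_out, b_out):
    """Run a simple recurrent cell over a sequence of scalar samples."""
    hidden = [0.0] * len(b_hidden)
    for x in sequence:
        # The new state mixes the current sample with the previous state.
        hidden = [
            math.tanh(
                w_in[k] * x
                + sum(w_rec[k][j] * hidden[j] for j in range(len(hidden)))
                + b_hidden[k]
            )
            for k in range(len(hidden))
        ]
    z = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-z))  # squash the final state into (0, 1)


# Hypothetical, untrained parameters for a cell with two hidden units.
params = dict(
    w_in=[0.8, -0.5],
    w_rec=[[0.3, 0.1], [-0.2, 0.4]],
    b_hidden=[0.0, 0.1],
    w_out=[1.2, -0.7],
    b_out=-0.5,
)
samples = [0.02, 0.05, -0.01, 0.10, 0.07]  # made-up signal values
print(rnn_risk(samples, **params))
```

Because the same weights are reused at every step, the model can process sequences of any length, but the repeated nonlinear updates are also what make it hard to trace a prediction back to specific parts of the signal.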
`
`“All models are wrong, but some are useful”
`
The recent, notable successes of deep learning approaches argue that they will
have a place in the pantheon of methods used to build risk stratification models.
`However, it is not always clear when these approaches should be chosen
