https://doi.org/10.1038/s41591-018-0300-7
High-performance medicine: the convergence of human and artificial intelligence

Eric J. Topol

Department of Molecular Medicine, Scripps Research, La Jolla, CA, USA. e-mail: etopol@scripps.edu
The use of artificial intelligence, and the deep-learning subtype in particular, has been enabled by the use of labeled big data, along with markedly enhanced computing power and cloud storage, across all sectors. In medicine, this is beginning to have an impact at three levels: for clinicians, predominantly via rapid, accurate image interpretation; for health systems, by improving workflow and the potential for reducing medical errors; and for patients, by enabling them to process their own data to promote health. The current limitations, including bias, privacy and security, and lack of transparency, along with the future directions of these applications, will be discussed in this article. Over time, marked improvements in accuracy, productivity, and workflow will likely be actualized, but whether that will be used to improve the patient–doctor relationship or facilitate its erosion remains to be seen.
Medicine is at the crossroad of two major trends. The first is a failed business model, with increasing expenditures and jobs allocated to healthcare, but with deteriorating key outcomes, including reduced life expectancy and high infant, childhood, and maternal mortality in the United States1,2. This exemplifies a paradox that is not at all confined to American medicine: investment of more human capital with worse human health outcomes. The second is the generation of data in massive quantities, from sources such as high-resolution medical imaging, biosensors with continuous output of physiologic metrics, genome sequencing, and electronic medical records. The limits on analysis of such data by humans alone have clearly been exceeded, necessitating an increased reliance on machines. Accordingly, at the same time that there is more dependence than ever on humans to provide healthcare, algorithms are desperately needed to help. Yet the integration of human and artificial intelligence (AI) for medicine has barely begun.

Looking deeper, there are notable, longstanding deficiencies in healthcare that are responsible for its path of diminishing returns. These include a large number of serious diagnostic errors, mistakes in treatment, an enormous waste of resources, inefficiencies in workflow, inequities, and inadequate time between patients and clinicians3,4. Eager for improvement, leaders in healthcare and computer scientists have asserted that AI might have a role in addressing all of these problems. That might eventually be the case, but researchers are at the starting gate in the use of neural networks to ameliorate the ills of the practice of medicine. In this Review, I have gathered much of the existing base of evidence for the use of AI in medicine, laying out the opportunities and pitfalls.
Artificial intelligence for clinicians
Almost every type of clinician, ranging from specialty doctor to paramedic, will be using AI technology, and in particular deep learning, in the future. This largely involves pattern recognition using deep neural networks (DNNs) (Box 1) that can help interpret medical scans, pathology slides, skin lesions, retinal images, electrocardiograms, endoscopy, faces, and vital signs. The neural net interpretation is typically compared with physicians' assessments using a plot of true-positive versus false-positive rates, known as a receiver operating characteristic (ROC) curve, for which the area under the curve (AUC) is used to express the level of accuracy (Box 1).
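To make the ROC/AUC comparison concrete, the sketch below computes an AUC from a set of model scores and ground-truth labels. It is an illustrative example only, not code from any of the studies discussed; the labels and scores are invented, and scikit-learn is assumed to be available.

```python
# Illustrative only: computing an ROC AUC for a binary classifier,
# as used to compare algorithms with clinicians in the studies below.
# The labels and scores here are invented for demonstration.
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# Ground-truth labels (1 = disease present) and the model's predicted probabilities.
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.3, 0.65, 0.7, 0.05])

# AUC summarizes the ROC curve (true-positive rate vs. false-positive rate)
# across all decision thresholds; 0.5 is chance, 1.0 is perfect discrimination.
auc = roc_auc_score(y_true, y_score)
fpr, tpr, thresholds = roc_curve(y_true, y_score)

print(f"AUC = {auc:.2f}")
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold {th:.2f}: FPR {f:.2f}, TPR {t:.2f}")
```

A single AUC, however, says nothing about the threshold a clinician would actually act on, which is part of why the discussion below cautions that AUC is not a direct measure of clinical utility.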
Radiology. One field that has attracted particular attention for application of AI is radiology5. Chest X-rays are the most common type of medical scan, with more than 2 billion performed worldwide per year. In one study, the accuracy of one algorithm, based on a 121-layer convolutional neural network, in detecting pneumonia in over 112,000 labeled frontal chest X-ray images was compared with that of four radiologists, and the conclusion was that the algorithm outperformed the radiologists. However, the algorithm's AUC of 0.76, although somewhat better than that for two previously tested DNN algorithms for chest X-ray interpretation5, is far from optimal. In addition, the test used in this study is not necessarily comparable with the daily tasks of a radiologist, who will diagnose much more than pneumonia in any given scan. To further validate the conclusions of this study, a comparison with results from more than four radiologists should be made. A team at Google used an algorithm that analyzed the same image set as in the previously discussed study to make 14 different diagnoses, resulting in AUC scores that ranged from 0.63 for pneumonia to 0.87 for heart enlargement or a collapsed lung6. More recently, in another related study, it was shown that a DNN that is currently in use in hospitals in India for interpretation of four different key chest X-ray findings was at least as accurate as four radiologists7. For the narrower task of detecting cancerous pulmonary nodules on a chest X-ray, a DNN that retrospectively assessed scans from over 34,000 patients achieved a level of accuracy exceeding 17 of 18 radiologists8. It can be difficult for emergency room doctors to accurately diagnose wrist fractures, but a DNN led to marked improvement, increasing sensitivity from 81% to 92% and reducing misinterpretation by 47% (ref. 9).

Similarly, DNNs have been applied across a wide variety of medical scans, including bone films for fractures and estimation of aging10–12, classification of tuberculosis13, and vertebral compression fractures14; computed tomography (CT) scans for lung nodules15, liver masses16, pancreatic cancer17, and coronary calcium score18; brain scans for evidence of hemorrhage19, head trauma20, and acute referrals21; magnetic resonance imaging22; echocardiograms23,24; and mammographies25,26. A unique imaging-recognition study focusing on the breadth of acute neurologic events, such as stroke or head trauma, was carried out on over 37,000 head CT 3-D scans, which the algorithm analyzed for 13 different anatomical findings versus gold-standard labels (annotated by expert radiologists), achieving an AUC of 0.73 (ref. 27). A simulated prospective, double-blind, randomized controlled trial was conducted with real cases from the dataset and showed that the deep-learning algorithm could interpret scans 150 times faster than radiologists (1.2 versus 177 seconds). But the conclusion that the algorithm's diagnostic accuracy in screening acute neurologic scans was poorer than human performance was sobering and indicates that there is much more work to do.
Box 1 | Deep learning

While the roots of AI date back over 80 years, to concepts laid out by Alan Turing204,205 and by Warren McCulloch and Walter Pitts206, it was not until 2012 that the subtype of deep learning was widely accepted as a viable form of AI207. A deep-learning neural network consists of digitized inputs, such as an image or speech, which proceed through multiple layers of connected 'neurons' that progressively detect features and ultimately provide an output. By analyzing 1.2 million carefully annotated images from over 15 million in the ImageNet database, a DNN achieved, for that point in time, an unprecedentedly low error rate for automated image classification. That report, along with Google Brain's use of 10 million images from YouTube videos to accurately detect cats, laid the groundwork for future progress. Within 5 years, in specific large labeled test datasets, deep-learning algorithms for image recognition surpassed the human accuracy rate208,209, and, in parallel, suprahuman performance was demonstrated for speech recognition.

The basic DNN architecture is like a club sandwich turned on its side, with an input layer, a number of hidden layers ranging from 5 to 1,000, each responding to different features of the image (like shape or edges), and an output layer. The layers are made up of 'neurons', comprising a neural network, even though there is little support for the notion that these artificial neurons function similarly to human neurons. A key differentiating feature of deep learning compared with other subtypes of AI is its autodidactic quality; the neural network is not designed by humans, but rather the number of layers (Fig. 1) is determined by the data itself. Image and speech recognition have primarily used supervised learning, with training from known patterns and labeled input data, commonly referred to as ground truths. Learning from unknown patterns without labeled input data (unsupervised learning) has very rarely been applied to date. There are many types of DNNs and learning, including convolutional, recurrent, generative adversarial, transfer, reinforcement, and representation (for review see refs. 210,211). Deep-learning algorithms have been the backbone of computer performance that exceeds human ability in multiple games, including the Atari video game Breakout, the classic game of Go, and Texas Hold'em poker. DNNs are largely responsible for the exceptional progress in autonomous cars, which is viewed by most as the pinnacle technological achievement of AI to date. Notably, except in the cases of games and self-driving cars, a major limitation to interpretation of claims reporting suprahuman performance of these algorithms is that analytics are performed on previously generated data in silico, not prospectively in real-world clinical conditions. Furthermore, the lack of large datasets of carefully annotated images has been limiting across various disciplines in medicine. Ironically, to compensate for this deficiency, generative adversarial networks have been used to synthetically produce large image datasets at high resolution, including mammograms, skin lesions, echocardiograms, and brain and retina scans, that could be used to help train DNNs212–216.
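As a concrete illustration of the layered architecture described in Box 1 (digitized inputs, stacked hidden layers of 'neurons', an output layer, trained by supervised learning on labeled data), here is a minimal sketch in Python using PyTorch. It is a toy example with invented dimensions and random data, not the architecture of any study cited in this Review.

```python
# A minimal supervised deep neural network, illustrating the Box 1 description:
# digitized inputs pass through stacked hidden layers of 'neurons' to an output.
# Toy dimensions and random data; not a model from any cited study.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 128),  # input layer -> first hidden layer (e.g., a 28x28 image flattened)
    nn.ReLU(),
    nn.Linear(128, 64),   # second hidden layer, detecting higher-level features
    nn.ReLU(),
    nn.Linear(64, 2),     # output layer: two classes (e.g., finding present vs. absent)
)

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Supervised learning: inputs paired with ground-truth labels.
x = torch.randn(32, 784)          # a batch of 32 'images'
y = torch.randint(0, 2, (32,))    # their labels

for step in range(100):           # training loop: adjust weights to reduce the loss
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```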
For each of these studies, a relatively large number of labeled scans were used for training and subsequent evaluation, with AUCs ranging from 0.99 for hip fracture, to 0.84 for intracranial bleeding and liver masses, to 0.56 for acute neurologic case screening. It is not possible to compare DNN accuracy from one study to the next because of marked differences in methodology. Furthermore, ROC and AUC metrics are not necessarily indicative of clinical utility, or even the best way to express the accuracy of a model's performance28,29. In addition, many of these reports still only exist in preprint form and have not appeared in peer-reviewed publications. Validation of the performance of an algorithm in terms of its accuracy is not equivalent to demonstrating clinical efficacy. This is what Pearse Keane and I have referred to as the 'AI chasm': an algorithm with an AUC of 0.99 is not worth very much if it is not proven to improve clinical outcomes30. Among the studies that have gone through peer review (many of which are summarized in Table 1), the only prospective validation studies in a real-world setting have been for diabetic retinopathy31,32, detection of wrist fractures in the emergency room setting33, histologic breast cancer metastases34,35, very small colonic polyps36,37, and congenital cataracts in a small group of children38. The field clearly is far from demonstrating very high and reproducible machine accuracy, let alone clinical utility, for most medical scans and images in the real-world clinical environment (Table 1).
Pathology. Pathologists have been much slower than radiologists at adopting digitization of scans39: they are still not routinely converting glass slides to digital images and using whole-slide imaging (WSI), which enables viewing of an entire tissue sample on a slide. Marked heterogeneity and inconsistency among pathologists' interpretations of slides has been amply documented, exemplified by a lack of agreement in diagnosis of common types of lung cancer (κ = 0.41–0.46)40.

Fig. 1 | A deep neural network, simplified, with an input layer, hidden layers, and an output layer. Credit: Debbie Maizels/Springer Nature
Deep learning of digitized pathology slides offers the potential to improve the accuracy and speed of interpretation, as assessed in a few retrospective studies. In a study of WSI of breast cancer, with or without lymph node metastases, that compared the performance of 11 pathologists with that of multiple algorithmic interpretations, the results varied and were affected in part by the length of time that the pathologists had to review the slides41. Some of the five algorithms performed better than the group of pathologists, who had varying expertise. The pathologists were given 129 test slides and had less than 1 minute for review per slide, which likely does not reflect normal workflow. On the other hand, when one expert pathologist had no time limits and took 30 hours to review the same slide set, the results were comparable with the algorithm for detecting noninvasive ductal carcinoma42.
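Whole-slide images are far too large to feed to a network directly, so studies of this kind typically tile the slide into small patches, score each patch, and aggregate the patch scores into a slide-level call. The sketch below illustrates that general patch-and-aggregate pattern with a placeholder classifier; the tile size, model, and max-probability aggregation are illustrative assumptions, not the pipeline of any specific study cited here.

```python
# Schematic patch-based analysis of a whole-slide image (WSI):
# tile the gigapixel slide, score each tile, aggregate to a slide-level score.
# The classifier and parameters are placeholders, not a published pipeline.
import numpy as np
import torch
import torch.nn as nn

TILE = 256  # pixels per side of each patch (illustrative)

def tile_slide(slide: np.ndarray, tile: int = TILE):
    """Yield non-overlapping RGB patches from a slide array of shape (H, W, 3)."""
    h, w, _ = slide.shape
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            yield slide[y:y + tile, x:x + tile, :]

# Placeholder patch classifier: probability that a patch contains tumor.
patch_model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(8, 1), nn.Sigmoid(),
)

def slide_score(slide: np.ndarray) -> float:
    """Slide-level tumor probability = max over patch probabilities (one common rule)."""
    probs = []
    with torch.no_grad():
        for patch in tile_slide(slide):
            x = torch.from_numpy(patch).float().permute(2, 0, 1).unsqueeze(0) / 255.0
            probs.append(patch_model(x).item())
    return max(probs) if probs else 0.0

# Example with a synthetic 'slide' (a real WSI would be read with a library such as OpenSlide).
fake_slide = np.random.randint(0, 256, size=(1024, 1024, 3), dtype=np.uint8)
print(f"slide-level tumor probability: {slide_score(fake_slide):.2f}")
```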
Table 1 | Peer-reviewed publications of AI algorithms compared with doctors

Specialty | Images | Publication
Radiology/neurology | CT head, acute neurological events | Titano et al.27
Radiology/neurology | CT head for brain hemorrhage | Arbabshirani et al.19
Radiology/neurology | CT head for trauma | Chilamkurthy et al.20
Radiology/neurology | CXR for metastatic lung nodules | Nam et al.8
Radiology/neurology | CXR for multiple findings | Singh et al.7
Radiology/neurology | Mammography for breast density | Lehman et al.26
Radiology/neurology | Wrist X-ray* | Lindsey et al.9
Pathology | Breast cancer | Ehteshami Bejnordi et al.41
Pathology | Lung cancer (+ driver mutation) | Coudray et al.33
Pathology | Brain tumors (+ methylation) | Capper et al.45
Pathology | Breast cancer metastases* | Steiner et al.35
Pathology | Breast cancer metastases | Liu et al.34
Dermatology | Skin cancers | Esteva et al.47
Dermatology | Melanoma | Haenssle et al.48
Dermatology | Skin lesions | Han et al.49
Ophthalmology | Diabetic retinopathy | Gulshan et al.51
Ophthalmology | Diabetic retinopathy* | Abramoff et al.31
Ophthalmology | Diabetic retinopathy* | Kanagasingam et al.32
Ophthalmology | Congenital cataracts | Long et al.38
Ophthalmology | Retinal diseases (OCT) | De Fauw et al.56
Ophthalmology | Macular degeneration | Burlina et al.52
Ophthalmology | Retinopathy of prematurity | Brown et al.60
Ophthalmology | AMD and diabetic retinopathy | Kermany et al.53
Gastroenterology | Polyps at colonoscopy* | Mori et al.36
Gastroenterology | Polyps at colonoscopy | Wang et al.37
Cardiology | Echocardiography | Madani et al.23
Cardiology | Echocardiography | Zhang et al.24

Prospective studies are denoted with an asterisk.
Table 2 | FDA AI approvals are accelerating

Company | FDA approval | Indication
Apple | September 2018 | Atrial fibrillation detection
Aidoc | August 2018 | CT brain bleed diagnosis
iCAD | August 2018 | Breast density via mammography
Zebra Medical | July 2018 | Coronary calcium scoring
Bay Labs | June 2018 | Echocardiogram EF determination
Neural Analytics | May 2018 | Device for paramedic stroke diagnosis
IDx | April 2018 | Diabetic retinopathy diagnosis
Icometrix | April 2018 | MRI brain interpretation
Imagen | March 2018 | X-ray wrist fracture diagnosis
Viz.ai | February 2018 | CT stroke diagnosis
Arterys | February 2018 | Liver and lung cancer (MRI, CT) diagnosis
MaxQ-AI | January 2018 | CT brain bleed diagnosis
AliveCor | November 2017 | Atrial fibrillation detection via Apple Watch
Arterys | January 2017 | MRI heart interpretation
Other studies have assessed deep-learning algorithms for classifying breast cancer43 and lung cancer40 without direct comparison with pathologists. Brain tumors can be challenging to subtype, and machine learning using tumor DNA methylation patterns determined by sequencing led to markedly improved classification compared with pathologists using traditional histological data44,45. DNA methylation generates extensive data and at present is rarely performed in the clinic for classification of tumors, but this study suggests another potential for AI to provide improved diagnostic accuracy in the future. A deep-learning algorithm for lung cancer digital pathology slides not only was able to accurately classify tumors, but also was trained to detect the pattern of several specific genomic driver mutations that would not otherwise be discernible by pathologists33.

The first prospective study to test the accuracy of an algorithm classifying digital pathology slides in a real clinical setting was an assessment of the identification of breast cancer micrometastases in slides by six pathologists compared with a DNN (which had been retrospectively validated34). The combination of pathologists and the algorithm led to the best accuracy, and the algorithm markedly sped up the review of slides35. This study is particularly notable, as the synergy of the combined pathologist-and-algorithm interpretation was emphasized instead of the pervasive clinician-versus-algorithm comparison. Apart from classifying tumors more accurately by data processing, the use of a deep-learning algorithm to sharpen out-of-focus images may also prove useful46. A number of proprietary algorithms for image interpretation have been approved by the Food and Drug Administration (FDA), and the list is expanding rapidly (Table 2), yet there have been few peer-reviewed publications from most of these companies. In 2018, the FDA published a fast-track approval plan for AI medical algorithms.
Dermatology. For algorithms classifying skin cancer by image analysis, the accuracy of diagnosis by deep-learning networks has been compared with that of dermatologists. In a study using a large training dataset of nearly 130,000 photographic and dermascopic digitized images, 21 US board-certified dermatologists were at least matched in performance by an algorithm, which had an AUC of 0.96 for carcinoma47 and of 0.94 for melanoma specifically. Subsequently, the accuracy of melanoma skin cancer diagnosis by a group of 58 international dermatologists was compared with that of a convolutional neural network; the mean ROC areas were 0.79 and 0.86, respectively, reflecting improved performance of the algorithm compared with most of the physicians48. A third study carried out algorithmic assessment of 12 skin diseases, including basal cell carcinoma, squamous cell carcinoma, and melanoma, and compared this with 16 dermatologists, with the algorithm achieving an AUC of 0.96 for melanoma49. None of these studies were conducted in the clinical setting, in which a doctor would perform physical inspection and shoulder responsibility for making an accurate diagnosis. Notwithstanding these concerns, most skin lesions are diagnosed by primary care doctors, and problems with inaccuracy have been underscored; if AI can be reliably shown to simulate experienced dermatologists, that would represent a significant advance.
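Skin-lesion classifiers of this kind are commonly built by transfer learning: a convolutional network pretrained on a general image corpus is reused, and its final layer is retrained on dermatology images. The sketch below shows that general recipe with torchvision's pretrained ResNet-18; the dataset folder, class count, and hyperparameters are placeholder assumptions, and this is not the specific model from any of the studies above.

```python
# Transfer learning for skin-lesion photographs (generic recipe, illustrative only):
# reuse an ImageNet-pretrained CNN and retrain its final layer on a small labeled set.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Standard ImageNet preprocessing expected by the pretrained backbone.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

# Hypothetical folder of labeled lesion photos, e.g. lesions/benign/*.jpg, lesions/melanoma/*.jpg
train_set = datasets.ImageFolder("lesions/", transform=preprocess)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False                        # freeze the pretrained feature extractor
model.fc = nn.Linear(model.fc.in_features, 2)      # new output layer: benign vs. melanoma

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):                             # brief fine-tuning of the new head only
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```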
Ophthalmology. There have been a number of studies comparing performance between algorithms and ophthalmologists in diagnosing different eye conditions.
After training with over 128,000 retinal fundus photographs labeled by 54 ophthalmologists, a neural network was used to assess over 10,000 retinal fundus photographs from more than 5,000 patients for diabetic retinopathy, and the neural network's grading was compared with seven or eight ophthalmologists for all-cause referable diagnoses (moderate or worse retinopathy or macular edema; scale: none, mild, moderate, severe, or proliferative). In two separate validation sets, the AUC was 0.99 (refs. 50,51). In a study in which retinal fundus photographs were used for the diagnosis of age-related macular degeneration (AMD), the accuracy for DNN algorithms ranged between 88% and 92%, nearly as high as for expert ophthalmologists52. Performance of a deep-learning algorithm for interpreting retinal optical coherence tomography (OCT) was compared with ophthalmologists for diagnosis of either of the two most common causes of vision loss: diabetic retinopathy or AMD. After the algorithm was trained on a dataset of over 100,000 OCT images, validation was performed in 1,000 of these images, and performance was compared with six ophthalmologists. The algorithm's AUC for OCT-based urgent referral was 0.999 (refs. 53–55).
Another deep-learning OCT retinal study went beyond the diagnosis of diabetic retinopathy or macular degeneration. A group of 997 patients with a wide range of 50 retinal pathologies was assessed for urgent referral by an algorithm (using two different types of OCT devices that produce 3-D images), and the results were compared with those from experts: four retinal specialists and four optometrists. The AUC for accuracy of urgent-referral triage was 0.992, and the algorithm did not miss a single urgent referral case. Notably, the eight clinicians agreed on only 65% of the referral decisions. Errors in the referral decision were reduced for both types of clinicians by integrating the fundus photograph and notes on the patient, but the algorithm's error rate (without notes or fundus photographs) of 3.5% was as good as or better than that of all eight experts56. One unique aspect of this study was the transparency of the two neural networks used: one for mapping the eye OCT scans into a tissue schematic and the other for classifying eye disease. The user (patient) can watch a video that shows which portions of his or her scan were used to reach the algorithm's conclusions, along with the level of confidence it has in the diagnosis. This sets a new bar for future efforts to unravel the 'black box' of neural networks.
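The two-network design described above (one network producing an intermediate tissue map, a second network classifying disease from that map) can be sketched generically as below. This is a schematic of the general decoupled architecture with invented shapes and class counts, not the authors' actual model; the point is that the intermediate segmentation is inspectable, which is what gives the approach its transparency.

```python
# Schematic two-stage pipeline: OCT volume -> tissue segmentation map -> referral class.
# Shapes, class counts, and layers are invented; this is not the published architecture.
import torch
import torch.nn as nn

N_TISSUE = 5    # hypothetical tissue classes per voxel (e.g., fluid, retinal layers, background)
N_REFERRAL = 4  # hypothetical referral decisions (e.g., urgent, semi-urgent, routine, observe)

# Stage 1: map the raw scan to a per-voxel tissue probability map (the inspectable intermediate).
segmenter = nn.Sequential(
    nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv3d(16, N_TISSUE, kernel_size=1),
)

# Stage 2: classify the referral decision from the tissue map, not from raw pixels.
classifier = nn.Sequential(
    nn.Conv3d(N_TISSUE, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool3d(1), nn.Flatten(),
    nn.Linear(16, N_REFERRAL),
)

scan = torch.randn(1, 1, 32, 64, 64)          # one synthetic OCT volume
tissue_map = segmenter(scan).softmax(dim=1)   # intermediate output a clinician could review
referral_probs = classifier(tissue_map).softmax(dim=1)
print(referral_probs)
```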
In a prospective trial conducted in primary care clinics, 900 patients with diabetes but no known retinopathy were assessed by a proprietary system made by IDx (Iowa City, IA), an imaging device combined with an algorithm that obtained retinal fundus photographs and OCT, and by established reading centers with expertise in interpreting these images30,31. The algorithm had been autodidactic up until the clinical trial, at which point it was locked for testing; it achieved a sensitivity of 87% and specificity of 91% for the 819 patients (91% of the enrolled cohort) with analyzable images. This trial led to FDA approval of the IDx device and algorithm for autonomous detection, that is, without the need for a clinician, of 'more than mild' diabetic retinopathy. Regulatory oversight of deep-learning algorithms is tricky because it does not currently allow continued autodidactic functionality but instead necessitates fixing the software to behave like a non-AI diagnostic system30. Notwithstanding this point, along with the unknown extent of uptake of the device, the study represents a milestone as the first prospective assessment of AI in the clinic. The accuracy results are not as good as those of the aforementioned in silico studies, which should be anticipated. A small prospective real-world assessment of a DNN for diabetic retinopathy in primary care clinics, with eye exams performed by nurses, led to a high false-positive diagnosis rate32.
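Sensitivity and specificity figures like the 87% and 91% above come straight from the trial's confusion matrix, and the same counts also give the positive and negative predictive values that matter at a given disease prevalence. The short sketch below shows the arithmetic with invented counts (they are not the actual trial data, only numbers chosen to produce a similar sensitivity and specificity in a cohort of 819).

```python
# Sensitivity, specificity, PPV, and NPV from a 2x2 confusion matrix.
# The counts are invented for illustration; they are not the trial's actual data.
tp, fn = 68, 10     # patients with referable retinopathy: detected vs. missed
tn, fp = 674, 67    # patients without it: correctly cleared vs. falsely flagged

sensitivity = tp / (tp + fn)   # proportion of true disease the test catches
specificity = tn / (tn + fp)   # proportion of healthy patients correctly cleared
ppv = tp / (tp + fp)           # chance that a positive result is truly diseased
npv = tn / (tn + fn)           # chance that a negative result is truly healthy

print(f"n = {tp + fn + tn + fp}")
print(f"sensitivity = {sensitivity:.0%}, specificity = {specificity:.0%}")
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

Note how, with these illustrative counts, a specificity of 91% still yields a PPV of only about 50% at this prevalence, which is one reason prospective, real-world evaluation matters beyond a headline AUC.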
While the studies of retinal OCT and fundus images have thus far focused on eye conditions, recent work suggests that these images can provide a window to the brain for early diagnosis of dementia, including Alzheimer's disease57.

The potential use of retinal photographs also appears to transcend eye diseases per se. Images from over 280,000 patients were assessed by DNN for cardiovascular risk factors, including age, gender, systolic blood pressure, smoking status, hemoglobin A1c, and likelihood of having a major adverse cardiac event, with validation in two independent datasets. The AUC for gender at 0.97 was notable, indicating that the algorithm could identify gender accurately from the retinal photo, but the others were in the range of 0.70, suggesting that there may be a signal that, through further pursuit, could be useful for monitoring patients for control of their risk factors58,59.

Other less common eye conditions that have been assessed by neural networks include congenital cataracts38 and retinopathy of prematurity in newborns60, both with accuracy comparable with that of eye specialists.
Cardiology. The major images that cardiologists use in practice are electrocardiograms (ECGs) and echocardiograms, both of which have been assessed with DNNs. There is a nearly 40-year history of machine-read ECGs using rules-based algorithms with notable inaccuracy61. When deep learning was used to diagnose heart attack in a small retrospective dataset of 549 ECGs, a sensitivity of 93% and specificity of 90% were reported, which was comparable with cardiologists62. Over 64,000 single-lead ECGs (from over 29,000 patients) were assessed for arrhythmia by a DNN and six cardiologists, with comparable accuracy across 14 different electrical conduction disturbances63. For echocardiography, a small set of 267 patient studies (consisting of over 830,000 still images) was classified into 15 standard views (such as apical 4-chamber or subcostal) by a DNN and by cardiologists. The overall accuracy for single still images was 92% for the algorithm and 79% for four board-certified echocardiographers, but this does not reflect the real-world reading of studies, which are in-motion video loops23. An even larger retrospective study of over 8,000 echocardiograms showed high accuracy for classification of hypertrophic cardiomyopathy (AUC, 0.93), cardiac amyloid (AUC, 0.87), and pulmonary artery hypertension (AUC, 0.85)24.
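Rhythm classification from a single-lead ECG is usually framed as one-dimensional convolution over the voltage time series, in contrast to the two-dimensional convolutions used for images. Below is a minimal sketch of that framing; the sampling rate, window length, layer sizes, and 14-class output are illustrative assumptions rather than the architecture of the study cited above.

```python
# Minimal 1-D convolutional classifier for single-lead ECG rhythm strips.
# Window length, layers, and the 14 rhythm classes are illustrative, not a published model.
import torch
import torch.nn as nn

N_CLASSES = 14          # e.g., sinus rhythm, atrial fibrillation, other conduction disturbances
SAMPLES = 3000          # e.g., a 10-second strip sampled at 300 Hz

model = nn.Sequential(
    nn.Conv1d(1, 32, kernel_size=16, stride=2, padding=8), nn.ReLU(),
    nn.Conv1d(32, 64, kernel_size=16, stride=2, padding=8), nn.ReLU(),
    nn.Conv1d(64, 64, kernel_size=16, stride=2, padding=8), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(64, N_CLASSES),
)

ecg = torch.randn(8, 1, SAMPLES)     # a batch of 8 single-lead strips
logits = model(ecg)                  # shape (8, 14): one score per rhythm class
predicted_rhythm = logits.argmax(dim=1)
print(predicted_rhythm)
```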
Gastroenterology. Finding diminutive (<5 mm) adenomatous or sessile polyps at colonoscopy can be exceedingly difficult for gastroenterologists. The first prospective clinical validation of AI for this task was performed in 325 patients who collectively had 466 tiny polyps, with an accuracy of 94% and a negative predictive value of 96% during real-time, routine colonoscopy36,64. The speed of AI optical diagnosis was 35 seconds, and the algorithm worked equally well for both novice and expert gastroenterologists, without the need for injecting dyes. The findings of enhanced speed and accuracy were replicated in another independent study37. Such results are thematic: machine vision, at high magnification, can accurately and quickly interpret specific medical images as well as or better than humans.
Mental health. The enormous burden of mental health conditions, such as the 350 million people around the world battling depression74, is especially noteworthy, as there is potential here for AI to lend support both to affected patients and to the vastly insufficient number of clinicians. Tools in development include digital tracking of depression and mood via keyboard interaction, speech, voice, facial recognition, sensors, and interactive chatbots75–80. Facebook posts have been shown to predict the diagnosis of depression later documented in electronic medical records81.

Machine learning has also been explored for predicting successful antidepressant medication82, characterizing depression83–85, predicting suicide83,86–88, and predicting bouts of psychosis in patients with schizophrenia89.
Fig. 2 | Examples of AI applications across the human lifespan, including embryo selection for IVF; genome interpretation for sick newborns; a voice medical coach via a smart speaker (like Alexa); K+ monitoring; mental health; paramedic dx of heart attack or stroke; assisted reading of scans, slides, and lesions; preventing blindness; classifying cancer and identifying mutations; promoting patient safety; and predicting in-hospital death. dx, diagnosis; IVF, in vitro fertilization; K+, potassium blood level. Credit: Debbie Maizels/Springer Nature
The use of AI algorithms has been described in many other clinical settings, such as facilitating stroke, autism, or electroencephalographic diagnoses for neurologists65,66, helping anesthesiologists avoid low oxygenation during surgery67, diagnosis of stroke or heart attack for paramedics68, finding suitable clinical trials for oncologists69, selecting viable embryos for in vitro fertilization70, helping make the diagnosis of a congenital condition via facial recognition71, and pre-empting surgery for patients with breast cancer72. Examples of the breadth of AI applications across the human lifespan are shown in Fig. 2. There is considerable effort across many startups and established tech companies to develop natural language processing to replace the need for keyboards and human scribes for clinic visits73. The list of companies active in this space includes Microsoft, Google, Suki, Robin Healthcare, DeepScribe, Tenor.ai, Saykara, Sopris Health, Carevoice, Orbita, Notable, Sensely, and Augmedix.

Artificial intelligence and health systems
Being able to predict key outcomes could, theoretically, make the use of hospital palliative care resources more efficient and precise. For example, if an algorithm could be used to estimate the risk of a patient's hospital readmission that would otherwise be undetectable given the usual clinical criteria for discharge, steps could be taken to avert discharge and attune resources to the underlying issues. For a critically ill patient, knowing there is a very high likelihood of short-term survival might help that patient and their family and doctor make decisions regarding resuscitation, insertion of an endotracheal tube for mechanical ventilation, and other invasive measures. Similarly, it is possible that deciding which patients might benefit from palliative care and determining who is at risk of developing sepsis or septic shock could be aided by AI predictive tools. Using electronic health record data, machine- and deep-learning algorithms have been able to predict many important clinical parameters, ranging from Alzheimer's disease to death (Table 3)86,90–107. For example, in a recent study, reinforcement learning was retrospectively carried out on two large datasets to recommend the use of vasopressors, intravenous fluids, and/or medications and the dose of the selected treatment for patients with sepsis; the treatment selected by the 'AI Clinician' was on average reliably more effective than that chosen by humans108. Both the size of the cohorts studied and the range of AUC accuracy reported have been quite heterogeneous, and all of these reports are retrospective and yet to be validated in the real-world clinical setting. Nevertheless, there are many companies that are already marketing such algorithms, such as Careskore, which provides health systems with estimates of the risk of readmission and mortality based on EHR data109. Beyond this issue, there are the differences between the prediction metric for a cohort and an individual prediction metric. If a model's AUC is 0.95, which most would qualify as very accurate

In addition to data from electronic health records, imaging has been integrated to enhance predictive accuracy98. Multiple studies have attempted to predict biological age110,111, and this has been shown to best be accomplished using DNA methylation–based biomarkers112. With respect to the accuracy of algorithms for prediction of biological age, the incompleteness of data input is noteworthy, since a large proportion of unstructured data (the free text in clinician notes that cannot be ingested from the medical record) has not been incorporated, and neither have many other modalities such as socioeconomic, behavioral, biologic '-omics', or physiologic sensor data. Further, concerns have been raised about the potential
