It’s 2019 and your car can drive itself most of the way to your doctor’s office, but once there, you will be handed a clipboard with a paper form asking you for your name, date of birth, and insurance information. Then you will wait to be seen by a doctor, who will spend your visit facing a screen transcribing your spoken complaint into the EMR, and then ask you where you’d like your lab results faxed.
How can it be that technology is making such huge strides in some areas of our lives, while others are seemingly stuck in the last century? When will healthcare have its self-driving moment?
The Promise of Self-Driving Cars
Self-driving cars are a great example of how computer science has been applied to solve difficult real-world problems. It was less than 15 years ago that the computer science community celebrated the first autonomous vehicles successfully completing a 130-mile course in under 10 hours as part of the DARPA Grand Challenge. Most of the heavily funded university research teams that entered used traditional programming techniques. The winner, Stanford University, stood out for using machine learning to train its vehicle by example rather than hand-writing the if-then-else code.
Since then, machine learning generally, and deep neural networks in particular, have proven to be unreasonably effective at solving problems with huge and highly complex inputs, like image recognition, sensor fusion, and traffic prediction, among others. Companies like Waymo, Volvo, Uber, and Tesla have been pouring money into the autonomous vehicle space and making rapid progress. Many cars sold today come with some level of driver assistance, like lane keeping and collision avoidance, and Tesla vehicles even come with a “Full Self Driving” option.
Machine Learning in Healthcare
So, what about healthcare? A person’s health is a highly complex function of genetics, medicine, diet, exercise, and a number of other lifestyle factors. In the same way you make thousands of little steering corrections to stay in a lane, you make thousands of choices each day that affect your susceptibility to disease, your quality of life, and your longevity. Can the same toolset that helps cars drive themselves help us build good predictive models for health and healthcare?
There have certainly been efforts, including some high-profile failures. One big limitation is the data. On the one hand, healthcare is awash in data. Some claim it won’t even fit in the cloud (spoiler: it will). But much of the data in healthcare today is locked up in EMR systems. Once you’ve liberated it from the EMR, the next problem is that it’s not a great input for machine learning algorithms. A recent study in JAMA on applications of ML in healthcare found that EMR data had major data quality issues, and that models learned on one EMR’s dataset were not transferable to another’s, severely limiting the portability of models. Imagine trying to build a self-driving car with partial and incompatible maps from each city and you’ll start to understand the problem with using EMR data to train ML models.
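To make the incompatible-maps analogy concrete, here is a toy sketch in Python. The field names, codes, and records are entirely invented (not drawn from any real EMR export), but they illustrate the shape of the problem: two systems describe the same patient fact under different names, encodings, and units, so a model trained on one export sees gibberish in the other unless everything is first normalized into a shared schema.

```python
# Toy illustration (hypothetical field names and units) of why a model
# trained on one EMR's export often fails on another's: the same clinical
# fact arrives under different names, codes, and units.

# "EMR A" exports weight in kilograms and smoking status as Y/N...
record_a = {"patient_id": "A-001", "wt_kg": 82.5, "smoker": "Y"}

# ...while "EMR B" exports weight in pounds under a different field name
# and encodes tobacco use as an integer flag.
record_b = {"pid": "B-042", "weight_lbs": 181.9, "tobacco_use": 1}

def normalize(record):
    """Map either export into one shared schema (kg, boolean flags)."""
    if "wt_kg" in record:  # EMR A layout
        return {
            "patient_id": record["patient_id"],
            "weight_kg": record["wt_kg"],
            "smoker": record["smoker"] == "Y",
        }
    else:  # EMR B layout
        return {
            "patient_id": record["pid"],
            "weight_kg": round(record["weight_lbs"] * 0.453592, 1),
            "smoker": bool(record["tobacco_use"]),
        }

print(normalize(record_a))
print(normalize(record_b))
```

Hand-maintaining a `normalize` like this for every pair of systems is exactly the per-city map problem: it works for two toy records, but it doesn’t scale across hundreds of EMR installations with drifting schemas.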
All The Wrong Data
But more important than this: even if the EMR data were clean and consistent, a big piece of the puzzle would still be missing: data about the person when they’re not in the doctor’s office. We know that a person’s health is influenced largely by their lifestyle, diet, and genetics. But we mostly don’t have good datasets for these factors yet.
You can’t build a self-driving car no matter how many fluid-level measurements and shop records you have: “I don’t know why that one crashed, its oil was changed just 4 weeks ago. And with a fresh air filter too!” You also can’t build meaningful healthcare ML models with today’s EMR-bound datasets. Sure, there will be some progress on constrained problems like billing optimization, workflow, and diagnostics (particularly around imaging), but big “change the world” progress will fundamentally require a better dataset.
There are efforts to begin collecting parts of this dataset, with projects like Verily’s Project Baseline and the recently failed Arivale. Baseline and efforts like it will take years or decades to come to fruition, as they track how decisions made today affect a person’s health many years down the line.
On a more modest scale, at Wellpepper we believe that driving high-quality, patient-centered interactions outside the clinic is a major key to unlocking improved health outcomes. That belief led us to start by building a communication channel between patients and their care providers to help patients follow their care plans. Using Wellpepper, providers can assign care plans, and patients can follow along at home, record health measures, track symptoms, and send messages. Collecting this data in a structured way opens the door to understanding and improving these interactions over time.
For example, using regression analysis we identified patterns in post-surgical side effects that indicate a 3x higher risk of readmission. More recently, we trained a machine-learned classifier for unstructured patient messages that helps urgent messages get triaged faster. And this is just scratching the surface. Since this kind of patient-centric data from outside the clinic is new, we expect a large greenfield of discovery as we collect more data across more patient care scenarios.
Better patient-centric data, combined with state-of-the-art machine learning algorithms, holds huge promise for healthcare. But we need to invest in collecting the right patient-centric datasets rather than relying on the data that happens to be lying around in the EMR.