Data scientists, doctors, and marketing departments disagree on the potential of artificial intelligence for medicine.
When William James Mayo wrote in 1928 that the ideal of medicine is to eliminate the need for a physician, he probably wasn’t thinking about artificial intelligence (AI)—it would be nearly 30 years before the term was coined. As machines grow more and more capable, many believe that Mayo’s ideal could come at the hands of an algorithm.
A joint survey from researchers at Oxford and Yale found that many scientists believe the arrival of high-level machine intelligence (HLMI)—the point in time when unaided machines can accomplish tasks better and more cheaply than humans—isn’t far off.
AI researchers believe HLMI won’t replace them for at least 80 years, but their aggregate consensus was that other professions might not last as long. They believe surgeons could be replaced by computers in about 45 years, and that an AI-generated best-selling novel is even closer to reality.
PR departments and tech journalists are championing the impending AI revolution in healthcare, but voices inside the health industry disagree on whether technology can master the art of medicine.
The Peak of Expectations
Stanford’s Jonathan H. Chen, MD, PhD, says he has seen his colleagues embrace the hype. “I’m hearing very smart people buying into these ideas, and they don’t quite know what they’re buying into,” he told Healthcare Analytics News™.
In June, Chen and his Stanford colleague Steven M. Asch, MD, MPH, published a commentary in the New England Journal of Medicine (NEJM) that struck a measured tone on AI’s role in medicine. The piece embraces the potential of big data, AI, and machine learning for medicine, but spotlights a series of sticking points that often go unmentioned. They argue that medicine is at the peak of inflated expectations and should avoid a dive into the “trough of disillusionment” by acknowledging limitations alongside capabilities.
“Whether such artificial intelligence systems are ‘smarter’ than human practitioners makes for a stimulating debate—but is largely irrelevant,” they wrote.
Even more problematic for AI is when it can’t deliver on its high expectations. “When overhype and promises don’t get realized, there’s a backlash afterwards...It becomes harder to invest in the longer-term work because people shy away if they feel like they didn’t get what was promised. People just don’t have the right expectations at times,” Chen explained. “I have heard people say ‘Oh, we have so much data, our algorithms are so advanced…pretty soon we’ll be able to predict anything that’s going to happen to patients. Ten years in the future we’ll know what’s going to happen to them.’ That’s where I just go, ‘No, I’m sorry, but you can’t.’”
Hype, Disappointment, and Backlash
The best-known engine of this hype has been Watson. Healthcare organizations have hitched their wagons to IBM’s supercomputer, hoping that the first and only nonhuman Jeopardy! champion will help find better, cheaper routes for diagnosing and treating cancer and other complex diseases. In some cases, it has paid off: the partnership between Memorial Sloan Kettering Cancer Center (MSK) and IBM on Watson for Oncology has developed into a commercial platform that has spread to other hospitals worldwide, although only 1 other US hospital system has adopted it.
But there’s a caveat to that. Critics accuse the platform of treatment bias, because Watson for Oncology is entirely trained at MSK in New York. The regimens that oncologists at one American institution recommend to their (often wealthier) patient base do not necessarily represent the best course of treatment for international patients and doctors.
Other situations have been worse. Just days before the annual Healthcare Information and Management Systems Society meeting, during which IBM was set to announce new hospital partnerships, it was revealed that Watson’s collaboration with The University of Texas MD Anderson Cancer Center in Houston had been terminated in 2016. The project, which the 2 organizations agreed to develop in 2012, had always been presented as part of MD Anderson’s Moon Shots program, but it turned into a costly debacle. An internal University of Texas audit found more than $62 million in expenditures with next to nothing to show for them. IBM had quietly ceased support of the project beforehand, in September of 2016.
After the program’s failure, media coverage was excoriated for adding to inflated expectations. Health News Review, a health journalism watchdog site, looped in everyone from Forbes to Scientific American as complicit in pushing a narrative that Watson was somehow closer to providing actionable cancer treatment advice than it actually was.
Jonathan Chen pointed to the situation as another motivating factor in penning the NEJM commentary. “I actually like what IBM Watson is doing,” he said, but the situation shows that the backlash is already starting to happen. “Call it a market correction…we’re seeing that there’s great potential and the vision is awesome, but it doesn’t mean it’s easy, and there will be missteps along the way. It really exposed what’s true in most of these cases.”
Venture capitalist Chamath Palihapitiya created a stir earlier this year when he blasted the brand on CNBC, calling it a joke. “The companies that are advancing machine learning and AI don’t brand it with some nominally specious name after a Sherlock Holmes character,” he said, indicating that it was perhaps the best-marketed AI platform, but not necessarily the most advanced.
In a statement, IBM sniped back that “Watson is not a consumer gadget but the AI platform for real business. Watson is in clinical use in the [United States] and 5 other countries. It has been trained on 6 types of cancers with plans to add 8 more this year,” closing with, “Does any serious person consider saving lives, enhancing customer service, and driving business innovation a joke?”
The Challenges Ahead
Watson’s major challenge lies in finding a way to make its work actionable. Healthcare’s electronic medical record (EMR) and siloing headaches are a well-trod topic, and they were also another factor in the dissolution of MD Anderson’s Watson program. A change in EMR providers during development left the AI platform unable to integrate with the hospital’s new data.
Experts point to data sharing and quality as major obstacles to AI’s medical potential. When DeepMind found itself under legal scrutiny for improperly possessing 1.6 million patient records from the United Kingdom’s National Health Service (NHS), it indicated that the data quality and consistency were inadequate for any meaningful application of its AI technologies anyway.
“We need to have consistency in the structured data we are using,” said Debra Patt, MD, of the US Oncology Network. She noted that companies such as Flatiron, which mines oncology information for clues to better treatment, must still employ human chart reviewers to pore through clinical information that is unstructured.
Even in a hypothetical world where health records are formalized, standardized, and universally trustworthy, there would still be lots of ground to cover, Chen said. “We can crunch data all day, and we do. We find the next algorithm that can predict this or predict that, and I can publish all sorts of research papers on it, and I’ll keep my job going,” he said. “The algorithm predicts that my patient has a high risk of death. OK, so what do I do? I think the patient needs to get some other type of treatment option, so how do I tell them that? How do I get them to take the medication? How do I follow up and make sure our intervention had the result we intended to? How do we monitor the patients afterwards and get the outcomes to continue to optimize the system?”
Glimmers of Precedent
For Morris Panner, the CEO of Ambra Health and a fervent healthcare futurist, the key is getting AI to go beyond merely executing the algorithms fed to it by its makers. “It’s obviously a utopian framework, but we’re starting to imagine a world where computers are now true scientific collaborators rather than devices that can augment work mechanically,” he explained. “When somebody comes up with a way to get computers to start finding new ideas to cure disease, that would start executing on Mayo’s vision of ridding the world of illness,” he added.
It is equally important that healthcare AI not be conflated with the aims, vision, successes, and failures of its most visible platform. Whether or not IBM can make something of its much-marketed thinking machine, AI for healthcare is poised for tremendous growth. The market was worth less than $1 billion worldwide in 2016, but recent reports project a compound annual growth rate of 40% or greater over the next 5 to 7 years, which would bring it to roughly 10 times its present value.
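As a rough check on that projection, the compounding arithmetic can be sketched in a few lines of Python. The function name here is illustrative, and the 40% rate and 5-to-7-year horizon are the only inputs taken from the figures above; this is back-of-the-envelope math, not the forecasters’ own model.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Factor by which a value grows at a constant compound annual growth rate."""
    return (1.0 + cagr) ** years


# Assumed inputs: a steady 40% annual rate over the 5-to-7-year window.
for years in (5, 6, 7):
    multiple = growth_multiple(0.40, years)
    print(f"{years} years at 40% CAGR: {multiple:.1f}x")
```

At a steady 40%, the market would reach about 5.4 times its starting size after 5 years and about 10.5 times after 7, so the tenfold figure corresponds to the long end of the range, or to a growth rate above 40%.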
That’s a big reason why upstarts and established tech giants are increasingly offering AI suites for medical applications. DeepMind Health, a London-based firm, is now owned by Google. In 2015, Intel got into the space by buying Saffron, a business-minded AI platform with a range of applications in addition to healthcare.
Of course, those alternatives aren’t without their hype. When Microsoft announced its own AI initiative, Project Hanover, in 2016, it claimed that its scientists and researchers were working to “solve” cancer—a lofty and ambiguous goal.
There are also the smaller, specific entries. CloudMedX gears its tech to look at the most common causes of death, including heart failure, chronic obstructive pulmonary disease, stroke, and diabetes. There’s also the Morpheo initiative, launched by French firm Rythm earlier this year to study sleep disorders. It’s a collaboration with academics, data scientists, and developers.
“IBM is trying to change its way, and paying a lot of money to market things properly. They are pushing things very much in the headlines, and there are problems with that right now,” Mathieu Galtier said in an interview. Galtier is head of algorithms and research and the project lead for Morpheo. He contrasted his group’s approach with those of larger American tech firms, saying that the collaborative, open-source nature of the project allows them to be “better attached to the real problems within the hospitals.”
The technology also makes heavy use of blockchain, which he said adds transparency and traceability. In turn, he hopes that will help the team validate its algorithms, breed trust, and gain access to more data. The techniques and algorithms will be applied to research beyond sleep in the coming year, he said, as Rythm looks to bring the same open-source, blockchain-constrained AI to oncology. The company is currently raising money for those efforts.
“For the average medical doctor, the artificial intelligence is going to come sooner rather than later. I strongly believe I will see it in my lifetime, a machine doing diagnosis and treatment,” Galtier said. “I think artificial intelligence is really promising in health, but still it needs to be orchestrated and conducted in a traceable, clean way.”
When given pure, unbiased data, time to learn, and a defined goal, AI systems can amaze. AlphaGo, for example, was a DeepMind project that could beat an accomplished human player at the ancient board game Go. Essentially, it developed its own approach.
Although Go is a game with confined rules, its possible outcomes are almost unfathomable: a finite number, but one many orders of magnitude larger than the number of possible outcomes in chess. The game is an imperfect analogue for the endless scenarios that a doctor might see when evaluating and treating a patient, but it is an encouraging benchmark for what algorithms can achieve.