Today’s go-to generative AI, namely ChatGPT-4, is pretty darned good at parsing out probable diseases in difficult-to-diagnose patient cases. Harvard researchers at Beth Israel Deaconess Medical Center came to that conclusion by challenging a ChatGPT bot to give differential diagnoses for 70 particularly problematic cases. The cases were previously selected for educating physicians at 2023 conferences by the New England Journal of Medicine. (These teaching resources are best known by shorthand as clinicopathologic conferences.)

Here are answers to five questions raised by the present research.

- What’s the deal with differential? Differential diagnoses generally use lists to name conditions that might be causing a puzzling set of signs and symptoms. The lists are typically arranged in order of most to least likely final diagnosis. The approach takes into account medical histories, lab results and imaging findings.
- How good is ChatGPT at this? In the present study, published June 15 in the Journal of the American Medical Association (JAMA), ChatGPT-4 came back with the same final diagnosis as expert physicians at a 39% clip (27 of 70 cases). Meanwhile, the technology included the final diagnosis in 64% of its differential lists (45 of 70 cases).
- What benchmarks exist to place the new findings in context? ChatGPT’s performance compares favorably with that of earlier differentiator tools based on natural language processing (NLP). The authors cite a 2022 study showing an impressive rate of correct final diagnoses, 58% to 68%, while noting that the forerunner’s only measure of quality was a “useful” vs. “not useful” binary. By comparison, in the present study, ChatGPT gave a “numerically superior mean differential quality score,” the Beth Israel Deaconess researchers report.
- How solid is the new evidence? The authors (Zahir Kanjee, MD, MPH; Byron Crowe, MD; and Adam Rodman, MD, MPH) acknowledge some limitations in their study design. These include a touch of subjectivity in the outcome metrics and a lack of some important diagnostic information in the patient cases due to protocol limitations. On the other hand, if anything, this deficiency probably showed up as an underestimation of the model’s capabilities, they suggest.
- What is the upshot? Generative AI is “a promising adjunct to human cognition in diagnosis,” the authors conclude.
Additional author commentary: “The model evaluated in this study, similar to some other modern differential diagnosis generators, is a diagnostic ‘black box.’ Future research should investigate potential biases and diagnostic blind spots of generative AI models.”
Clinicopathologic conferences like those from NEJM “are best understood as diagnostic puzzles,” the authors add. “Once privacy and confidentiality concerns are addressed, studies should assess performance with data from real-world patient encounters.”
Buzzworthy developments of the past few days.

- More than half of 10,000 adults in 13 countries not only know about generative AI but also have tried it. Or at least that’s what some 51% of respondents told surveyors who asked. So reports the Paris-headquartered Capgemini Research Institute, whose market researchers further found adoption of ChatGPT and its ilk “remarkably consistent across age groups and geographies, with over half of all generations, including Baby Boomers, having used the technology.” In healthcare, 67% of consumers around the world trust generative AI to offer helpful medical advice, Capgemini found. June 19 news release here, PDF of full report here.
- Using healthcare AI safely and efficaciously in Australia is going to require retraining the healthcare workforce. That’ll be best done while simultaneously retooling health services and transforming workflows. Only then will AI be sufficiently “medical grade” to justify broad adoption for real-world patient care Down Under. Citing a healthcare AI roadmap produced by the Australian Alliance for AI in Healthcare, three researchers make a compelling case for such restrictive guardrails in an opinion piece published by the Medical Journal of Australia and in a conversational essay posted in The Conversation.
- It’s tempting to view the U.S. vs. China ‘AI race’ as an echo of the space race between the U.S. and the U.S.S.R. That view is misguided. So suggest two writers with the Carnegie Endowment for International Peace. Airing out their thinking in Foreign Policy, the authors predict the probable winner of the AI race will be none other than AI itself. “As AI advances and diffuses throughout society,” they contend, “it will challenge the United States and its open society as much as—if not more than—China.”
- Could AI lull human brains into growing horribly lazy? It could. Some even say the technology is bound to make more than a few of us downright stupid. The British-American man of letters Simon Winchester is tuning out the doom-and-gloomers. Or maybe he’s putting them in their place. “I see today’s algorithmic revolution as a necessary cleansing, a movement by which we rid ourselves of all the accumulated bricolage of modern intellectual life, returning us to a more reasonable sound-to-noise ratio, gifting us with a renewed innocence, filled with potential,” he writes in The Guardian. “Fanciful though it may sound, this new-made post-AI society could even see the emergence of a new Euclid, a new Plato, a new Herodotus.” Read the whole thing.
- It’s an accumulating problem: AI chatbots are building their knowledge and expertise on copyrighted content and code belonging to actual people. Mercury News business reporter Ethan Baron breaks down the unfolding—and potentially industry-decimating—scenario from a vantage point in the heart of Silicon Valley.
- DiagnaMed (Toronto) has birthed Dr. GenAI. The chatty chatbot uses ChatGPT to help consumers who want to live healthier lives. Announcement.
- SDC (Boston) is making a splash with a machine learning tool for medical coding in clinical trials. The company, formerly known as Statistics & Data Corp., says the software has an accuracy rate north of 90% when used to predict coded terms for drugs and medical events.
- Bionik Labs (Boston) has installed its robots for stroke recovery support at four Lifepoint Health locations. The sites are in Arizona, California, Indiana and Kentucky. Details here.
- TheMathCompany of Chicago and Bengaluru, India, is partnering with Komodo Health in San Francisco. MathCo says it will tap Komodo’s Healthcare Map and AI know-how to “power a range of patient and provider use cases” for healthcare clients.