AI outplays physicians at informing patients and feeling their pain too

Going toe-to-toe online with physicians answering questions from real-world patients, a generative AI chatbot gave the better medical advice almost 80% of the time.

What’s more, response evaluators consistently scored the technology (yes, a ChatGPT model) higher than the doctors for empathy and, with it, bedside manner.

The online competition was conducted by John W. Ayers, PhD, and colleagues at UC-San Diego. It’s described in JAMA Internal Medicine. Here’s more.

  • Evaluators didn’t know which responses came from which contender, ChatGPT or a verified doctor. The research team used text from around 200 randomly sampled one-to-one exchanges on Reddit’s r/AskDocs forum.
     
  • The blinded evaluator panel consisted of four physicians and one nurse practitioner. Together they brought experience in pediatrics, geriatrics, internal medicine, oncology, infectious disease and preventive medicine.
     
  • For each blinded Q&A pair, the judges picked one response as the “better” of the two. They rated information quality on a 5-point scale—very good, good, acceptable, poor or very poor—and did the same for perceived sensitivity to patients’ feelings (very empathetic, empathetic, moderately empathetic, slightly empathetic or not empathetic).
     
  • For overall informational quality (accuracy, thoroughness, usability), chatbot responses topped those from physicians in 78.6% of 585 evaluations. Indeed, the chatbot earned ratings of “good” or “very good” 3.6 times more often than the physicians did.
     
  • The chatbot trounced the doctors at making patients feel heard. The panel deemed the chatbot empathetic or very empathetic at a rate some 9.8 times higher than the physicians managed.

The authors conclude:

“Further exploration of this technology is warranted in clinical settings, such as using [a ChatGPT] chatbot to draft responses that physicians could then edit. Randomized trials could assess further if using AI assistants might improve responses, lower clinician burnout and improve patient outcomes.”

The full study appears in JAMA Internal Medicine, and UC-San Diego’s news team has published its own coverage.

Dave Pearson

Dave P. has worked in journalism, marketing and public relations for more than 30 years, frequently concentrating on hospitals, healthcare technology and Catholic communications. He has also specialized in fundraising communications, ghostwriting for CEOs of local, national and global charities, nonprofits and foundations.