AI’s role in self-diagnosing: Is it reliable?

by Avika Anand
South Forsyth High School


How often have you researched the symptoms of the “life-threatening disease” AI diagnosed you with? You aren’t alone. According to a study indexed by the National Library of Medicine, out of a sample of 476 people, 78.4% were willing to turn to ChatGPT to diagnose themselves.

But is ChatGPT really a doctor? Can we trust a bot to dictate our medications and treatments? An exploratory study put OpenAI’s chatbot to the test, assessing how accurately it diagnosed various orthopedic conditions from lists of symptoms. It found that ChatGPT could diagnose some conditions with 100% accuracy, while its accuracy on others fell below 10%. Interestingly, as reported by a study in the Journal of Medical Internet Research, ChatGPT delivered incorrect answers with unwavering confidence, making itself seem more believable and “reliable.” A study conducted in Canada found that only 31% of ChatGPT’s answers to a set of medical questions drawn from a medical licensing examination were correct, and only 34% of the answers were clear or understandable to readers. An Australian study links these “understandable” responses to the omission of critical information, leading to misunderstandings about the user’s health.

Dr. Andrea Dabney, an OB/GYN based in Georgia, says, “Sometimes the search engines are in the right ballpark for some ideas of what it can be. But I haven’t ever found that they’re specific. It’s a physical exam as well as getting an actual test done that helps seal the diagnosis.”

Another study tested how reliably AI models recommend healthcare providers “related” to the condition a user describes. ChatGPT, Google Bard, and Bing Chat all showed significant bias in their recommendations: the recommended doctors were concentrated in metropolitan areas of the United States, female practitioners were rarely suggested, and most of the recommended providers worked in academic medicine.

Other research reports that AI tends to give advice rather than referrals. Because no outside source is recommended, users can end up blindly following the AI’s advice. If AI instead treated such prompts as a call to action for the user to seek medical attention, while also providing a reliable and accessible source, the outcome of asking AI for medical advice would be significantly better.

In other situations, as reported by the Canadian study cited earlier, AI assures users that everything is okay when there is in fact an underlying medical issue. That reassurance often leads users to ignore symptoms or dismiss themselves as paranoid when they could be getting treatment and clarity. Given how many illnesses are time-sensitive, it is crucial that diagnosis happen as soon as possible, while the condition is still reversible.

Upon noticing symptoms or discomfort, contacting a licensed healthcare provider is crucial. However, when both AI resources and healthcare providers are consulted, there is a possibility of discrepancy. “It’s difficult to pull that away from them once they have it in their head until you have a longer discussion. And that even involves how much of a rapport you already have with that patient,” says Dr. Dabney. “Sometimes you really do have to do further workup to disprove to patients it’s not something else.” Still, healthcare providers such as Dr. Dabney and Dr. Kirpilani agree that it is always more beneficial to review AI’s response with a healthcare provider, who can confirm or disprove its diagnoses or reassurances. Dr. Dabney adds, “as a clinician, you add other things in like [a patient’s] past medical history and their family history and a lot of other things that aren’t pulled in when they start Googling a symptom.”

“Make sure that you’re actually trying to get in with a health provider.”
