(HealthDay News) — Large language models (LLMs) are approaching expert-level knowledge and reasoning skills in ophthalmology, according to a study published online April 17 in PLOS Digital Health.

Arun James Thirunavukarasu, MB, BChir, from the University of Oxford in the United Kingdom, and colleagues evaluated the clinical potential of state-of-the-art LLMs in ophthalmology. Responses to 87 questions were compared across GPT-3.5, GPT-4, PaLM 2, LLaMA, expert ophthalmologists, and doctors in training.

The researchers found that the performance of GPT-4 (69%) was superior to that of GPT-3.5 (48%), LLaMA (32%), and PaLM 2 (56%) and compared favorably with that of expert ophthalmologists (median, 76%), ophthalmology trainees (median, 59%), and unspecialized junior doctors (median, 43%). Low agreement between LLMs and doctors was due to idiosyncratic differences in knowledge and reasoning, with performance otherwise consistent across individuals and question types. Grading ophthalmologists preferred GPT-4 responses to those of GPT-3.5, citing their higher accuracy and relevance.

“LLMs are approaching expert-level ophthalmological knowledge and reasoning, and may be useful for providing eye-related advice where access to health care professionals is limited,” the authors write. “Further research is required to explore potential avenues of clinical deployment.”

One author disclosed a patent on a deep learning system to detect retinal disease.
