Pubmed:
Performance of ChatGPT-4 Omni and Gemini 1.5 Pro on Ophthalmology-Related Questions in the Turkish Medical Specialty Exam

dc.contributor.author: Sabaner, M.C.
dc.contributor.author: Yozgat, Z.
dc.date.accessioned: 2025-08-25T05:54:13Z
dc.date.issued: 2025
dc.description.abstract: Objectives: To evaluate the response and interpretative capabilities of two pioneering artificial intelligence (AI)-based large language model (LLM) platforms in addressing ophthalmology-related multiple-choice questions (MCQs) from Turkish Medical Specialty Exams. Materials and methods: MCQs from a total of 37 exams held between 2006 and 2024 were reviewed. Ophthalmology-related questions were identified and categorized into sections. The selected questions were posed to the ChatGPT-4o and Gemini 1.5 Pro AI-based LLM chatbots in both Turkish and English with specific prompts, then re-asked without any interaction. In the final step, feedback for incorrect responses was generated and all questions were posed a third time. Results: A total of 220 ophthalmology-related questions out of 7312 MCQs were evaluated using both AI-based LLMs. A mean of 6.47±2.91 MCQs (range: 2-13) was drawn from each of the 33 parts (32 full exams and the pooled 10% of exams shared between 2022 and 2024). After the final step, ChatGPT-4o achieved higher accuracy in both Turkish (97.3%) and English (97.7%) than Gemini 1.5 Pro (94.1% and 93.2%, respectively), with a statistically significant difference in English (p=0.039) but not in Turkish (p=0.159). There was no statistically significant difference in either the inter-AI comparison across sections or the interlingual comparison. Conclusion: While both AI platforms demonstrated robust performance in addressing ophthalmology-related MCQs, ChatGPT-4o was slightly superior. These models have the potential to enhance ophthalmological medical education, not only by accurately selecting the answers to MCQs but also by providing detailed explanations.
dc.identifier.doi: 10.4274/tjo.galenos.2025.27895
dc.identifier.endpage: 185
dc.identifier.issue: 4
dc.identifier.pubmed: 40838476
dc.identifier.startpage: 177
dc.identifier.uri: https://hdl.handle.net/20.500.12597/34782
dc.identifier.volume: 55
dc.language.iso: en
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: Artificial intelligence
dc.subject: ChatGPT-4 omni
dc.subject: Gemini 1.5 Pro
dc.subject: e-learning
dc.subject: large language model
dc.subject: medical education
dc.subject: ophthalmology
dc.title: Performance of ChatGPT-4 Omni and Gemini 1.5 Pro on Ophthalmology-Related Questions in the Turkish Medical Specialty Exam
dc.type: Article
dspace.entity.type: Pubmed
person.identifier.orcid: 0000-0002-0958-9961
person.identifier.orcid: 0000-0001-5248-5562