Web of Science: Performance of ChatGPT-4 Omni and Gemini 1.5 Pro on Ophthalmology-Related Questions in the Turkish Medical Specialty Exam
| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Sabaner, M.C. | |
| dc.contributor.author | Yozgat, Z. | |
| dc.date.accessioned | 2025-09-01T06:18:01Z | |
| dc.date.issued | 2025-01-01 | |
| dc.description.abstract | Objectives: To evaluate the response and interpretative capabilities of two pioneering artificial intelligence (AI)-based large language model (LLM) platforms in addressing ophthalmology-related multiple-choice questions (MCQs) from Turkish Medical Specialty Exams. Materials and Methods: MCQs from a total of 37 exams held between 2006 and 2024 were reviewed. Ophthalmology-related questions were identified and categorized into sections. The selected questions were posed to the ChatGPT-4o and Gemini 1.5 Pro AI-based LLM chatbots in both Turkish and English with specific prompts, then re-asked without any interaction. In the final step, feedback on incorrect responses was generated and all questions were posed a third time. Results: A total of 220 ophthalmology-related questions out of 7312 MCQs were evaluated using both AI-based LLMs. A mean of 6.47 ± 2.91 (range: 2-13) MCQs was taken from each of the 33 parts (32 full exams and the pooled 10% of exams shared between 2022 and 2024). After the final step, ChatGPT-4o achieved higher accuracy in both Turkish (97.3%) and English (97.7%) than Gemini 1.5 Pro (94.1% and 93.2%, respectively), with a statistically significant difference in English (p=0.039) but not in Turkish (p=0.159). There was no statistically significant difference in either the inter-AI comparison across sections or the interlingual comparison. Conclusion: While both AI platforms demonstrated robust performance in addressing ophthalmology-related MCQs, ChatGPT-4o was slightly superior. These models have the potential to enhance ophthalmological medical education, not only by accurately selecting the answers to MCQs but also by providing detailed explanations. | |
| dc.identifier.doi | 10.4274/tjo.galenos.2025.27895 | |
| dc.identifier.eissn | 2147-2661 | |
| dc.identifier.endpage | 185 | |
| dc.identifier.issn | 1300-0659 | |
| dc.identifier.issue | 4 | |
| dc.identifier.startpage | 177 | |
| dc.identifier.uri | https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=dspace_ku&SrcAuth=WosAPI&KeyUT=WOS:001555465700001&DestLinkType=FullRecord&DestApp=WOS_CPL | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12597/34895 | |
| dc.identifier.volume | 55 | |
| dc.identifier.wos | 001555465700001 | |
| dc.language.iso | en | |
| dc.relation.ispartof | TURK OFTALMOLOJI DERGISI-TURKISH JOURNAL OF OPHTHALMOLOGY | |
| dc.rights | info:eu-repo/semantics/openAccess | |
| dc.subject | Artificial intelligence | |
| dc.subject | large language model | |
| dc.subject | ChatGPT-4 Omni | |
| dc.subject | Gemini 1.5 Pro | |
| dc.subject | medical education | |
| dc.subject | ophthalmology | |
| dc.subject | e-learning | |
| dc.title | Performance of ChatGPT-4 Omni and Gemini 1.5 Pro on Ophthalmology-Related Questions in the Turkish Medical Specialty Exam | |
| dc.type | Article | |
| dspace.entity.type | Wos | |
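
The abstract above outlines a three-round querying protocol and a between-model accuracy comparison, so a minimal Python sketch of that workflow is added here for illustration. Everything in it is an assumption rather than the authors' method: `ask_chatbot` is a hypothetical stand-in for manually posing a question to the ChatGPT-4o or Gemini 1.5 Pro chat interface, the correct/incorrect counts are back-calculated from the reported English-language accuracies over 220 MCQs, and the Yates-corrected chi-square is only one plausible test, since the abstract does not name the statistical procedure used.

```python
from scipy.stats import chi2_contingency


def three_round_protocol(question, correct_option, ask_chatbot):
    """Sketch of the abstract's querying steps.

    Round 1: the MCQ is posed with the study prompt; round 2: it is re-asked
    without any interaction; round 3: it is posed a third time, with corrective
    feedback attached when the earlier response was incorrect.
    `ask_chatbot` is a hypothetical callable, not a real API.
    """
    first = ask_chatbot(question)
    second = ask_chatbot(question, fresh_session=True)
    feedback = None
    if second != correct_option:
        feedback = f"The previous answer was incorrect; the key is {correct_option}."
    third = ask_chatbot(question, feedback=feedback)
    return first, second, third


# Assumed per-model correct/incorrect counts for the English questions,
# back-calculated from 220 MCQs: 97.7% ~ 215/220 and 93.2% ~ 205/220.
observed = [
    [215, 5],    # ChatGPT-4o: correct, incorrect
    [205, 15],   # Gemini 1.5 Pro: correct, incorrect
]

chi2, p, dof, _ = chi2_contingency(observed, correction=True)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3f}")
```

With these assumed counts the unpaired contingency test happens to yield a p-value near the reported 0.039, but that coincidence does not confirm the authors' analysis, which may well have been a paired comparison such as McNemar's test.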
