Large Language Models Versus Physicians in Cardiovascular Risk Stratification
Session:
Oral Communications Session 09 – Artificial Intelligence and decision-making in cardiovascular risk and health systems
Speaker:
José Ferreira Santos
Congress:
CPC 2026
Topic:
N. E-Cardiology / Digital Health, Public Health, Health Economics, Research Methodology
Theme:
33. e-Cardiology / Digital Health
Subtheme:
33.4 Digital Health
Session Type:
Oral Communications
FP Number:
---
Authors:
José Ferreira Santos; Regina de Brito Duarte; Inês Mota; Rita Carvalheira Santos; Jose Maria Moreira; Joana Campos; Nuno André Silva; Bernardo Neves; Ricardo Ladeiras-Lopes; Francisca Leite; Helder Dores
Abstract
Background: Large language models (LLMs) show promise in medical reasoning, but their reliability relative to practicing clinicians remains poorly characterized. Benchmarking models against the variability of real-world clinical judgment, not only against guidelines, is essential to define their role in practice.

Objectives: This exploratory study compared 11 contemporary LLMs with a diverse cohort of practicing physicians for cardiovascular risk stratification, focusing on classification accuracy, inter-rater variability, and safety-critical errors.

Methods: In this vignette-based benchmark of 11 LLMs and 8 physicians, we used 30 validated synthetic clinical vignettes requiring cardiovascular risk stratification. Eight physicians (3 Family Medicine, 3 Internal Medicine, 2 Cardiology), all with >3 years’ specialty experience, independently classified vignettes into three ESC 2021 risk categories. Their performance contextualized that of 11 LLMs from six major families (GPT, Claude, Gemini, Llama, Grok, DeepSeek). Agreement with an expert-adjudicated gold standard was assessed using quadratic-weighted Cohen’s kappa (κw). Inter-rater reliability was quantified with Gwet’s AC2, and a majority-vote physician ensemble was constructed.

Results: LLM agreement with the gold standard ranged from fair to moderate (κw 0.40–0.69). Individual physicians showed wider variability, with κw 0.15–0.93. Inter-rater reliability among physicians was moderate (AC2=0.44). The top-performing model, GPT-4o (κw=0.69), outperformed 7 of 8 individual physicians, but the pooled physician ensemble (κw=0.76) exceeded all LLMs. Error analysis revealed distinct safety profiles: physicians made no major two-level misclassifications (low/moderate vs very high risk), whereas several LLMs did so in 2–13% of cases.

Conclusions: In this vignette-based benchmark, top-tier LLMs matched or exceeded the performance of most individual physicians but remained inferior to collective human consensus and were uniquely prone to rare yet critical two-level errors. These findings highlight both the promise and safety limitations of current LLMs and underscore the need for further validation before clinical use.
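
Illustrative sketch (not the authors' code): the agreement metrics named in the Methods can be approximated as follows. The function names, the integer encoding of the three risk categories (0 = low/moderate, 1 = high, 2 = very high), and the tie-breaking rule in the majority vote are assumptions made for illustration only; the abstract does not state how ties were resolved, and Gwet's AC2 is not reproduced here.

```python
from collections import Counter


def quadratic_weighted_kappa(rater, gold, n_categories=3):
    """Cohen's kappa with quadratic disagreement weights.

    `rater` and `gold` are equal-length lists of integer category codes
    in the range 0..n_categories-1 (encoding assumed for this sketch).
    """
    if len(rater) != len(gold) or not rater:
        raise ValueError("inputs must be non-empty and of equal length")
    k, n = n_categories, len(rater)

    # Observed confusion counts: rows = rater, columns = gold standard.
    observed = [[0] * k for _ in range(k)]
    for r, g in zip(rater, gold):
        observed[r][g] += 1
    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2  # 0 on the diagonal, 1 at two levels apart
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n

    if weighted_expected == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - weighted_observed / weighted_expected


def majority_vote(ratings_by_physician):
    """Pool several raters into one ensemble rating per vignette.

    Ties are broken toward the higher risk category; this rule is an
    assumption of the sketch, not something stated in the abstract.
    """
    ensemble = []
    for votes in zip(*ratings_by_physician):
        counts = Counter(votes)
        top = max(counts.values())
        ensemble.append(max(c for c, v in counts.items() if v == top))
    return ensemble


def two_level_error_rate(rater, gold):
    """Fraction of vignettes misclassified by two levels (0 vs 2)."""
    return sum(abs(r - g) == 2 for r, g in zip(rater, gold)) / len(gold)
```

With this encoding, a rater who matches the gold standard on every vignette scores κw = 1, and the quadratic weights penalize a low/moderate versus very high confusion four times as heavily as a one-level disagreement, which is why two-level errors weigh so strongly on κw.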