We Need More Realistic Benchmarks for AI Models in Medicine

Large Language Models have shown impressive capabilities, but their medical knowledge has so far only been tested on medical licensing exams. We find that on more realistic tasks, such as clinical decision making, they lag behind medical experts, highlighting the need for more realistic benchmarks.

Paul Hager and 1 other

Jul 29, 2024

Channels contributed to:

Behind the Paper
