Project EB-LLM | Harding-Zentrum für Risikokompetenz

This project is a two-phase randomized evaluation of how well large language models adhere to evidence-based health communication guidelines for breast and prostate cancer screening.

This study investigates whether large language models, including OpenAI's ChatGPT, Google Gemini, and Mistral AI's Le Chat, can provide accurate and understandable information on breast and prostate cancer screening. The goal is to assess whether their responses adhere to evidence-based medical standards and whether response quality improves when users provide more detailed input. The study has two phases. In Phase 1, the language models answer predefined questions so that the quality of their responses can be evaluated. In Phase 2, real users submit their own questions, and some of them receive additional guidance to make their input more specific; this phase tests whether minimal user guidance improves the quality of the models' responses. The overarching aim is to determine whether language models can help individuals make informed health decisions, particularly regarding cancer screening.