Is ChatGPT-4 accurate and complete when answering questions on tuberculosis? Results of the ChatGTB study

Infectious Diseases and Tropical Medicine 2025; 11: e1766
DOI: 10.32113/idtm_202510_1766

Topic: Tuberculosis
Category: Original article

De Vito A.
Unit of Infectious Diseases, Department of Medicine, Surgery, and Pharmacy, University of Sassari, Sassari, Italy
andreadevitoaho@gmail.com

Colpani A.
Unit of Infectious Diseases, Department of Medicine, Surgery, and Pharmacy, University of Sassari, Sassari, Italy

Buonsenso D.
Department of Woman and Child Health and Public Health, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy

Candoli P. M.M.
FIMMG Piemonte, RIMeG Research Group, Turin, Italy

Falbo E.
Stop TB Italy, Milan, Italy

La Fauci S.
FIMMG Piemonte, RIMeG Research Group, Turin, Italy

Madeddu G.
Unit of Infectious Diseases, Department of Medicine, Surgery, and Pharmacy, University of Sassari, Sassari, Italy

Masini T.
Stop TB Italy, Milan, Italy

Misiano G.
FIMMG Piemonte, RIMeG Research Group, Turin, Italy

Monari C.
Stop TB Italy, Milan, Italy

Pontarelli A.
Stop TB Italy, Milan, Italy

Riccardi N.
Stop TB Italy, Milan, Italy

Saderi L.
Stop TB Italy, Milan, Italy

Saluzzo F.
Stop TB Italy, Milan, Italy

Sotgiu G.
Stop TB Italy, Milan, Italy

Tadolini M.
Stop TB Italy, Milan, Italy

Besozzi G.
Stop TB Italy, Milan, Italy

Calcagno A.
Stop TB Italy, Milan, Italy
andrea.calcagno@unito.it

Abstract

Objective: Artificial intelligence (AI), particularly large language models like ChatGPT, offers the potential to disseminate health information. This study aimed to assess the accuracy and completeness of ChatGPT-4’s responses to TB-related questions.

Materials and Methods: Ninety English-language TB questions were formulated on the basis of official guidelines and clinical experience. ChatGPT-4 answered these questions between February 1 and March 1, 2024. Three evaluation subgroups assessed the responses for accuracy (using a six-point Likert scale) and completeness (using a three-point Likert scale). Statistical analyses were performed using non-parametric tests.

Results: The median accuracy score was 5 out of 6, with 88.9% of responses scoring at least 5, indicating high overall accuracy. However, only 34.4% achieved the highest score of 6, with lower performance on questions requiring a medium or high level of expertise (LOE). Low-LOE questions had the highest accuracy, with 63.3% scoring 6. On completeness, 48.9% of responses were rated comprehensive (score of 3), most often for low-LOE questions (70% scored 3). In contrast, only 23.3% of high-LOE questions achieved the highest completeness score. ChatGPT-4 often lacked specificity on complex topics, such as drug-resistant TB therapies, and provided outdated information not aligned with current World Health Organization guidelines.
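As a rough reading aid, the percentages above can be converted back into whole-number response counts. The sketch below assumes a denominator of 90 for the overall figures (the number of questions stated in the Methods) and, purely as an assumption not stated in the abstract, 30 questions per LOE level. The function names `implied_count` and `is_consistent` are illustrative only.

```python
def implied_count(percent: float, total: int) -> int:
    """Return the whole-number count closest to `percent` % of `total`."""
    return round(percent * total / 100)


def is_consistent(percent: float, total: int, tol: float = 0.05) -> bool:
    """Check that the implied count reproduces `percent` to one decimal place."""
    count = implied_count(percent, total)
    return abs(100 * count / total - percent) <= tol


# (label, reported percentage, assumed denominator)
reported = [
    ("accuracy >= 5, all questions", 88.9, 90),
    ("accuracy = 6, all questions", 34.4, 90),
    ("completeness = 3, all questions", 48.9, 90),
    # Assumption: 30 questions per LOE level (not stated in the abstract).
    ("accuracy = 6, low LOE", 63.3, 30),
    ("completeness = 3, low LOE", 70.0, 30),
    ("completeness = 3, high LOE", 23.3, 30),
]

for label, percent, total in reported:
    count = implied_count(percent, total)
    status = "consistent" if is_consistent(percent, total) else "check denominator"
    print(f"{label}: {percent}% of {total} -> about {count} responses ({status})")
```

Under these assumptions every reported percentage corresponds to a whole number of responses, which is compatible with, but does not confirm, an even split of 30 questions per LOE level.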

Conclusions: ChatGPT-4 effectively delivers accurate and comprehensive information for general TB inquiries, making it a valuable resource for the public and non-specialist clinicians. However, its performance declines with increasing question complexity, limiting its utility for advanced clinical decision-making in TB care. Continuous updates and enhancements are necessary to improve its accuracy and relevance in specialised medical contexts.


To cite this article

De Vito A., Colpani A., Buonsenso D., Candoli P. M.M., Falbo E., La Fauci S., Madeddu G., Masini T., Misiano G., Monari C., Pontarelli A., Riccardi N., Saderi L., Saluzzo F., Sotgiu G., Tadolini M., Besozzi G., Calcagno A. Is ChatGPT-4 accurate and complete when answering questions on tuberculosis? Results of the ChatGTB study. Infectious Diseases and Tropical Medicine 2025; 11: e1766. DOI: 10.32113/idtm_202510_1766

Publication History

Submission date: 08 Aug 2025

Revised on: 25 Aug 2025

Accepted on: 06 Oct 2025

Published online: 15 Oct 2025

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.