
New Dataset Makes Health Chatbots Like Google's MedGemma More Mindful of African Contexts
A groundbreaking new medical dataset is poised to revolutionize healthcare in Africa by improving chatbots’ understanding of the continent’s most pressing medical issues and increasing their awareness of accessible treatment options.
AfriMed-QA, developed by researchers from Georgia Tech and Google, could reduce the burden on African healthcare systems.
The researchers said people in need of medical care file into overcrowded clinics and hospitals and face excruciatingly long waits with no guarantee of admission or quality treatment. There aren’t enough trained healthcare professionals available to meet the demand.
Some healthcare question-answer chatbots have been introduced to treat those in need. However, the researchers said there’s no transparent or standardized way to test or verify their effectiveness and safety.
The dataset will enable technologists and researchers to develop more robust and accessible healthcare chatbots tailored to the unique experiences and challenges of Africa.
One such new tool is Google’s MedGemma, a large language model (LLM) designed to process medical text and images. AfriMed-QA was used in its training and evaluation.
AfriMed-QA stands as the most extensive dataset that evaluates LLM capabilities across various facets of African healthcare. It contains 15,000 question-answer pairs culled from over 60 medical schools across 16 countries and covering numerous medical specialties, disease conditions, and geographical challenges.
Tobi Olatunji and Charles Nimo co-developed AfriMed-QA and co-authored a paper about the dataset that will be presented at the Association for Computational Linguistics (ACL) conference next week in Vienna.
Olatunji is a graduate of Georgia Tech’s Online Master of Science in Computer Science (OMSCS) program and holds a Doctor of Medicine from the College of Medicine at the University of Ibadan in Nigeria. Nimo is a Ph.D. student in Tech’s School of Interactive Computing, where he is advised by School of IC professors Michael Best and Irfan Essa.
Focus on Africa
Nimo, Olatunji, and their collaborators created AfriMed-QA as a response to MedQA, a large-scale question-answer dataset that tests the medical proficiency of all major LLMs. That includes Google’s Gemini, OpenAI’s ChatGPT, and Anthropic’s Claude, among others.
However, because MedQA is drawn solely from U.S. Medical Licensing Examination questions, Nimo said it is not adequate to serve patients in underdeveloped African countries or the Global South at large.
“AfriMed-QA has the contextualized and localized understanding of African medical institutions that you don’t get from MedQA,” Nimo said. “There are specific diseases and local challenges in our dataset that you wouldn’t find in any U.S.-based dataset.”
Olatunji said one problem African users may encounter using LLMs trained on MedQA is that they may advise unfeasible treatments or unaffordable prescription drugs.

“You consider the types of drugs, diagnostics, procedures, or therapies that exist in the U.S. that are quite advanced. These treatments are much more accessible, for example, in the U.S. and Europe,” Olatunji said. “But in Africa, they’re too expensive and many times unavailable. They may cost over $100,000, and many people have no health insurance. Why recommend such treatments to someone who can’t obtain them?”
Another problem may be that the LLM doesn’t take a medical condition seriously if it isn’t predominant in the U.S.
“We tested many of these models, for example, on how they would manage sickle-cell disease signs and symptoms, and they focused on other ‘more likely’ causes and did not rank or consider sickle cell high enough as a possible cause,” he said. “They, for example, don’t consider sickle-cell as important as anemia and cancer because sickle-cell is less prevalent in the U.S.”
In addition to sickle-cell disease, Olatunji said some of the healthcare issues facing Africa that can be improved through AfriMed-QA include:
- HIV treatment and prevention
- Poor maternal healthcare
- Widespread malaria cases
- Physician shortage
- Clinician productivity and operational efficiency
Google Partnership
Mercy Asiedu, senior author of the AfriMed-QA paper and research scientist at Google Research, has dedicated her career to improving healthcare in Africa. Her work began when she was a Ph.D. student at Duke University, where she invented the Callascope, a groundbreaking non-invasive tool for gynecological examinations.
With her current focus on democratizing healthcare through artificial intelligence (AI), Asiedu, who is from Ghana, helped create a research consortium to develop the dataset. The consortium consists of Georgia Tech, Google, Intron, Bio-RAMP Research Labs, the University of Cape Coast, the Federation of African Medical Students’ Associations, and Sisonkebiotik.
Sisonkebiotik is an organization of researchers that drives healthcare initiatives to advance data science, machine learning, and AI in Africa.
Olatunji leads the Bio-RAMP Research Lab, a community of healthcare and AI researchers, and he is the founder and CEO of Intron, which develops natural-language processing technologies for African communities.
In May, Google released MedGemma, which uses both the MedQA and AfriMed-QA datasets to form a more globally accessible healthcare chatbot. MedGemma has several versions, including 4-billion- and 27-billion-parameter models, which support multimodal inputs that combine images and text.
“We are proud the latest medical-focused LLM from Google, MedGemma, leverages AfriMed-QA and improves performance in African contexts,” Asiedu said.
“We started by asking how we could reduce the burden on Africa’s healthcare systems. If we can get these large language models to be as good as experts and make them more localized with geo-contextualization, then there’s the potential to task-shift to that.”
The project is supported by the Gates Foundation and PATH, a nonprofit that improves healthcare in developing countries.