How Useful Are Large Language Models for Caregivers of Pediatric Cancer Patients?
These powerful informational tools for caregivers of pediatric cancer patients vary in areas such as readability and source credibility, highlighting the need to carefully consider their clinical utility.
A recent study led by Emre Sezgin, PhD, and Micah Skeens, PhD, APRN, FAAN, CPNP-PC, at Nationwide Children’s Hospital demonstrated that large language models (LLMs) deliver accurate and clinically relevant information for caregivers of pediatric cancer patients but vary in other factors, such as readability.
In their study, published in Cancer Medicine, Dr. Sezgin, Dr. Skeens and their research teams evaluated four popular LLMs for their use in pediatric oncology: ChatGPT, Google Bard/Gemini, Google SGE and Microsoft Bing Chat.
“LLMs can serve as a bridge between complex clinical knowledge and accessible education for caregivers,” says Dr. Sezgin, a digital health specialist and principal investigator in the Center for Biobehavioral Health at Nationwide Children’s.
Conventional methods of pediatric oncology caregiver education are primarily paper-based, explains Dr. Skeens, a pediatric nurse practitioner and principal investigator in the Center for Biobehavioral Health.
“Large binders of information, for example, are overwhelming, time-intensive and not easily searchable,” she says.
The researchers first compiled a set of 26 frequently asked questions (FAQs) reflecting a pediatric cancer caregiver's perspective. Next, they recruited five clinical pediatric oncology experts to evaluate LLM-generated responses for accuracy, clarity, inclusivity, completeness and clinical utility. Each FAQ was entered into each of the four LLMs, generating 104 responses.
The research team evaluated the content quality of responses, measuring readability, artificial intelligence (AI) disclosures, source credibility, resource matching and content originality.
ChatGPT received the highest overall rating, scoring statistically significantly higher than the other LLMs on accuracy, clarity, completeness and clinical utility.
Content quality scores varied among the LLMs. For example, Google Bard received a high score for AI disclosure, while Microsoft Bing Chat scored highly for source credibility and resource matching.
None of the LLMs met the gold standard for readability (8th grade level), says Dr. Sezgin, highlighting a need to improve the LLMs’ understandability.
“These results highlight the need for careful and thoughtful selection of which LLMs to use, and the need to refine their clinical use,” Dr. Skeens notes.
Drs. Sezgin and Skeens mention several provider concerns about LLMs in the clinical setting, including the potential for misinformation, general content inaccuracies and insufficient source credibility. They also note the need to investigate real-time caregiver interaction with LLMs.
“We need to understand caregivers’ impression and perception of LLMs,” Dr. Skeens explains. She also recommends considering digital literacy to ensure that caregivers know how to use LLMs and can carefully evaluate the generated responses.
“We’re at a pivotal moment where AI can reshape how we support caregivers, but the technology should not outpace trust,” Dr. Sezgin says, adding, “This work is not about replacing human connection, but augmenting it.”
This article appeared in the 2025 Fall/Winter print issue.
Reference:
Sezgin E, Jackson DI, Kocaballi AB, Bibart M, Zupanec S, Landier W, Audino A, Ranalli M, Skeens M. Can large language models aid caregivers of pediatric cancer patients in information seeking? A cross-sectional investigation. Cancer Medicine. 2025 Jan;14(1):e70554.
Image credit: Adobe Stock
About the author
JoAnna Pendergrass, DVM, is a veterinarian and freelance medical writer in Atlanta, GA. She received her veterinary degree from the Virginia-Maryland College of Veterinary Medicine and completed a 2-year postdoctoral research fellowship at Emory University’s Yerkes Primate Research Center before beginning her career as a medical writer.
As a freelance medical writer, Dr. Pendergrass focuses on pet owner education and health journalism. She is a member of the American Medical Writers Association and has served as secretary and president of AMWA’s Southeast chapter.
In her spare time, Dr. Pendergrass enjoys baking, running, and playing the viola in a local community orchestra.