Experts weigh in on efforts to realize incremental gains from using AI to ease systemic ills, such as speeding up study enrollment and reducing the risk of unusable, flawed data.
If artificial intelligence (AI) is the panacea destined to cure the systemic ills of the clinical trial industry, only time will prove that to be true. At the moment, in the clinical trial space, it’s being used to ensure data quality, such as identifying an improperly taken X-ray of a trial patient’s lung or liver, or making sure no patient identifiers appear on any images.
The analogy of a baby clutching the coffee table before taking her first real step comes to mind: The baby doesn’t want to risk a fall, and neither does a trial sponsor. Diving head-first into a technology whose inner workings are sometimes not obvious, that has hallucination issues, that carries inherent bias problems, and that therefore must be under constant evaluation,1 is not seen as a risk-free move.
That said, there is a growing amount of research in play to see what AI can do to ease those systemic ills, such as speeding up enrollment and reducing the risk of unusable, flawed data.
“We’re in a bit of a moment for AI, there’s no question about it,” but regarding its use in clinical trials, “it is the beginning of a journey rather than one in which we’re already well advanced on,” says Stephen Pyke, chair of the Association of Clinical Research Organizations’ AI/machine learning (ML) committee and Parexel’s AI strategy head. ChatGPT, he says, uses a sophisticated form of predictive text to generate responses to a query; while it has revealed to the clinical trials industry what is possible, it has also shown the industry how cautious it needs to be, adds Pyke. “We need to think very carefully about where it’s appropriate to use this kind of technology and in what ways.” Drug sponsors, he says, “won’t take on any inherent risks they can’t [fully] control.”
According to Pyke, the goals right now are to realize small, incremental gains, such as improvements in process automation that speed up study startup, enable rapid document translation, and reduce regulatory submission times.
Nevertheless, the siren call of AI is being answered in various ways. Companies such as Medidata and Clario are using AI to scour their own extensive libraries of clinical trial data to perform certain tasks, including optimizing trial protocols (Medidata) and developing new AI models (Clario). Since 2018, Clario has offered its clients, contract research organizations (CROs), and sponsors the use of AI to enhance data accuracy, patient privacy, and operational integrity, says Todd Rudo, Clario’s chief medical officer. Its software can detect identifying characteristics of a patient and redact those images on its own, and identify, say, two transposed digits on a patient’s scan. All trial data comes into one central location, regardless of the number of trial sites or countries involved. All data is centrally reviewed, says Rudo, so Clario can reduce variability and, therefore, improve the quality and accuracy of critical study data.
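Clario has not published its implementation, but as a rough illustration of the general idea, the sketch below uses off-the-shelf OCR to find burned-in text on an image and mask it. The library choices, confidence threshold, and file names are assumptions for illustration only, and a production system would distinguish identifiers from clinically necessary annotations.

```python
# Generic sketch of burned-in text redaction on a medical image; this is NOT
# Clario's implementation, only one illustrative OCR-and-mask approach.
from PIL import Image, ImageDraw
import pytesseract

def redact_burned_in_text(path_in: str, path_out: str, min_confidence: float = 60.0) -> None:
    """Find text regions with OCR and cover each one with a black box."""
    image = Image.open(path_in).convert("RGB")
    draw = ImageDraw.Draw(image)
    ocr = pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)

    for i, word in enumerate(ocr["text"]):
        # Mask any confidently detected word; a real tool would classify
        # which text is an identifier before redacting.
        if word.strip() and float(ocr["conf"][i]) >= min_confidence:
            x, y = ocr["left"][i], ocr["top"][i]
            w, h = ocr["width"][i], ocr["height"][i]
            draw.rectangle([x, y, x + w, y + h], fill="black")

    image.save(path_out)

# Example call (hypothetical file names); in practice a human reviewer
# would spot-check the redacted output.
# redact_burned_in_text("scan_with_overlay.png", "scan_redacted.png")
```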
As for that baby and the coffee table? A few researchers are looking for ways to let go, safely.
Brigham and Women’s Hospital, for example, plans to use AI to scour its own electronic data warehouse to see if the technology can find the 4,500 heart disease patients already manually identified as eligible for a clinical trial.2 In a prior test against a set of nearly 2,000 patients, each with 120 written notes, the AI software posted accuracy rates of 98% and 100%. The researchers had created a list of 13 prompts so the software could scan the patients’ medical chart data, and they reported that the screening cost about 11 cents per patient.3
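As a rough illustration of that prompt-driven approach, the sketch below runs a short list of yes/no screening questions against a patient’s chart notes through a general-purpose LLM API. The model name, prompt wording, and question list are hypothetical and are not taken from the Brigham protocol.

```python
# Illustrative sketch only: prompts, model choice, and data access are
# assumptions, not details of the Brigham and Women's MAPS-LLM study.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# A handful of yes/no screening questions, standing in for the study's 13 prompts.
SCREENING_PROMPTS = [
    "Does the note document a diagnosis of heart failure?",
    "Is the patient's left ventricular ejection fraction 40% or lower?",
    "Is the patient currently prescribed an anticoagulant?",
]

def screen_patient(chart_notes: list[str]) -> dict[str, str]:
    """Ask each screening question against the patient's concatenated notes."""
    chart_text = "\n\n".join(chart_notes)
    answers = {}
    for question in SCREENING_PROMPTS:
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # hypothetical model choice
            messages=[
                {"role": "system",
                 "content": "Answer strictly 'yes', 'no', or 'unknown' based only on the chart text."},
                {"role": "user",
                 "content": f"Chart notes:\n{chart_text}\n\nQuestion: {question}"},
            ],
            temperature=0,
        )
        answers[question] = response.choices[0].message.content.strip().lower()
    return answers

# A patient would be flagged as potentially eligible only if every criterion is
# met; anything ambiguous goes to a human reviewer, mirroring the manual check.
```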
Some researchers are experimenting with AI to find the right candidates for specific disease states, such as cancer. Rafeh Naqash, MD, assistant professor in the early phase division and director for immuno-oncology at OU Health Stephenson Cancer Center, University of Oklahoma Health Sciences, and colleagues published a proof-of-concept study that focuses on four components (cancer type, Eastern Cooperative Oncology Group [ECOG] performance status, genetic mutation, and measurable disease) to find appropriate candidates for Phase I cancer studies. The authors proposed the pilot project with the thought that such approaches can reduce the burden on healthcare staff and find appropriate patients more quickly.4
But it took three distinct AI software programs to extract all of the desired details from the patient records and the clinical trial protocols used in the search. Unstructured notes for an oncology patient, says Naqash, are long and complicated, given the possible and eventual number of therapies, the types of cancer, the mutations, and all the synonyms and abbreviations used for those cancers, therapies, and mutations. And these are notes from referrals, not a single-source database. “With AI, there are a lot of knowns, but there’s even more unknowns and structured data is one of the biggest,” says Naqash. Now, he says, “most of that information is … in an unstructured format.”
At this point, researchers are finding that AI is better at ruling candidates out than at confirming they are eligible. “The negative predictive value is high,” says Naqash.
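To make the metric concrete: negative predictive value is the share of patients a screen labels ineligible who truly are ineligible. The toy counts below are invented for illustration and are not figures from the Naqash pilot.

```python
# Toy confusion-matrix counts for an eligibility screen; the numbers are
# invented for illustration and do not come from the published study.
true_positive = 18    # flagged eligible, actually eligible
false_positive = 12   # flagged eligible, actually ineligible
true_negative = 160   # flagged ineligible, actually ineligible
false_negative = 2    # flagged ineligible, actually eligible

ppv = true_positive / (true_positive + false_positive)   # precision of "eligible" calls
npv = true_negative / (true_negative + false_negative)   # reliability of "ineligible" calls

print(f"Positive predictive value: {ppv:.2f}")  # 0.60 -> "eligible" calls still need human review
print(f"Negative predictive value: {npv:.2f}")  # 0.99 -> safer to deprioritize screened-out patients
```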
Another important issue, one Naqash et al. avoided because their system was designed by the university’s computer science department, is that OpenAI’s GPT series is proprietary. The authors of a new Nature Communications article designed an AI product called TrialGPT, which used GPT-3.5 and GPT-4 for its work. They suggested that “Future studies should explore using and fine-tuning other open-source LLMs (large language models) as alternatives.”5
The TrialGPT authors, Jin et al., designed three AI modules: one to retrieve trials based on the patient’s data, one to predict patient eligibility criterion by criterion, and one to generate trial-level scores based on that eligibility.
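A schematic of that retrieve, match, and score flow is sketched below. The data structures, keyword matching, and scoring rule are simplified stand-ins for illustration; they are not the TrialGPT authors’ code, which relies on LLM calls rather than keyword checks at the matching step.

```python
# Schematic of a retrieve -> criterion match -> aggregate pipeline; the data
# shapes and scoring rules are simplified stand-ins, not TrialGPT's actual code.
from dataclasses import dataclass

@dataclass
class Trial:
    nct_id: str
    condition: str
    criteria: list[str]

def retrieve_trials(patient_summary: str, registry: list[Trial]) -> list[Trial]:
    """Stage 1: keep trials whose condition appears in the patient summary."""
    return [t for t in registry if t.condition.lower() in patient_summary.lower()]

def match_criterion(patient_summary: str, criterion: str) -> bool:
    """Stage 2: criterion-level eligibility; TrialGPT uses an LLM here,
    this naive keyword check is purely for illustration."""
    return all(word.lower() in patient_summary.lower() for word in criterion.split())

def score_trial(patient_summary: str, trial: Trial) -> float:
    """Stage 3: aggregate criterion-level results into a trial-level score."""
    hits = sum(match_criterion(patient_summary, c) for c in trial.criteria)
    return hits / len(trial.criteria) if trial.criteria else 0.0

# Hypothetical registry entries and patient summary.
registry = [
    Trial("NCT00000001", "lung cancer", ["EGFR mutation", "measurable disease"]),
    Trial("NCT00000002", "melanoma", ["BRAF mutation"]),
]
patient = "62-year-old with lung cancer, EGFR mutation, measurable disease, ECOG 1"

for trial in retrieve_trials(patient, registry):
    print(trial.nct_id, round(score_trial(patient, trial), 2))
```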
Naqash’s group and Jin et al. each reported significant time savings in matching patients to trials, roughly half the time it takes to do so manually.
The FDA likes the concept of AI but has written that it is not prepared to issue guidance on its use in clinical trials. The agency doesn’t have enough staff immersed in the technology’s details to sufficiently answer sponsor questions, let alone keep up with the rapid changes in AI designs, and it isn’t interested in piecemeal regulation. In a JAMA viewpoint published in October, the agency wrote, “A lifecycle management approach incorporating recurrent local postmarket performance monitoring should be central to health AI development. Special mechanisms to evaluate large language models and their uses are needed.”1

The FDA added, “Sponsors need to be transparent about and regulators need proficiency in evaluating the use of AI in premarket development.”
The human factor, with its positives and negatives, was stressed by those interviewed and in research papers alike. No one is being displaced by the use of AI. On the contrary.
“It is all about how you trained the AI algorithm,” says Rudo. “It isn’t just how much; it is the makeup of that data and it has to be the right quantity. As it turns out, it is more about diversity of the training dataset, and the engineering principles are just as important in ensuring the trustworthiness of the algorithm to produce the right results. You need data to train and also to validate the performance. With the right scientific input, accuracy can be boosted significantly when you transition from development to testing. You feed it lots of data to see how often it gets it right.”
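As a bare-bones illustration of that train-then-validate discipline, the sketch below fits a model on synthetic data and measures accuracy only on examples held out from training. The features, model, and split are placeholders and do not describe any vendor’s pipeline.

```python
# Minimal train/validate illustration on synthetic data; the features, model,
# and split are placeholders, not a description of any vendor's pipeline.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for labeled data-quality examples.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Hold out data the model never sees during training to judge real performance.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# "Feed it lots of data to see how often it gets it right."
val_accuracy = accuracy_score(y_val, model.predict(X_val))
print(f"Validation accuracy: {val_accuracy:.2%}")
```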
And then there is the bias issue. “You have to move quite slowly because it’s built on a technology that is almost certainly a biased data set, because it’s not a transparent [process],” says Pyke.
There is the matter of sharing successes and failures as well. “The ongoing limited availability of basic results on ClinicalTrials.gov contrasts with this field’s rapid advancements and the public registry’s role in reducing publication and outcome reporting biases,” wrote Maru et al.6 This international group of researchers (Tokyo, Brisbane, North Carolina) looked at 3,106 AI/ML studies registered on ClinicalTrials.gov between 2010 and 2023: just 5.6% of completed studies reported results.
The biggest risks that jeopardize an AI tool’s desired performance, says Rudo, are failing to bring scientific input into the training dataset and process; not including sufficient human oversight when the tool is deployed; having too few people review the quality of the data that goes into the final dataset; and over-relying on AI itself. “The real value in AI is finding innovative ways to help people do their jobs more effectively,” he says.
Christine Bahls is a Freelance Writer for Medical, Clinical Trials, and Pharma Information
References
1. Warraich, H.J.; Tazbaz, T.; Califf, R.M. FDA Perspective on the Regulation of Artificial Intelligence in Health Care and Biomedicine. JAMA. Published online October 15, 2024. https://jamanetwork.com/journals/jama/fullarticle/2825146
2. Manual Versus AI-Assisted Clinical Trial Screening Using Large-Language Models (MAPS-LLM). Brigham and Women’s Hospital, ClinicalTrials.gov. https://ctv.veeva.com/study/manual-versus-ai-assisted-clinical-trial-screening-using-large-language-models
3. Hale, C. Study Shows Generative AI can Speed Up Clinical Trial Enrollment for Pennies Per Patient. Fierce Biotech. June 21, 2024. https://www.fiercebiotech.com/medtech/study-shows-generative-ai-can-speed-clinical-trial-enrollment-pennies-patient
4. Ghosh, S.; Abushukair, H.M.; Ganesan, A.; Pan, C.; Naqash, A.R.; Lu, K. Harnessing Explainable Artificial Intelligence for Patient-to-Clinical-Trial Matching: A Proof-of-Concept Pilot Study Using Phase I Oncology Trials. PLOS One. 2024. 19 (10), e0311510. https://pubmed.ncbi.nlm.nih.gov/39446771/
5. Jin, Q.; Wang, Z.; Floudas, C.S.; et al. Matching Patients to Clinical Trials with Large Language Models. Nat Commun. 2024. 15, 9074. https://doi.org/10.1038/s41467-024-53081-z
6. Maru, S.; Matthias, M.D.; Kuwatsuru, R.; Simpson Jr, R.J. Studies of Artificial Intelligence/Machine Learning Registered on ClinicalTrials.gov: Cross-Sectional Study With Temporal Trends, 2010-2023. J Med Internet Res. 2024. 26, e57750. https://www.jmir.org/2024/1/e57750