Scalability: Solving the Unique Clinical Trial Data Problem


March 1, 2020


Applied Clinical Trials


Volume 29

Issue 3

Finding a middle-ground approach to balancing new solutions arising from data science with traditional requirements for data collection and submission.

There is a problem in clinical trial data structures. It isn’t a problem for any individual trial, or even a few. But for those of us who deal with tens, hundreds, or even thousands of trials’ worth of data, it is a serious hurdle. It’s called scalability, and for the clinical trials industry, it presents a unique issue.

The reason is the expectation that each clinical trial’s data be maintained in its own separate, compartmentalized database. This is considered good clinical practice (GCP). Decades ago, when this practice was first introduced, it set goals that were admirable, expected, and not difficult to follow. However, as the years went on, the rules were not updated. Much like trying to drive a modern car on streets designed for horse and carriage, what was once a good fit for the industry has become something of a hindrance.

Simply put, a lot has changed technologically since the original principles were put in place. The tools now at our disposal are on a level that the original guidelines could never have predicted, and neither could today’s expectations for how data is used, in clinical trials or anywhere else.


So, how do we marry the expectations and solutions we have available to us from a data science perspective with requirements arising from these traditional viewpoints? Well, as an industry, we must recognize the limitations of the physical models we have in place currently. Then, we must talk about the possibilities that are available to us. Fortunately, other industries have solved these problems and are making advancements. So, while we are certainly lagging behind, there is a clear light at the end of the tunnel.

To begin advancing, we must first understand the exact problem. As mentioned, the requirement is that every clinical trial database be kept separate from the others. The debate is over how to do that: physical separation or logical separation.

Logical separation keeps all trials in one electronic database and uses filters to isolate the clinical trial data relevant to a given analysis. Physical separation keeps each trial in its own database, which provides an additional layer of security and integrity protection but puts major hurdles in the way of analysis and automation. Both strategies are valid, but only logical separation scales operationally and allows for automation of data science, analysis, and monitoring.
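To make the idea of logical separation concrete, here is a minimal sketch in Python using the standard-library `sqlite3` module. It is an illustration only, not a description of any production system: the `observations` table, its columns, the trial identifiers, and the sample values are all hypothetical. Every trial lives in one shared table, a filter on `trial_id` isolates a single trial, and the same table supports a cross-trial aggregate.

```python
import sqlite3


def build_shared_database():
    """Create one in-memory database holding rows from several trials.

    The schema and the sample rows are hypothetical, for illustration only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE observations (
            trial_id    TEXT NOT NULL,    -- the column that separates trials logically
            subject_id  TEXT NOT NULL,
            visit       TEXT NOT NULL,
            systolic_bp INTEGER NOT NULL
        )
        """
    )
    rows = [
        ("TRIAL-A", "S1", "baseline", 120),
        ("TRIAL-A", "S2", "baseline", 130),
        ("TRIAL-B", "S1", "baseline", 140),
    ]
    conn.executemany("INSERT INTO observations VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def observations_for_trial(conn, trial_id):
    """Isolate a single trial with a filter (parameterized WHERE clause)."""
    cur = conn.execute(
        "SELECT subject_id, visit, systolic_bp FROM observations "
        "WHERE trial_id = ? ORDER BY subject_id, visit",
        (trial_id,),
    )
    return cur.fetchall()


def mean_bp_by_trial(conn):
    """Cross-trial aggregate that a single shared database makes straightforward."""
    cur = conn.execute(
        "SELECT trial_id, AVG(systolic_bp) FROM observations "
        "GROUP BY trial_id ORDER BY trial_id"
    )
    return dict(cur.fetchall())
```

Calling `observations_for_trial(conn, "TRIAL-A")` returns only that trial’s rows, while `mean_bp_by_trial(conn)` summarizes every trial in one query. Under physical separation, the second query would require connecting to each trial’s database in turn.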

Physical separation presents major technological hurdles. Because the data is spread across separate databases, normalization and standardization can’t be enforced with integrity controls. Physical separation also requires separate user identification management and controls at the schema level of the database, adding a layer of overhead that can be cost prohibitive when implementing reporting and operational solutions at scale (see Figure 1).

Logical separation, by contrast, makes it possible to develop a “comprehensive” reporting and data science solution. While it carries risks around security and access control, those can be mitigated with access control groups and even with security controls implemented at the record level of the schema.
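As a rough sketch of what record-level control through access groups could look like, consider the Python below. Everything named here is an assumption made for illustration: the `Record` type, the `ACCESS_GROUPS` mapping, and the group and trial names are hypothetical, and a real deployment would more likely enforce this inside the database than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One row of trial data (hypothetical shape)."""
    trial_id: str
    subject_id: str
    value: float


# Hypothetical access control groups: each group maps to the trials it may read.
ACCESS_GROUPS = {
    "trial_a_analysts": {"TRIAL-A"},
    "cross_trial_monitors": {"TRIAL-A", "TRIAL-B"},
}


def visible_records(records, user_groups):
    """Return only the records whose trial the user's groups are allowed to see.

    Unknown group names grant no access, so the default is to deny.
    """
    allowed_trials = set()
    for group in user_groups:
        allowed_trials |= ACCESS_GROUPS.get(group, set())
    return [record for record in records if record.trial_id in allowed_trials]
```

A user in `trial_a_analysts` would see only `TRIAL-A` records, while a user in `cross_trial_monitors` would see both trials. The check is applied to each record, which is the point of record-level control: the data shares one store, but visibility is decided row by row.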


A nice middle ground is the apartment model. Here, the clinical submission data flow follows a physical separation “apartment” model, while a second system that isn’t on the critical path of collection and submission houses the data in a warehouse or data lake for scalable reporting, operational monitoring, aggregated data analysis, and the development of statistical models on aggregated data (see Figure 2).

This allows data collection to meet all existing regulatory requirements and get the most out of modern data architecture and data science solutions, without ever compromising GCP principles.
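The apartment model can be sketched in a few lines of Python, again with the standard-library `sqlite3` module. This is a toy illustration under stated assumptions, not the architecture shown in the figures: each “apartment” is a separate in-memory database, the warehouse is another, and the table layout, function names, and trial identifiers are hypothetical.

```python
import sqlite3


def create_apartment(rows):
    """Create a physically separate database for a single trial.

    No trial identifier is stored here: the database itself is the boundary.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observations (subject_id TEXT, visit TEXT, value REAL)"
    )
    conn.executemany("INSERT INTO observations VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def load_warehouse(apartments):
    """Copy every apartment's rows into one warehouse, tagging each with its trial.

    The apartments are only read from, so the collection and submission path
    is left unchanged.
    """
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE observations "
        "(trial_id TEXT, subject_id TEXT, visit TEXT, value REAL)"
    )
    for trial_id, apartment in apartments.items():
        rows = apartment.execute(
            "SELECT subject_id, visit, value FROM observations"
        ).fetchall()
        warehouse.executemany(
            "INSERT INTO observations VALUES (?, ?, ?, ?)",
            [(trial_id, *row) for row in rows],
        )
    warehouse.commit()
    return warehouse
```

Given a dictionary that maps trial identifiers to apartments, `load_warehouse` produces a single table that reporting and modeling tools can query across trials, while each apartment remains an isolated database for submission purposes.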

If one wants to apply image analysis, classification, machine learning, deep learning, or any of the other potentially groundbreaking technologies to specific data, that data must sit in a structure that normalizes and standardizes it.

In other words, in order to get the most out of technology, while also staying within GCP, one will need to marry these two architecture solutions together and develop a comprehensive answer to these problems.

As we move further into the decade, we can expect to see groundbreaking new ways to use technology, both in our personal lives and in clinical trials. The question for us as an industry, however, is whether we will be ready when those advancements arrive. Or will we still be stuck driving supercars on cobblestones?

Keith Aumiller, Senior Director, Data Services, Signant Health
