Successful integration of data from across clinical systems depends on a strong understanding of the data.
Clinical trials build a profile of a compound to answer: Is it safe? Is it effective? Is it better than other offerings on the market? The cost of gaining this knowledge is high, so maximizing the utilization of the knowledge collected through the trial is essential.
Individual systems or a few integrated systems provide a reasonable view of the data collected within a small domain, but growing organizations need to bridge data from multiple systems to support analysis across the clinical operation.
Data used to drive operational decisions is generated by a variety of contributing areas, including biostats, essential documents, clinical supplies, patient-reported outcomes, protocols, disease prevalence and genetics, to name only a few. The list of data sources expands every day. Integrating data sets from different clinical systems opens paths to answering difficult questions that were previously impossible to answer using one or two linked systems.
Clinical goals change rapidly, as clinical teams work to improve and save lives. Limits imposed by a lack of data inhibit the chances for success. In addition to supporting the assessment of past and current progress, large, integrated sets of data are essential to activating the power of Artificial Intelligence (AI) and Machine Learning (ML) tools to better project outcomes.
Integrating data from across clinical systems requires investment. So, the question becomes, how much integration is needed and how big is the investment? As you choose the integration options, building a strong understanding of the data is a critical, foundational first step. Using your integrated data successfully depends on this first step.
Managing data across multiple tools requires an understanding of the data connection points. As an example, you may want to observe trends in patient data quality, across studies, for a particular clinical site. To do this, you may need to look in one system for study sites and locations, and in a different system for information on the organization. Next, you gather contract information, and then find the site's data quality and timeliness information in yet another system.
This task can be difficult. You need to be certain that the site information matches correctly across systems: that each record refers to the same organization, the same physical location and the same department.
Across systems or within systems, data may be inconsistent. You may see something as common as multiple studies using the same physical site, although the site has a different site number in each study. Different names for a site, or errors in data entry such as multiple versions of an organization's name, can make compiling a full picture nearly impossible.
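As a rough illustration of the matching problem described above, the following Python sketch groups site records from different studies under a shared key built from a normalized organization name and address. The field names, the sample records and the list of legal suffixes are hypothetical, chosen only for illustration; real site matching usually needs more than string cleanup, such as reference data or manual review.

```python
import re
from collections import defaultdict

# Hypothetical list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co"}


def normalize(text):
    """Lowercase, replace punctuation with spaces, drop legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", text.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def site_key(record):
    """Build a comparison key from organization name and address."""
    return (normalize(record["organization"]), normalize(record["address"]))


def group_sites(records):
    """Group (study, site_number) pairs that appear to share a physical site."""
    groups = defaultdict(list)
    for rec in records:
        groups[site_key(rec)].append((rec["study"], rec["site_number"]))
    return dict(groups)


# Illustrative records only; these are not real sites or studies.
records = [
    {"study": "STUDY-A", "site_number": "101",
     "organization": "Mercy Hospital, Inc.", "address": "12 Main St."},
    {"study": "STUDY-B", "site_number": "007",
     "organization": "MERCY HOSPITAL", "address": "12 Main St"},
    {"study": "STUDY-B", "site_number": "008",
     "organization": "Lakeside Clinic", "address": "400 Shore Rd"},
]

groups = group_sites(records)
for key, members in groups.items():
    print(key, members)
```

In this sketch, the first two records collapse into one group even though their site numbers and name spellings differ, which is the kind of association needed before site performance can be tracked across studies.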
When you successfully associate your data across systems, the value of your data greatly increases; for example, you create an opportunity to track site performance without investing in new site tracking systems. Understanding your data, within and across systems, is the single most important factor in making your data aggregation work. Resources invested in standardizing data reap huge benefits.
Determining the definition of the data (e.g., study start date: What event is the starting point for a study?) and the relationships between pieces of information (e.g., can a clinical study have one indication, several indications or none?) allows you to use collected system data to address a larger number of business questions. As more integrated system data becomes available from both within and outside of the organization, new business insights can be harvested.
As an example, consider the potential number of available patients fitting the clinical study protocol requirements within a reasonable radius of a site. When considering funding and expected study completion timelines, this is a reasonable question to ask. An analysis of this question would consider the protocol requirements, the demographics of the local population, the transportation options for the target population and the number of competing specialists in the area around the clinical site. When working with individual systems and one-off queries of external data sets, this task becomes time-consuming and error-prone. Combining data from confirmed internal and external systems makes this an easier question to address.
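To make the radius question concrete, here is a minimal Python sketch that counts people within a given distance of a site who also fall inside an age range. It assumes that site and population records already carry latitude and longitude, and it uses an age range as a stand-in for real protocol criteria; the coordinates, ages and field names are hypothetical. Distance is computed with the haversine great-circle formula, which ignores travel routes and transportation options.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def count_candidates(site, people, radius_km, min_age, max_age):
    """Count people inside the radius whose age fits the (assumed) criteria."""
    count = 0
    for person in people:
        distance = haversine_km(site["lat"], site["lon"],
                                person["lat"], person["lon"])
        if distance <= radius_km and min_age <= person["age"] <= max_age:
            count += 1
    return count


# Illustrative data only; not drawn from any real system.
site = {"lat": 40.0, "lon": -75.0}
people = [
    {"lat": 40.10, "lon": -75.0, "age": 50},  # nearby, age fits
    {"lat": 41.00, "lon": -75.0, "age": 50},  # age fits, too far away
    {"lat": 40.05, "lon": -75.0, "age": 17},  # nearby, age out of range
]

eligible = count_candidates(site, people, radius_km=50, min_age=18, max_age=65)
print(eligible)
```

The calculation itself is simple; the effort lies in confirming that the site locations, population data and protocol criteria being combined are defined consistently, which is the point of the integration work described here.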
Defining and testing the combination of data requires a commitment of resources. How much will you need to start? Consider the areas of greatest need and risk, which provide an obvious avenue to begin the data integration journey. The risks you identify, such as data integrity needs across systems or data sets with limited availability, will inform your choice among integration options.
Different price points and levels of flexibility exist, depending on the scope and functionality required for a data integration. In many cases, a single organization may utilize more than one of the approaches described below.
Leveraging data from sources both within and outside the enterprise may provide new and interesting opportunities for investment, resource planning, study planning and knowledge reuse—all essential for making the greatest use of information collected in and around clinical trials.
Teresa Montes is a Clinical Practice Lead and Emma DiBella is a Clinical Consultant, both at Daelight Solutions.