As acceptance of both electronic health records (EHR) and electronic data capture (EDC) becomes widespread, forward-thinking clinicians and researchers have begun to consider the possibility of a single system that collects both patient care and clinical research data simultaneously. Data could be entered once at the bedside on a single computer screen and then be available to both the patient care and clinical research databases. This would improve workflow and efficiency at clinical research sites, eliminate errors from data transcription, eliminate the costly and time-consuming process of source data verification, and speed up the overall process of clinical trials.
Since the data and the methods of collection are similar or identical in the two processes, many assume that integration would be straightforward. However, few are familiar with the intricacies of both worlds, which leads to oversimplified assumptions and to integration schemas and timelines that are very ambitious or unrealistic.
This paper will discuss some of the key obstacles in bringing EHR and EDC together, and will evaluate some of the proposed architectures for implementing an integration of the two.
It is challenging to draw conclusions about EHR systems as a whole because of the very wide range of functionality in different systems and the very diverse needs of different users (see Table 1). Although some systems have many of these functions, a single system may be targeted specifically to a particular area of functionality or even a specific function. Organizationally, solo practices, small group practices, large group practices, community hospitals, major hospitals, large hospital systems, and large health maintenance organizations have very different needs for EHR systems.
Table 1. Functions of EHR systems
The EHR is much more common in European and other non-U.S. countries. In fact, physicians in countries such as Sweden and the Netherlands use the EHR for more than 80% of patient encounters.1 The lagging implementation of EHR in the United States is not because of a lack of available technology. The United States is hampered by a fragmented private health care system, where no central authority can dictate a single solution.
Furthermore, there is a plethora of potential solutions and vendors from which each autonomous practice or health care network can choose. A large institution may implement several different EHR systems, each for a different component of its needs. Countries with governmentally managed health care systems, however, can require the use of particular EHR systems and can even dictate the common data dictionary and database structure to be used, allowing for a much more organized sharing of data within the system.
Of the hundreds of vendors of EHR systems, dozens have significant sales in various segments of the EHR market (hospitals, large group practices, health maintenance organizations, and small practices). In large hospital systems, a handful of dominant vendors have emerged. Recently, consolidation has begun to create vendors whose solutions operate in several different practice settings.
Although some of the systems are relatively modern, it is very common for commercial EHR systems to be built on older, specialized software languages for systems originating in the era of mainframe computers and minicomputers. These EHR systems may not incorporate modern information technology, including Web interfaces, the use of XML, or concepts such as Web services. Unfortunately, much of the interoperability between EHR systems and the potential for EHR/EDC integration would require the use of such tools.
While clinical data managers at pharmaceutical sponsors typically think in terms of highly structured relational database tables, much of the data in EHR systems is far from this standard. EHR systems must incorporate data from many sources, including scanned text documents, transcriptions of dictations, and faxed information from outside practices. In many systems, the key data of interest to pharmaceutical sponsors are entered and maintained as unstructured text, which is unavailable for any reasonable automatic extraction. The only reasonable way to harvest such data for clinical trial statistical analysis is by manual transcription.
Although vendor-developed EHR systems are the norm, some EHR systems in use today were either developed for in-house use at particular hospitals or they were developed as custom solutions for particular institutions. These systems may work very well when developed, but typically are inflexible when applied outside of the original setting. In growing and changing organizations, such systems may be challenging to continue to enhance and support. As newer technologies arrive, it often becomes more attractive in many organizations to adopt a commercial system where development costs are shared across all customers.
The single most important need for integrating EHR and EDC is the elimination of redundant data entry. Typically, in an EDC study the data are entered first into a medical record and then transcribed to an EDC system. At a site with paper medical records, this can often be accomplished without an intermediate step; however, with an EHR system it may be more difficult to have two computers next to one another for the data to be transcribed, or to have two windows open on the same screen. Many sites choose to create an intermediate worksheet for copying the data from the EHR and then again to the EDC system.
The result of multiple transcription steps for the research database is an increase in errors. Data managers know that any time data are copied, a portion may change. These discrepancies reduce data quality and can jeopardize an FDA audit and, potentially, an entire New Drug Application. For the site, copying the data creates more work for the investigator and site staff, along with increased costs.
As more and more physicians move to direct entry of patient data into an EHR, often with the patient in the room, the requirement to transcribe this data into a research data system is being questioned by research physicians. Why not, they ask, just extract the information from the EHR system after it is recorded as part of the visit? Alternatively, the EHR system could display an eCRF for the investigator when they are examining a patient on a protocol. The advantage of this method would be a reminder of which data need to be collected contemporaneously with the patient visit. Others, typically in pharmaceutical data management, wonder whether data collected initially into an EDC eCRF might be transferred in some form to populate an EHR database. Clearly, all of these mechanisms reduce the work an investigator must do in entering patient data and allow for a smoother workflow.
A substantial percentage of the data in a clinical trial already exists in the EHR of an enrolled patient. For example, the demographics, past medical history, concomitant medications, and some baseline measures have been estimated to constitute 5% to 35% of the data in a trial. If it were possible to efficiently harvest this data and transfer it to an EDC or clinical data management system (CDMS), there would be substantial savings in the cost of transcribing and checking such data.
The pharmaceutical sponsor reaps significant benefits from integrating EHR and EDC as well. Data inconsistencies arise whenever data is transcribed. This axiom is well known to data managers and clinical monitors who review such discrepancies regularly. When data are entered once and electronically transferred, these errors are eliminated. In addition, a substantial amount of clinical trial costs is in monitoring, and a substantial percentage of monitoring costs is in the mundane task of source document verification. The tedious comparison of source data with transcribed CRF data is not only resource intensive for on-site research monitors, it is also prone to errors. If source and eCRF data are the same, source document verification becomes unnecessary.
Any integrated EHR/EDC system must meet the specific needs of the investigator and the sponsor, as well as some needs that pertain to both (see Table 2). While investigators are most concerned with an efficient, consistent workflow and the harvesting of legacy data without retyping, they must also meet the requirements of good clinical practice (GCP), especially 21 CFR 312.62(b). Specifically, the investigator is responsible for creating and maintaining case histories on patients, a seemingly simple requirement that is often interpreted in the electronic world as requiring a system whose access is controlled by the investigator and not the sponsor.
Table 2. Needs driving the desire to integrate EHR and EDC
This interpretation derives from the traditional paper-based world of data entry, and is meant to provide checks and balances for the prevention and investigation of fraudulent data. Modern digital technologies such as digital signatures and digital notaries have the potential to provide a substantially higher reassurance that data have not been tampered with after being recorded by the investigator—even if the data were in the hands of a third party or the sponsor.
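As a rough illustration of the tamper-evidence idea, the Python sketch below seals a record with a keyed digest (HMAC-SHA256) at the time of entry and re-checks it later. This is a simplified stand-in: a true digital signature or digital notary would use public-key cryptography or a trusted third-party timestamp. The key, the record fields, and the function names are all hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical secret held only by the investigator site (assumption for illustration).
INVESTIGATOR_KEY = b"site-0042-secret-key"

def seal_record(record: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over a canonical JSON form of the record."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()

def is_untampered(record: dict, tag: str, key: bytes) -> bool:
    """Recompute the tag and compare it to the stored one in constant time."""
    return hmac.compare_digest(seal_record(record, key), tag)

# The investigator seals the record when it is first recorded.
record = {"subject": "S-001", "visit": "baseline", "systolic_bp": 128}
tag = seal_record(record, INVESTIGATOR_KEY)

# A copy held by a third party is later altered.
altered = dict(record, systolic_bp=118)

print(is_untampered(record, tag, INVESTIGATOR_KEY))   # True
print(is_untampered(altered, tag, INVESTIGATOR_KEY))  # False
```

Because the tag depends on every field of the record, any later change to a copy held elsewhere is detectable by whoever holds the key, regardless of where the data are stored.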
The sponsor also must be sure that the systems used to collect data meet the requirements of the electronic record, electronic signature (ERES; 21 CFR Part 11) rule as interpreted under a risk-based approach. Few, if any, EHR systems can meet the burdens of the validation and controls mandated by this rule.
Although EHR systems and EDC systems may collect similar data, an EHR system has essentially none of the functionality required for managing trial data after collection. Furthermore, to add such functionality is a much more significant task than one would expect. A good analogy is a comparison of spreadsheet and database software. On the surface, both make tables with rows and columns. However, to convert one program to the other would require rewriting it. An integration of EHR and EDC would likely rely on the EDC system for managing the process of data cleaning and database lock.
One overall goal of system integration is to eliminate source document verification (SDV) entirely. If the data in the research database is an electronic copy of the data in the EHR system, then there is no need to do SDV, or it can be done automatically through digital data comparison. However, much of the older data will likely require SDV, and many sites will first record some values in another place, such as a worksheet, a separate chart, or a personal digital assistant. A company or monitor must determine on a site-by-site basis which data has a separate source and which has an electronic source. This may not be immediately obvious to the monitor and may not be accurately understood by the site, leading to an increased likelihood of data discrepancies uncovered at an FDA audit.
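The automatic comparison mentioned above could, in its simplest form, be a field-by-field check of the source values against the eCRF values. The sketch below assumes both systems can export a visit as a flat dictionary keyed by the same field names, which is a strong assumption given the mapping problems discussed elsewhere in this article; the field names and values are invented.

```python
from typing import Any

def compare_source_to_ecrf(source: dict[str, Any], ecrf: dict[str, Any]) -> list[tuple[str, Any, Any]]:
    """Return (field, source_value, ecrf_value) for every field that differs.

    A field present on only one side is reported with None for the missing side.
    """
    discrepancies = []
    for name in sorted(set(source) | set(ecrf)):
        src_val = source.get(name)
        crf_val = ecrf.get(name)
        if src_val != crf_val:
            discrepancies.append((name, src_val, crf_val))
    return discrepancies

# Hypothetical exports of the same visit from the EHR and from the EDC system.
ehr_values = {"weight_kg": 82.5, "heart_rate": 71, "systolic_bp": 128}
ecrf_values = {"weight_kg": 85.2, "heart_rate": 71, "systolic_bp": 128}

for name, src, crf in compare_source_to_ecrf(ehr_values, ecrf_values):
    print(f"{name}: source={src} eCRF={crf}")
```

A check like this only covers data whose source is electronic; values first recorded on a worksheet or separate chart would still need manual verification.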
The most obvious solution to integrate patient care data with clinical research is to directly transfer data from an EHR system into an EDC system. This would have the least impact on an investigator, as they would enter the data exactly as they normally do into the EHR. But the reality is far more complex.
The process of clinical trial data management is an orchestrated, highly controlled, interactive process of data review, revision, and sign-off that is highly specific to clinical trials.2 The data entry components of EHR systems, however, are designed to accept structured and unstructured data gathered about a patient at any time the patient is evaluated. The result is very different workflows, a very different user interface, and very different database requirements.
The interface of an EHR system does not typically include a form-based input for the type of structured data collection typical for EDC. While it is possible to add this to any particular system, such an eCRF add-in would have to include forms and rules specified by a central, sponsor-based system. This would require a broad acceptance of standards within both the EHR and EDC vendors, and the ability for individual investigators to impose changes on large hospital EHR systems for a particular trial. These capabilities currently do not exist.
Any significant delays imposed by software programming or data integration would be difficult to accommodate in any enterprise-level integration of the EHR and EDC systems. It is highly unlikely that a pharma sponsor would be given access to any site EHR systems for integration activities because of the privacy and technology independence of the site IT systems. Furthermore, such mapping/integration could require very intensive IT resources at a single site, outweighing any advantages that the EHR/EDC integration would bring.
An interface to collect ongoing clinical trial data is only one part of the integration problem. Another issue is the harvesting and collection of legacy data, such as demographics, concomitant medications, and past medical histories. This task is hampered by the structure of many of the EHR databases. Clinical data in EHR systems is collected in a variety of formats, from faxes and scans of handwritten annotations and printed data, to transcriptions of oral dictations, to large text fields. Much, if not most, of this data is not available or even extractable into the type of data that is usable for clinical trial statistical data analysis. In addition, collection of data in an EHR system often does not have the same level of metadata—the typical who, what, when, where, and why collected with each data entry or change—that EDC systems use to manage and report on operational activities in a clinical trial.
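To make the metadata point concrete, here is a minimal sketch of the kind of audit trail an EDC system keeps for each data point: every entry or change records who, what, when, where, and why. The class and field names are illustrative assumptions and are not taken from any particular EDC product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional

@dataclass(frozen=True)
class AuditEntry:
    who: str          # user who made the entry or change
    what: str         # name of the data item affected
    old_value: Any
    new_value: Any
    when: datetime    # timestamp of the change
    where: str        # site or workstation
    why: str          # reason for the entry or change

@dataclass
class AuditedItem:
    name: str
    value: Any = None
    history: list = field(default_factory=list)

    def update(self, new_value: Any, who: str, where: str, why: str,
               when: Optional[datetime] = None) -> None:
        """Record the change in the history, then apply it."""
        entry = AuditEntry(who=who, what=self.name, old_value=self.value,
                           new_value=new_value,
                           when=when or datetime.now(timezone.utc),
                           where=where, why=why)
        self.history.append(entry)
        self.value = new_value

# Hypothetical users and site identifiers.
weight = AuditedItem("weight_kg")
weight.update(85.2, who="coordinator_a", where="site 0042", why="initial entry")
weight.update(82.5, who="investigator_b", where="site 0042", why="transcription error corrected")

for e in weight.history:
    print(e.who, e.old_value, "->", e.new_value, "|", e.why)
```

An EHR that stores only the current value, or that records changes without a reason, cannot supply this history to the EDC system after the fact.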
Finally, the EHR system would have to be able to transmit any changes made to the original data, as well as additional patient data entered at unscheduled visits, whether to the investigator or to third-party clinicians not involved in the trial. This would involve a variety of technical, consent, and other logistical hurdles, and would require an ongoing connection between the EDC and EHR systems that goes beyond a "per visit" connection.
Three fundamental solutions have been proposed for integrating EHR and EDC systems. These solutions are described in the following sections and indicated in Table 3.
The EDC system itself could be used for recording data from patients in clinical trials, and the data then communicated to the EHR system. Ideally, some trigger in the EHR system would automatically open the EDC system for the investigator, or would open an eCRF portion of the EHR system, as previously described, with the data recorded in the EDC database. This data would have to be transmitted to the EHR system using data standards, such as the combined CDISC/HL7 standard currently being developed. The data could be mapped by specific field or converted into a text-based narrative that could be included in the EHR system. In either case, the EHR system at each site would have to allow for the importation of patient-specific data tagged with content identifiers, a functional capability that is not yet available in most systems.
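The two transfer options described above, mapping by specific field or converting to a narrative, can be sketched as follows. The item names, the EHR field identifiers, and the mapping table are hypothetical; a real exchange would follow whatever the agreed data standard specifies rather than this ad hoc format.

```python
# Hypothetical mapping from EDC item names to EHR field identifiers.
EDC_TO_EHR_FIELDS = {
    "SYSBP": "vitals.systolic_bp",
    "DIABP": "vitals.diastolic_bp",
    "HR": "vitals.heart_rate",
}

def map_fields(edc_items: dict) -> tuple[dict, dict]:
    """Split EDC items into those with a known EHR target and those without."""
    mapped, unmapped = {}, {}
    for item, value in edc_items.items():
        target = EDC_TO_EHR_FIELDS.get(item)
        if target is None:
            unmapped[item] = value
        else:
            mapped[target] = value
    return mapped, unmapped

def to_narrative(subject_id: str, visit: str, edc_items: dict) -> str:
    """Flatten all items into one text note for EHRs that cannot import fields."""
    parts = [f"{item}={value}" for item, value in sorted(edc_items.items())]
    return f"Study visit '{visit}' for subject {subject_id}: " + "; ".join(parts) + "."

visit_items = {"SYSBP": 128, "DIABP": 82, "HR": 71, "AESEV": "mild"}

mapped, unmapped = map_fields(visit_items)
print(mapped)
print(unmapped)
print(to_narrative("S-001", "week 4", visit_items))
```

Field mapping preserves structure but needs a mapping table for every EHR installation, and items with no target are left over. The narrative form loses structure but can be imported by any EHR that accepts a text note.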
The major issues with electronic capture of patient data involve compliance with regulations, both FDA GCP and HIPAA privacy regulations. The mandate of 21 CFR 312.62(b) requires that an investigator create and maintain case histories on patients. The standard EDC system is now Web-based, with data housed either at the sponsor or more commonly at a third party, neither of which can accept a transfer of investigator responsibilities. This issue is currently the topic of much discussion and debate in the FDA and clinical research community and will likely be resolved over the next several years.
Direct EDC data collection does not eliminate the problems associated with identifying and transferring legacy, lab, third-party, and changed data that may occur outside of a visit window in an EHR system.
Another possible scenario for data collection that integrates EDC and EHR is the development of a simple, third application that would reside on the investigator's computer. This would serve to collect the data and store it locally, while transmitting a copy to both the EDC and EHR system. In fact, a pilot of this architecture has been performed under the CDISC Single Source project.3 The increased complexity of a third application makes this solution difficult to scale, and requires validation and management of the application on the desktop of an investigator. Furthermore, the data will then need to be transmitted and imported into both EDC and EHR systems. The previously raised issues of coordinating data changes and managing legacy and other third-party data are certainly no less complicated with this architecture.
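The store-locally-and-forward pattern of such a third application might look like the sketch below. It is not a description of the Single Source pilot's actual design; the class, the sender functions, and the retry behavior are assumptions made purely to show where the extra complexity comes from.

```python
import json
from typing import Callable

class SingleSourceCapture:
    """Keep a local copy of each record and forward a copy to every destination."""

    def __init__(self, senders: dict[str, Callable[[str], bool]]):
        self.local_store: list[str] = []
        self.senders = senders
        # Payloads not yet accepted by each destination.
        self.pending: dict[str, list[str]] = {name: [] for name in senders}

    def capture(self, record: dict) -> None:
        payload = json.dumps(record, sort_keys=True)
        self.local_store.append(payload)      # the local copy is written first
        for name in self.senders:
            self.pending[name].append(payload)
        self.flush()

    def flush(self) -> None:
        """Try to deliver pending payloads; keep any that a sender rejects."""
        for name, send in self.senders.items():
            self.pending[name] = [p for p in self.pending[name] if not send(p)]

# Stand-in senders: the EDC accepts the data, the EHR is unreachable.
edc_inbox: list[str] = []

def send_to_edc(payload: str) -> bool:
    edc_inbox.append(payload)
    return True

def send_to_ehr(payload: str) -> bool:
    return False  # simulate a failed import into the EHR

app = SingleSourceCapture({"edc": send_to_edc, "ehr": send_to_ehr})
app.capture({"subject": "S-001", "heart_rate": 71})

print(len(app.local_store), len(edc_inbox), len(app.pending["ehr"]))
```

Even this toy version has to track delivery state per destination. A validated desktop application would also need to handle corrections, conflicting edits, and reconciliation among three copies of the same data.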
Some EHR vendors have suggested that their software is available at many clinical research sites. By working collaboratively with an EDC vendor, or by incorporating EDC functionality in their system, they argue that pharmaceutical companies could perform a clinical trial limited to their sites. Although this might work in pilot trials, it is not a general solution, as no EHR system dominates the market and pharmaceutical companies are very unlikely to be willing to have site selection limited by their choice of EHR system. This is especially true when EHR systems are specialized for type of practice setting; few Phase III clinical trials can restrict themselves to a particular practice setting.
Finally, it has been suggested that pharmaceutical data management organizations might be willing to collect multiple databases, one from each site or from each "type" of EHR system, and aggregate these to create the research database. Most who are familiar with the rigor of clinical trial data management are very wary of this possibility. Even if the setup of the database could be standards-driven, the task of combining multiple databases is fraught with difficulties and can add significant time and cost while reducing quality. In fact, this very mechanism was attempted almost a decade ago with the rise of independent site management organizations that wanted to collect and manage clinical trial data using their own data collection tools. Very few, if any, clinical trials were performed with data collected in this way.
Table 3 summarizes key requirements for an EHR/EDC integration and judges the three outlined solutions against them. None of the solutions meets the proposed requirements.
Table 3. Measuring solutions against requirements of an integrated EHR/EDC system
The closest we can currently come to an ideal solution would involve the extension of existing standards to incorporate a representation of an eCRF and simple edit checks. EHR vendors would need to create a method to display the eCRF within their system and transmit the data and metadata simultaneously to the research and clinical trial database, again using data standards. EDC vendors would need to accept such standardized representation of data and metadata. The subsequent data cleaning would be done strictly through the EDC system. Notably, this does not address the transmission of legacy data or the coordination of data changes after initial entry. It is likely that a close collaboration between EDC and EHR vendors could address and solve this problem.
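A standardized representation of an eCRF with simple edit checks could be as small as a list of item definitions plus a generic checker that any EHR vendor could run at the point of entry. The sketch below is only an illustration of that idea: the item names, ranges, and message wording are invented and do not come from any existing standard.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ItemDef:
    name: str
    label: str
    required: bool = True
    minimum: Optional[float] = None   # lowest plausible value, if any
    maximum: Optional[float] = None   # highest plausible value, if any

# Hypothetical form definition that a sponsor would distribute to sites.
VITALS_FORM = [
    ItemDef("SYSBP", "Systolic blood pressure (mmHg)", minimum=60, maximum=260),
    ItemDef("HR", "Heart rate (beats/min)", minimum=20, maximum=250),
    ItemDef("COMMENT", "Comment", required=False),
]

def run_edit_checks(form: list, entered: dict) -> list:
    """Return one query message per failed required-field or range check."""
    queries = []
    for item in form:
        value = entered.get(item.name)
        if value is None:
            if item.required:
                queries.append(f"{item.name}: value is required")
            continue
        if item.minimum is not None and value < item.minimum:
            queries.append(f"{item.name}: {value} is below {item.minimum}")
        if item.maximum is not None and value > item.maximum:
            queries.append(f"{item.name}: {value} is above {item.maximum}")
    return queries

print(run_edit_checks(VITALS_FORM, {"SYSBP": 300}))
```

Only checks this simple are likely to be portable across EHR systems; cross-form checks, query management, and database lock would stay in the EDC system, as the paragraph above suggests.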
The combination of EHR and EDC is a very important initiative that can deliver great value for all involved. However, it is an initiative that will require hard work and offer limited immediate benefits. Some will see it as the Holy Grail, while others will see it as tilting at windmills. Only a measured, realistic approach that falls somewhere between these views will bring some of the benefits we all desire.
Paul Bleicher, MD, PhD, is chairman and founder of Phase Forward, 880 Winter St., Waltham, MA 02451, (781) 902-4302, email: paul.bleicher@phaseforward.com
1. Medical Records Institute's Seventh Annual Survey of Electronic Health Record Trends and Usage for 2005, http://www.medrecinst.com/files/ehrsurvey05.pdf.
2. P.A. Bleicher, "Clinical Trial Technology: At the Inflection Point," Biosilico, 1 (5) 163–168 (2003).
3. CDISC Single Source Project, http://www.cdisc.org/single_source/about.html.