Outlining the techniques for anonymization of clinical study reports and the identification and redaction of commercially confidential information to comply with EMA's Policy 0070 on trial data disclosure and transparency.
The European Medicines Agency (EMA) is committed to continuously extending its approach to clinical trial data transparency. In October 2014, the agency released Policy 0070/2014, with the aim of making medicine development more efficient, fostering public scrutiny of clinical study information by the scientific community, and developing knowledge in the interest of public health, while promoting a better-informed use of medicines.1 According to EMA, “A high degree of transparency will take regulatory decision-making one step closer to EU citizens, and promote better-informed use of medicines. […] access to clinical data will benefit public health in future.”1
The scope of EMA Policy 0070
The scope of the EMA policy on publication of clinical data for medicinal products for human use1 relates to proactively sharing study-level and patient-level clinical data (i.e., clinical reports and individual patient data [IPD]) submitted under the centralized marketing authorization procedure after Jan. 1, 2015.
The policy serves as a complementary tool ahead of the implementation of the EU Clinical Trial Regulation No. 536/2014.2
The policy does not concern:
These clinical data continue to be made available to external requesters on a reactive basis in accordance with the EMA’s policy on access to documents related to medicinal products for human and veterinary use (POLICY/0043) effective from Dec. 1, 2010.3
The publication procedure
The procedure goes through two sequential phases:
The publication procedure for clinical reports is based on two pillars:
The policy establishes methods for balancing the protection of patients’ privacy, through the anonymization/de-identification of protected personal data (PPD), with the sharing of clinical trial data. It also identifies the topics that may be considered commercially confidential information (CCI) and proposed for redaction.
Part 1 - Anonymization of clinical reports for publication
On July 6, 2015, EMA issued a guidance to the pharmaceutical industry on anonymization of clinical reports,4 in the context of Phase 1 of the policy (i.e., the publication of clinical reports on the EMA website). The guidance, whose terms and indications were briefly anticipated during an EMA webinar on June 24, 2015, aims to assist companies by recommending methods, techniques, and processes that could be applied to clinical reports in order to achieve adequate anonymization while retaining a maximum of scientifically useful information on medicinal products for the benefit of the public. A new release of this EMA guidance, accompanied by a summary of changes,5 was issued April 11, 2017.6
Marketing authorization holders (MAHs)/applicants have the responsibility for submitting clinical reports that were previously rendered anonymous/de-identified.
Anonymization techniques4,6
The data in the clinical reports must be processed in such a way that they can no longer be used to identify a natural person by using “all the means likely reasonably to be used” by either the controller or a third party.7,8
The same data can be adequately anonymized in different ways, depending on the context of the data release. When selecting the most appropriate technique, the specificities of the clinical data should be taken into consideration.
Anonymization is a rapidly evolving field of active research that offers MAHs/applicants several techniques, each with its own strengths and weaknesses. According to the Article 29 Working Party Opinion,8 the techniques that could be applicable to clinical reports are:
Randomization and generalization techniques are recommended in order to optimize the clinical usefulness of the information published.6
Options to establish data set anonymization
Two options are available to establish if the dataset is anonymized:8
1. Demonstrate that, after anonymization, the following actions are no longer possible:
It is up to the sponsor, taking due account of the ultimate purpose and use of the clinical reports, to decide which option to use: either demonstrate that, after anonymization, all three criteria (singling out, linkability, and inference) are fulfilled, or perform a risk assessment.6
The sponsor is also in charge of deciding which anonymization techniques to use in order to achieve adequate anonymization, while retaining a maximum of scientifically useful information. The legislation is not prescriptive about the techniques to be used by data controllers.8
i In the context of Phase 1 of Policy 0070, the dataset is the set of clinical reports published by the EMA.
2. Perform an analysis of re-identification risk.
It is important to note that de-identification does not reduce the risk of re-identification of a data set to zero. Rather, the process produces data sets for which the risk of re-identification is very small.9
There are in fact three plausible re-identification attacks on the data by an adversary that need to be protected against, as summarized in Table 2.10
Measuring the risk of re-identification involves selecting an appropriate metric, a suitable threshold and the actual measurement of the risk in the clinical data information to be disclosed. The choice of a metric depends on the context of data release.
Setting an acceptable threshold encompasses:
Once a threshold has been determined, the actual probability of re-identification can be measured. MAHs/applicants are encouraged to use quantitative methods to measure the risk of re-identification as soon as they are in a position to do so.6
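As a rough illustration of the measurement step described above, the following minimal Python sketch groups records by their quasi-identifier values and derives a maximum and an average probability of re-identification from the group sizes. The field names, the sample records, and the threshold value are hypothetical assumptions chosen for illustration; they are not taken from the EMA guidance, and this is only one of several possible metrics.

```python
from collections import Counter


def reidentification_risk(records, quasi_identifiers):
    """Return (maximum risk, average risk) across all records.

    Records sharing the same quasi-identifier values form an equivalence
    class; a record in a class of size k is assigned a risk of 1/k.
    """
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    class_sizes = Counter(keys)
    per_record = [1.0 / class_sizes[k] for k in keys]
    return max(per_record), sum(per_record) / len(per_record)


# Hypothetical participant records (already generalized).
records = [
    {"age_band": "40-49", "sex": "F", "country": "IT"},
    {"age_band": "40-49", "sex": "F", "country": "IT"},
    {"age_band": "40-49", "sex": "F", "country": "IT"},
    {"age_band": "50-59", "sex": "M", "country": "DE"},
    {"age_band": "50-59", "sex": "M", "country": "DE"},
]

THRESHOLD = 0.25  # illustrative assumption, to be set by the data controller

max_risk, avg_risk = reidentification_risk(
    records, ["age_band", "sex", "country"]
)
acceptable = max_risk <= THRESHOLD
```

In this toy data set the smallest group holds two records, so the maximum risk exceeds the illustrative threshold and further generalization or suppression would be needed before release.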
EMA recommendation to best achieve anonymization of PPD of trial participants
There are several sections with data results in clinical reports that may contain personal data of trial participants. These include:
In general, clinical overviews (CTD mod. 2.5) and clinical summaries (CTD mod. 2.7) do not contain personal data related to trial participants, with the exception of the Narratives of the Clinical Summary. In addition, some of the tables included in the clinical overviews and clinical summaries may also contain personal data.
Anonymization of direct identifiers and quasi identifiers
A classification of variables into direct and quasi-identifiers for clinical trials has been completed by a PhUSE (Pharmaceutical Users Software Exchange) working group.11
Any direct identifiers (e.g., name, email, phone number, social security number, signature, full address, clinical trial participant numbers, and medical device serial numbers) should be removed, or in the case of unique identifiers, like Patient ID numbers, at least pseudonymizedii. There are established standards for such pseudonymization.12
The Guidance Regarding Methods for De-identification of Protected Health Information in Accordance with the Health Insurance Portability and Accountability Act (HIPAA) identifies 18 direct identifiers within the frame of the Safe Harbor method.13
ii Pseudonymization consists of replacing one attribute (typically a unique attribute) in a record by another. The natural person is still likely to be identified indirectly. Pseudonymization reduces the linkability of a dataset with the original identity of a data subject.
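The replacement of a unique attribute described in the note above can be sketched in a few lines of Python. This example uses a keyed hash (HMAC-SHA256) from the standard library, which is one possible approach and not a method prescribed by the EMA guidance or by the cited standard; the key, the identifier format, and the function name are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(participant_id: str, secret_key: bytes, length: int = 12) -> str:
    """Map a participant ID to a stable pseudonym using a keyed hash.

    The same ID always yields the same pseudonym, so records can still be
    linked within the report, but the original ID is not shown.
    """
    digest = hmac.new(
        secret_key, participant_id.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return "P-" + digest[:length].upper()


# Hypothetical key; in practice it would be stored separately from the reports.
SECRET_KEY = b"example-key-kept-outside-the-published-reports"

original_ids = ["SITE01-0007", "SITE01-0008", "SITE01-0007"]
pseudonyms = [pseudonymize(pid, SECRET_KEY) for pid in original_ids]
```

Because the participant can still be identified indirectly through other variables, a step like this would only complement, not replace, the treatment of quasi-identifiers.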
Quasi-identifiers, which consist mostly of dates, location information, demographics, socioeconomic information, rare diagnoses, concomitant illnesses and medications, and serious adverse events such as death, hospitalization, and birth defects, cannot just be removed as these variables are very useful for the analysis.10 More sophisticated techniques (e.g., generalization) need to be applied to retain the value of these variables but also reduce the probability that these variables can re-identify participants.14 The need to redact quasi identifiers will depend on the following aspects:
It is up to the MAH/applicant to decide which quasi identifiers need to be redacted and which could remain in the reports. The rationale for the decision should be included in the risk assessment section of the anonymization report to be provided to EMA.6
A detailed approach to a wider set of de-identification techniques for quasi-identifiers is available in El Emam et al, 2015.10
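To make the idea of generalization more concrete, the sketch below shows two common transformations of quasi-identifiers: replacing an exact age with an age band, and replacing a calendar date with a day count relative to a per-participant reference date. The band width, the choice of reference date, and all field names are illustrative assumptions rather than recommendations from the guidance.

```python
from datetime import date


def generalize_age(age: int, width: int = 10) -> str:
    """Replace an exact age with a band such as '40-49'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def relative_day(event_date: date, reference_date: date) -> int:
    """Replace a calendar date with the number of days since a reference date."""
    return (event_date - reference_date).days


# Hypothetical participant record.
record = {
    "age": 47,
    "first_dose": date(2016, 3, 1),
    "sae_onset": date(2016, 3, 15),
}

generalized = {
    "age_band": generalize_age(record["age"]),
    "sae_onset_day": relative_day(record["sae_onset"], record["first_dose"]),
}
```

Both transformations keep the information most analyses need (approximate age, timing relative to treatment) while making the values harder to match against external sources.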
De-identification and data sharing: Other available standards
While considering the existing EMA guidance, other recommendations have been produced with the aim of promoting an approach that balances data utility and privacy risk and is applicable across clinical trial data holders.13 The HIPAA Privacy Rule15 describes two general approaches to de-identification, exemplified by two different methods:
Both independent authors16,17,18,19,20 and private organizations21,22 have published their own opinions on responsible clinical data sharing, while outlining the evolving role of statisticians in data sharing and in its success.23 In particular, TransCelerate BioPharma Inc. published a guidance, Data De-identification and Anonymization of Individual Patient Data in Clinical Studies. Other standards on anonymization are available.11,24,25
Personal data of individuals other than trial participants
The EMA performed a privacy impact assessment (PIA) to establish the functionalities of the database, in particular with regard to the data fields to be made publicly accessible.
Personal data of individuals other than patients (i.e., investigators, sponsor staff, and MAH/applicant staff) will not be published, with the exception of the sponsor and coordinating investigator signatories of the clinical study report and the identities of the investigator(s) who conducted the trial and their sites.
In any case, the contact details and signatures of these individuals should be redacted. Data pertaining to the above exception in other parts of the clinical study report (CSR) will be redacted, as they may give away geographical information (e.g., site number, site address, investigator names) that could be linked to patients and, hence, may enable their identification.6 It was noted that the redactions published until April 2017 were largely inconsistent with the EMA guidance in this regard.26
Part 2 - Identification and redaction of commercially confidential information
On Dec. 19, 2016, EMA published a new release of the External guidance on the identification and redaction of commercially confidential information (CCI) in clinical reports submitted to the agency for the purpose of publication in accordance with EMA Policy 0070.27 The key contents of these guidelines were anticipated during the EMA webinar28,29 concerning Policy 0070.
The guidance is a working tool and a reference document for pharmaceutical companies, aimed at supporting them in the preparation of their justifications regarding CCI in documents that fall under the scope of Policy 0070. Annex 3 of the policy provides MAHs/applicants with “redaction principles” to identify certain types of information that can potentially be considered CCI.1 EMA will scrutinize the justification for the redaction of CCI in order to assess whether the definition of CCI applies.
While providing a comprehensive overview and concise details on how the redaction of CCI is to be handled within the context of Policy 0070, the guidance ensures a common understanding of what can or cannot be considered CCI within clinical reports. It also aims to ensure good-quality justifications for the proposed redactions.
Points to consider for the preparation of the redaction proposal of a clinical report
CCI shall mean any information contained in the clinical reports submitted to the EMA by the MAH/applicant that is not in the public domain or publicly available and where disclosure may undermine the legitimate economic interest of the MAH/applicant.26
Prior to proposing any redactions, the MAH/applicant should be aware of the level of information already available in the public domain concerning its product’s development, scientific knowledge, and advancements within the relevant therapeutic area(s). Such preparatory work by the MAH/applicant is essential, as it enables an expedited consultation process, and thereby reduces the probability that EMA will reject proposed redactions because the information is already in the public domain.
Information that EMA does not consider CCI
Information that may be considered CCI
The information listed in Policy 0070 Annex III1 may be considered CCI and, therefore, its proposed redaction must be adequately justified.
The Redaction Principles should not be perceived by the MAH/applicant as an open and unconditional invitation to propose, on a regular basis, the redaction of information.
If the MAH/applicant identifies a piece of information (a word or figure, part of a sentence, or part of a paragraph) that it wishes to include among the proposed redactions, it has to ensure that the information in question:
The Justifications suggested in Annex III are not considered relevant, and, therefore, will be rejected.26
The MAH/applicant is discouraged from proposing the redaction of entire pages, subsections of a report, or full tables, especially when, in their view, only some sentences within the text or some specific figures within the tables fall under the types of information described in Annex 3.
The justification table and its use
The EMA considers the Justification Table a living document reflecting the justifications the company puts forward and the EMA’s conclusions (see Figure 1).
According to Annex III, the justification table should contain justifications for all pieces of text considered as CCI and proposed for redaction. Should the company highlight a piece of text proposed for redaction, but fail to explain its redaction in the justification table, the proposal will be considered invalid and sent back to the company for clarification.iii
The justification table is used as a communication tool between EMA and the sponsor during the whole redaction consultation process.
Each table should list all proposed CCI redactions of a clinical report, and should be fully completed by the MAHs/applicants. The justification table is not part of the documents to be published.
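The completeness rule described above (every highlighted piece of text needs a matching justification) lends itself to a simple pre-submission check. The following Python sketch is only an illustration of that idea: the record structure, field names, and sample entries are hypothetical and do not reproduce the layout of the EMA justification table.

```python
from dataclasses import dataclass


@dataclass
class ProposedRedaction:
    """One proposed CCI redaction in a clinical report (hypothetical structure)."""
    report: str
    page: int
    text: str
    justification: str = ""


def missing_justifications(redactions):
    """Return the proposed redactions that have no justification text."""
    return [r for r in redactions if not r.justification.strip()]


# Hypothetical entries for one clinical report.
proposals = [
    ProposedRedaction(
        report="CSR-001",
        page=12,
        text="assay detail A",
        justification="Unpublished analytical method; see sponsor rationale.",
    ),
    ProposedRedaction(report="CSR-001", page=48, text="assay detail B"),
]

incomplete = missing_justifications(proposals)
```

Running such a check before submission would flag the second entry, which under the process described above would otherwise be considered invalid and returned for clarification.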
Expected level of details for the justification
The EMA External guidance26 underlines that applicants are expected to submit a specific, pertinent, relevant, not overstated, and appropriate justification for each of the pieces of text proposed to be redacted.
The justification wording has to meet the following criteria:
According to a recent paper by El Emam,26 “It is evident from the redaction approaches that have been applied thus far that the manufacturers have erred toward being more conservative and tilting the privacy/utility balance toward protecting patient privacy.”
iii Each submitted clinical report requires a separate justification table that has to be submitted as a Word document. Accordingly, the MAH/applicant is expected to indicate clearly which justification table corresponds to which clinical report.
Evaluation process of the proposed CCI redactions
If EMA considers the justification insufficiently detailed, additional clarifications will be requested. Failure to provide the requested clarifications within a reasonable time frame would render the available justification insufficient.
Should the agency consider the provided justification not sufficiently specific or too vague, the following rejection code will be included in the justification table: CCI – Rejection 04 – Insufficient justification.
Whenever the justification provided by the MAH/applicant does not match the (type of) information proposed for redaction (i.e., is not relevant to the information proposed to be redacted), the following rejection code will be used in the justification table: CCI – Rejection 05 – Irrelevant justification.
The current debate on the nature of CCIs
It is worth mentioning that the debate on the definition of CCI is still open. A draft policy EMA published in June 201330,31 suggested that CSRs do not contain CCI and, therefore, could be released with no redactions. Later, commenting on the conclusions of the European Ombudsman about the EMA’s partial refusal to give public access to studies related to the approval of a medicinal product,32,33 the agency’s spokesman argued that there is no agreed or binding definition of CCI in European Union legislation, and that its own guidance “makes clear that the vast majority of the information contained in clinical reports is not considered CCI. The guidance clarifies which type of data the EMA would typically refuse as being CCI and how the redaction of such data will be handled.”34
Direct experience of anonymization and redaction
In our experience, it is quite difficult to put into practice the theoretical principles expressed in the EMA Guidance and in the other reference documents available so far. This is due not only to the amount of information within the clinical reports that can be subject to interpretation, but also to the lack of practical examples in the guidelines. The most recent update of the External guidance shows an improvement in this direction.6
When anonymizing and redacting the information contained in clinical trial synopses, we agreed with the sponsors to de-identify:
The only exception in the anonymization of this information was:
We suggest leaving the investigators’ titles and positions within their organizations/institutions unredacted.
As far as redaction is concerned, we agreed with the sponsors to prudently redact the batch numbers of the test product and the reference therapies, along with their respective expiry/recheck dates, to avoid any possible identification.
In the case of clinical reports publication under Policy 0070, all clinical reports submitted as part of a regulatory application are subject to publication and, therefore, need to be redacted. Specific attention should be paid to the leaf title naming in the index XML of the eCTD submission and to the corresponding file names for the PDF documents.
The completion of the redaction procedure also involves the editing of:
Conclusions
The publication of clinical data for medicinal products for human use is one of the clinical trial transparency (CTT) procedures that MAHs/applicants are expected to cope with from now on. Nevertheless, only some of them appear to have a clear view of this complex and time-consuming activity. It is our opinion that CTT will impact both the regulatory status of medicinal products and, in the mid-term, the MAH/applicant’s reputation. We, therefore, encourage MAHs/applicants to dedicate the needed resources to CTT activities. These include uploading clinical trial information to the EudraCT platform, the publication of CSRs supporting centralized MAAs, the editing of layperson summaries, and the disclosure of patient-level data in response to specific requests from the scientific community.
M. Zaninelli, MA, and E. Ornago, MSc, both with Maxer Consulting s.r.l.; A. Ferrari, MD, PhD, with Erydel S.p.A.
References
1. EMA policy on publication of clinical data for medicinal products for human use (Policy 0070). http://www.ema.europa.eu/docs/en_GB/document_library/Other/2014/10/WC500174796.pdf
2. Regulation (EU) No 536/2014 of the European Parliament and of the Council of 16 April 2014 on clinical trials on medicinal products for human use, and repealing Directive 2001/20/EC http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32014R0536&from=it
3. European Medicines Agency policy on access to documents (related to medicinal products for human and veterinary use) - POLICY/0043 http://www.ema.europa.eu/docs/en_GB/document_library/Other/2010/11/WC500099473.pdf
4. Dias M. Guidance on the anonymization of clinical reports for the purpose of publication in accordance with policy 0070. http://www.ema.europa.eu/docs/en_GB/document_library/Presentation/2015/09/WC500194087.pdf
5. Summary of changes to the “External guidance on the implementation of the European Medicines Agency policy on the publication of clinical data for medicinal products for human use” http://www.ema.europa.eu/docs/en_GB/document_library/Other/2017/04/WC500225879.pdf
6. External guidance on the anonymization of clinical reports for the purpose of publication in accordance with EMA Policy http://www.ema.europa.eu/docs/en_GB/document_library/Regulatory_and_procedural_guideline/2017/04/WC500225880.pdf
7. Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data, http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:31995L0046:en:HTML
8. The Working Party on the Protection of Individuals with regard to the Processing of Personal Data. Article 29 Data Protection Working Party. WP216. Opinion 05/2014 on Anonymization Techniques. 2014. http://ec.europa.eu/justice/data-protection/article-29/documentation/opinion-recommendation/files/2014/wp216_en.pdf
9. Information and Privacy Commissioner of Ontario, De-identification Guidelines for Structured Data, 2016. https://www.ipc.on.ca/wp-content/uploads/2016/08/Deidentification-Guidelines-for-Structured-Data.pdf
10. El Emam K et al, De-identifying Clinical Trials Data, Applied Clinical Trials, 2015, http://www.appliedclinicaltrialsonline.com/de-identifying-clinical-trials-data
11. PhUSE De-Identification Working Group, De-Identification Standards for CDISC SDTM 3.2, 2015, http://www.phusewiki.org/docs/Conference%202015%20DH%20Papers/DH01.pdf
12. Health informatics. Pseudonymization, ISO, International Standard ISO/TS 25237:2008, 2008, Data on file
13. Tucker K et al, Protecting patient privacy when sharing patient-level data from clinical trials, BMC Medical Research Methodology 2016, 16(Suppl 1):77, https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-016-0169-4
14. El Emam K and Arbuckle L, Anonymizing Health Data: Case Studies and Methods to Get You Started. O’Reilly, 2013, http://ebook-dl.com/item/anonymizing-health-data-case-studies-khaled-el-emam
15. US Department of Health and Human Services. Guidance Regarding Methods for Deidentification of Protected Health Information in Accordance with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule 2012. https://www.hhs.gov/hipaa/for-professionals/privacy/special-topics/de-identification/
16. Aggarwal CC, Yu PS. A General Survey of Privacy-Preserving Data Mining Models and Algorithms. In: Aggarwal CC, Yu PS, editors. Privacy-Preserving Data Mining: Models and Algorithms. Boston, MA: Springer US; 2008. p. 11–52., Data on file
17. Mello MM, Francer JK, Wilenzick M, Teden P, Bierer BE, Barnes M. Preparing for Responsible Sharing of Clinical Trial data. N Engl J Med. 2013;369:1651–8. http://www.nejm.org/doi/pdf/10.1056/NEJMhle1309073
18. Hughes, S., Wells, K., McSorley, P. and Freeman, A. Preparing individual patient data from clinical trials for sharing: the GlaxoSmithKline approach. Pharmaceut. Statist. 2014; doi: 10.1002/pst.1615. Data on file
19. Gibson B, Multi-Sponsor Data Transparency: A Group Approach To Sharing, Phuse, 2014, http://www.phusewiki.org/docs/Conference%202014%20TT%20Papers/TT04.pdf
20. Taichman DB, Backus J, Baethge C, Bauchner H, de Leeuw PW, Drazen JM, Fletcher J, Frizelle FA, Groves T, Haileamlak A. Sharing Clinical Trial Data: A Proposal from the International Committee of Medical Journal Editors. PLoS Med. 2016;13(1):e1001950. http://www.nejm.org/doi/pdf/10.1056/NEJMe1515172
21. European Federation of Pharmaceutical Industries and Associates (EFPIA) – PhRMA. Principles for Responsible Clinical Trial Data Sharing: Our Commitment to patients and researchers. 2013. http://www.phrma.org/sites/default/files/pdf/PhRMAPrinciplesForResponsibleClinicalTrialDataSharing.pdf
22. Transcelerate - Data De-identification and Anonymization of Individual Patient Data in Clinical Studies – A Model Approach, http://www.transceleratebiopharmainc.com/wp-content/uploads/2015/04/CDT-Data-Anonymization-Paper-FINAL.pdf
23. Manamley N, Mallett S, Sydes MR, Hollis S, Scrimgeour A, Burger HU, Urban HJ. Data sharing and the evolving role of statisticians. BMC Med Res Methodol. 2016 Jul 8;16 Suppl 1:75. doi: 10.1186/s12874-016-0172-9, http://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-016-0172-9
24. Information Commissioner’s Office (ICO) Code of Practice. Anonymization: managing data protection risk https://ico.org.uk/media/1061/anonymisation-code.pdf
25. Sharing clinical trial data: Maximizing benefits, minimizing risk. Institute of Medicine (IOM) https://www.nap.edu/download/18998
26. El Emam K, An Analysis of Anonymization Practices in Initial Data Releases Pursuant to EMA Policy 0070, Applied Clinical Trials, April 13, 2017 http://www.appliedclinicaltrialsonline.com/analysis-anonymization-practices-initial-data-releases-pursuant-ema-policy-0070
27. Chapter 4 - External guidance on the identification and redaction of commercially confidential information in clinical reports submitted to EMA for the purpose of publication in accordance with EMA Policy 0070 http://www.ema.europa.eu/docs/en_GB/document_library/Regulatory_and_procedural_guideline/2016/12/WC500218567.pdf
28. Henry-Eude Anne-Sophie Redaction Consultation Process, Assessment of justification for proposed redactions of commercially confidential information, June 24, 2015, London (webinar) http://www.ema.europa.eu/docs/en_GB/document_library/Presentation/2015/06/WC500188860.pdf
29. Henry-Eude Anne-Sophie Guidance to pharmaceutical industry on redacting commercially confidential information (CCI) in clinical reports, June 24, 2015, London (webinar) http://www.ema.europa.eu/docs/en_GB/document_library/Presentation/2015/06/WC500188857.pdf
30. EMA (European Medicines Agency). 2013. Publication and access to clinical-trial data. www.ema.europa.eu/docs/en_GB/document_library/Other/2013/06/WC500144730.pdf (accessed October 15, 2014).
31. Wathion, N., and EMA. 2014. Finalisation of EMA policy on publication of and access to clinical trial data. http://www.ema.europa.eu/docs/en_GB/document_library/Report/2014/09/WC500174226.pdf (accessed December 16, 2014).
32. O'Reilly E, Decision on own-initiative inquiry OI/3/2014/FOR concerning the partial refusal of the European Medicines Agency to give public access to studies related to the approval of a medicinal product, European Ombudsman, 2016. https://www.ombudsman.europa.eu/en/cases/summary.faces/en/68117/html.bookmark
33. Gøtzsche PC, AbbVie considers harms to be commercially confidential information: sign a petition, BMJ 2013;347:e7569. Data on file.
34. Silverman E, European ombudsman urges regulator to get tough on redacting study data, Stat, 2016, www.statnews.com/pharmalot/2016/06/10/clinical-trial-data-trade-secrets