In the wake of the Snowden revelations and the ensuing NSA scandal, data privacy has come under a spotlight. Against this backdrop, the EU is developing a new set of privacy regulations, while in the UK a pilot program for collecting and sharing health data has come under intense scrutiny.
The proposed EU General Data Protection Regulation (GDPR) is currently working its way through the EU’s legislative process, and most of the proposal has been well received. However, a group of researchers, clinicians, and scientists is up in arms over amendments adopted by the European Parliament Committee on Civil Liberties, Justice and Home Affairs (LIBE). They argue that although the proposal as a whole is a welcome remedy, these specific changes could have a drastic impact on research. Specifically, Article 81 Paragraph 2 allows “processing of personal data concerning health” for research purposes only with the consent of the individual, and could preclude the broad consent for unspecified research that many biobanks currently rely on. This provision could severely limit the use of data for large population-based epidemiological studies. Many researchers argue that the previous safeguards, oversight and approval by bioethics committees, were sufficient to protect individual privacy in cases where specific consent was impractical. However, this legislation comes at a time when many people demand greater control over their personal health data, and both governments and research institutions must respond accordingly.
There is an exception that exempts research of “exceptionally high public interest” from the consent requirement. But what meets this standard is vaguely defined, and the determination is left to each Member State. This would in effect create a patchwork of exceptions to the GDPR, defeating the “one continent, one law” goal, at least for health research. Furthermore, even for exempt research the GDPR requires strict anonymization where possible, or at least pseudonymization where it is not. These requirements may seem reasonable, but they could make much research impracticable. One study has shown that pseudonymization could have a real impact on the results of studies that use such data. Specifically, pseudonymization creates a risk of missed linkages between datasets. The analysis showed that if 5% of linkages between cancer registries and death registries (used to calculate survival rates) are missed, the survival rate is overestimated by a significant 10%. Research institutes from across the EU have banded together to call for a discussion about the proper balance between privacy and research utility, and about what is in the best public interest in the long run.
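The missed-linkage problem can be illustrated with a small sketch. It assumes pseudonymization is done with a keyed hash (HMAC-SHA256), which is one common approach but not necessarily the one used in the study cited above. The key, the identifiers, and both registries are invented for illustration, and the numbers are not meant to reproduce the study's 5% and 10% figures. The point is only the direction of the bias: a death record that fails to link is counted as a survivor.

```python
import hashlib
import hmac

# Assumption: both registries pseudonymize with the same secret key.
SECRET_KEY = b"hypothetical-shared-key"


def pseudonymize(identifier: str) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def survival_rate(patients, deaths):
    """Share of patients with no matching record in the death registry."""
    dead = set(deaths)
    survivors = [p for p in patients if p not in dead]
    return len(survivors) / len(patients)


# Invented toy data: four patients in a cancer registry.
cancer_registry = ["AB123", "CD456", "EF789", "GH012"]

# Invented death registry: two of those patients died, but one identifier
# was recorded in lowercase.
death_registry = ["AB123", "cd456"]

# With raw identifiers, the inconsistency can still be repaired at linkage time.
true_rate = survival_rate(
    [p.upper() for p in cancer_registry],
    [d.upper() for d in death_registry],
)

# With pseudonyms, "CD456" and "cd456" hash to unrelated values, so the
# link is missed and the deceased patient is counted as a survivor.
pseudo_rate = survival_rate(
    [pseudonymize(p) for p in cancer_registry],
    [pseudonymize(d) for d in death_registry],
)

print(f"survival with full linkage:   {true_rate:.2f}")
print(f"survival with missed linkage: {pseudo_rate:.2f}")
```

Normalizing identifiers before hashing would remove this particular mismatch, but errors such as typos or transposed digits cannot be corrected once only the pseudonyms are available.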
Meanwhile, the National Health Service (NHS) of England has introduced care.data to widespread criticism and concern. It is designed as a massive program to collect, store, and distribute data for both research and commercial purposes. The NHS planned to integrate data collected by local general practitioners into a national database that could then be mined and analyzed. Launching this plan in the current climate of privacy concerns would require a delicate touch. Instead, the rollout has been plagued with missteps and is on hold indefinitely while the plan is reevaluated. To begin with, data collection was based on automatic enrollment, with patients included unless they opted out, a choice that maximizes the data collected but risks public objection. In an attempt to inform the public, a mailing was sent to all households in England. However, only one third of patients recalled receiving any pamphlet, according to a BBC poll, and there was widespread misinformation about what was being collected and how it would be used. Next, concerns were raised about the prospect of selling the data to commercial third parties. The law was quickly amended to preclude this, with the NHS promising not to release the data unless there was a clear public benefit. Poor communication and the resulting opposition led the NHS first to push back the expected rollout date and eventually to capitulate by removing all hard deadlines, saying that the program will launch whenever it is ready.
The public outcry over this program was swift and decisive because the concern for health privacy is real. However, regulation that seeks to protect privacy at all costs is short-sighted. Technology provides a treasure trove of data that is waiting to be collected and analyzed, whether in epidemiological studies or in a program like care.data. As a society we must decide how much we value health data privacy, through an open dialogue about the risks and benefits. A balance must be struck, and an important factor in the calculus will be the trust the public is willing to place in governments or other bodies to hold our data securely, trust which may be in serious doubt after the Snowden revelations.