In January 2015, President Obama unveiled the Precision Medicine Initiative, an audacious research effort to revolutionize how we practice medicine and ultimately improve human health. Nearly one year later, Vice President Joe Biden announced the $1 billion Moonshot to Cure Cancer, aiming to translate advances in genomics into treatments. To bring these initiatives to fruition, we all need to work together across silos and increase access to information.
Open access for sharing genomic data is not a new idea. The completion of the Human Genome Project and the 1000 Genomes Project showed us how broadly sharing the data generated by genomic research can maximize its utility. At DNAnexus, we believe in fostering a culture of openness in genomic research to allow for medical breakthroughs. There are many troves of genomic data, but the mechanism for combining them is far from ideal.
Most data sharing in cancer genomics research has been centralized through rich yet controlled-access databases like The Cancer Genome Atlas (TCGA) or the International Cancer Genome Consortium (ICGC), both of which properly approved researchers can easily access on the DNAnexus Platform. The access restrictions are structured with the worthy goal of protecting the privacy of individuals donating their samples and data to science, since access to genomic data could hypothetically lead to their re-identification. But arguably, by limiting access to these datasets, we are hampering faster progress and greater reach to patients.
In this spirit, an open-access (OA) pilot for freely sharing cancer genomic data was established by a research team at the Human Genome Sequencing Center (HGSC) at Baylor College of Medicine and the Texas Cancer Research Biobank (TCRB). For the first time, genomic sequencing data from seven human cancer cases with matched normal samples are freely available to anyone. Users of the data are simply asked not to attempt to re-identify the participants.
Beyond the dataset itself, the pilot project’s salient contribution is the process developed for participant education and consent. Can cancer patients, with all the physical and psychological challenges they endure and usually without extensive prior background in biology and genetic privacy, give truly informed consent to the benefits and risks of open-access data sharing? The rigorous protocol applied in this pilot indicates that many indeed possess the capacity, and the desire too.
Controlled-access research datasets will remain a reality, and DNAnexus will continue our recognized leadership in cloud security and protection for both research and clinical applications. But the open-access TCRB cases (and, we hope, others like them to come) provide an opportunity for the research community to freely experiment with “real” cancer genomics data, rather than artificial simulations, and to refine methods to better analyze controlled-access cases as well, ultimately advancing cancer research.
DNAnexus is grateful to the anonymous patients and our HGSC colleagues for providing this opportunity. We’re proud to participate in this and other innovative cancer data sharing initiatives, like the exciting public/private partnership with ITOMIC led by the University of Washington’s Tony Blau, and projects unfolding within the Global Alliance for Genomics and Health. Copies of the open-access TCRB data, the conditions of use, and the HGSC’s Mercury informatics pipeline are available now for DNAnexus Platform users.