100% Cloud-based Genome Center Integrating Large Healthcare Data Flows

photo: The Cancer Genome Atlas

In a previous post, our new CMO, David Shaywitz, talked about his vision for DNAnexus and its role in helping fulfill the promise of genomic medicine:

“DNAnexus represents a natural home for these aspirations, offering a compelling, secure, cloud-based data management platform, an enabling tool for any healthcare organization – academic medical center, healthcare system, biopharma company, payor – who recognizes that getting a handle on large healthcare data flows is rapidly becoming table stakes, and that figuring out how to manage and leverage genomic data is a wise place to start.”

Fast-forward two months…  This week, we announced exciting progress in our efforts to accelerate genomic medicine.  The DNAnexus cloud-based genome informatics and data management platform is powering a number of collaborations between the Regeneron Genetics Center (RGC) and its leading healthcare provider partners.

In a press release, the RGC announced these new collaborators, which include the Geisinger Health System, Columbia University Medical Center, the Clinic for Special Children, and Baylor College of Medicine. The RGC will be using the DNAnexus platform to integrate sequencing data with de-identified clinical records from patient volunteers. To date, the RGC has sequenced samples from more than 10,000 individuals and is currently sequencing more than 50,000 samples per year.

The Geisinger collaboration, which has been described as the largest clinical sequencing project in the U.S., is on track to sequence more than 100,000 patient volunteer samples. This DNAnexus-powered initiative has resulted in the first 100% cloud-based biopharma genome center, and is now operating at scale.

Next-generation sequencing technologies, like Illumina’s HiSeq 2500 or X Ten platform, have reduced the cost and increased the speed of DNA sequencing, outpacing Moore’s Law to the point where the new bottleneck is genome informatics. To address this issue, companies like Regeneron are adopting cloud-based solutions to handle the massive volume of sequencing data.

DNAnexus provides the technology backbone that enables the RGC and its healthcare collaborators to share and manage data and tools around large volumes of sequencing data. Currently the RGC is processing more than 1,000 exomes per week and sharing the data easily and safely with its collaborators.

In order to improve patient care and ultimately human health, the integration of genomic and phenotypic data needs to happen on a massive scale (something David has recently discussed from the perspective of phenotype here and here). Combining large cohorts of deeply-phenotyped individuals with their genomic data offers a wide range of medical applications, the most obvious being a more personalized approach to medical interventions, such as determining which therapy might work best for a given individual. These data can also be used to aid in the development of new companion diagnostics and in clinical trial participant selection. As an article in GigaOM put it this week: Cloud Computing is Coming for Your DNA, and it Will Lead to Better Drugs and Health Care.

These collaborations are powerful examples of how the DNAnexus platform is enabling an integrated approach between biopharmaceutical companies and their partners to accelerate the research and discovery process. As David said, healthcare industry leaders who prioritize the management of large healthcare data flows will emerge as the pioneers who help us realize the full vision of precision medicine – delivery of the optimal therapy to the right patients at the right time – ideally before they are sick.

Towards Fulfilling The Promise Of Genomic Medicine

David Shaywitz

I was in eleventh grade when I first discovered The Eighth Day of Creation, Horace Freeland Judson’s wonderful, eloquent, deeply inspiring account of the history of DNA and the origins of molecular medicine.  I’ve been hooked ever since.

My choice of college was driven by the opportunity to study with many of the key players in the story – and sealed when a biochemistry department tour guide pointed to what he said was the exact centrifuge used in the famous Meselson-Stahl experiment.

It was thrilling to learn about molecular biology, and even more exciting to become a molecular biologist, learning how to extract and recover DNA, how to splice it, and how to sequence it.

I spent my first year in graduate school studying tumor suppressor genes, then found myself seduced by the elegant power of yeast genetics, and chose this area for my thesis research.

When I returned to medical school and entered the clinic, I was surprised by what I discovered.  While medicine is often regarded as applied science, I was reminded every day just how limited and fragile our understanding of health and disease really is.  The molecular basis of illness often escapes us, and even in the relatively rare instances where the biology is well understood, a cure can be difficult to come by – consider sickle cell disease (considered the first illness defined at a molecular level), for instance, or cystic fibrosis.

Along with co-authors Dennis Ausiello and Joseph Martin, I wrote in 2000,

“Physicians and physician-scientists have become increasingly concerned with ensuring that the tremendous advances they have seen in basic science find expression in clinical practice.  While an understanding of the genetic basis of disease allows us to consider the development of molecular therapies, we have learned not to underestimate either the magnitude of this undertaking or the extent of preparation required.  Indeed, this endeavor is much more difficult than most have anticipated.

As Goldstein and Brown recently noted, paraphrasing Magritte, ‘a gene sequence is not a drug,’ and although the development of rational therapy for a disease may require an understanding of its molecular basis, the path from mechanistic understanding to clinical treatment is often difficult to define and hard to predict.  Proteins often behave differently in test tubes than in cells, and cells behave differently in culture than as part of a vital organism.  Finally, a patient’s experience of disease reflects more than simply an underlying biological defect.  It is, to quote Eric Cassell, ‘a process inextricably bound up with the unfolding story of this particular patient.’  Thus, the critical question we are now struggling with as physicians and physician-scientists is how to avail ourselves of the advances in molecular biology without losing sight of our primary goal – the care and treatment of our patients.”

In the fourteen years since this was written, our tools have changed, our computational power has improved exponentially, the volume and velocity of data have grown at almost unimaginable rates, yet the fundamental challenge remains the same: how can we translate promising science into improved patient care?

More generally, how can we harness large data flows to improve the human condition? I believe this represents the central scientific challenge — and greatest scientific opportunity — faced by our generation.

The starting and ending point for this vision must be patients.  Data collected from a patient belong to the patient.  With appropriate permissions and rigorous safeguards, these data can be shared in de-identified fashion, in an effort to accelerate the sort of rapid knowledge turns that Andy Grove has insightfully discussed, and that Josh Sommer of the Chordoma Foundation so poignantly described at the first Sage Bionetworks Congress.

Progress will require brave exploration — memorably championed by Pixar’s Ed Catmull — combined with seamless collaboration, so that, as Cloudera’s Jeff Hammerbacher says, everyone can “party on the data.”

Genomic sequencing data represent a natural foundation, not only providing an essential “parts list” (as Eric Lander has nicely described it), a sense of what you are starting with, but increasingly providing dynamic visibility into important clinical pathophysiology, such as the evolving molecular characteristics of circulating tumor cells, or treatment-resistant bacteria or virions.

I’m especially excited by the opportunities at the intersections of large data streams, loci that, like overlapping academic disciplines, promise to be especially rich sources of novelty and insight.

DNAnexus represents a natural home for these aspirations, offering a compelling, secure, cloud-based data management platform, an enabling tool for any healthcare organization – academic medical center, healthcare system, biopharma company, payor – who recognizes that getting a handle on large healthcare data flows is rapidly becoming table stakes, and that figuring out how to manage and leverage genomic data is a wise place to start.

I’m excited – and feel inordinately privileged – to join the DNAnexus team, and to work with passionate colleagues from throughout the healthcare system, in both the private and the public sector, and explore together how we can move from genome to value, from data to impact, from information to cure.


David Shaywitz, MD, PhD is the Chief Medical Officer of DNAnexus, and the co-author, with Lisa Suennen, of Tech Tonics: Can Passionate Entrepreneurs Heal Healthcare With Technology? (Hyperink Press, 2013).  You can follow him on Twitter: @dshaywitz.