The Rising Tide of Genomic Data Points to the Cloud

No other market segment has felt the profound impact of the cloud more than the life sciences industry. In March, a major roadblock was eliminated when the National Institutes of Health lifted its ban on the use of government datasets (dbGaP, TCGA, etc.) in the cloud and updated its security best practices white paper. In the past, individual researchers would download data hosted at a variety of locations, attempt to integrate their own data, and run analyses on their local hardware, a time-consuming endeavor fraught with headaches. This approach has become unsustainable: data sizes have grown exponentially as the cost of genome sequencing has been driven down. There is now a collective push within the genomics research community to create a cloud commons, something in which DNAnexus wholeheartedly believes.

Just how much data are we talking about? According to recent research, Earth contains around 5.3 × 10^37 DNA base pairs. The authors add: “By analogy, it would require 10^21 computers with the mean storage capacity of the world’s four most powerful supercomputers (Tianhe-2, Titan, Sequoia, and K computer) to store this information.”
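To put those two numbers side by side, here is a rough back-of-the-envelope sketch. It assumes each base pair is packed into 2 bits (one of A, C, G, or T) with no compression or metadata; that encoding is our assumption for illustration, not necessarily the one the researchers used, and the function name is hypothetical.

```python
# Back-of-the-envelope check of the figures quoted above.
TOTAL_BASE_PAIRS = 5.3e37   # estimate quoted in the text
COMPUTERS = 1e21            # number of machines quoted in the text
BITS_PER_BASE_PAIR = 2      # ASSUMPTION: A/C/G/T packed into 2 bits, nothing else stored


def implied_bytes_per_computer(total_bp=TOTAL_BASE_PAIRS,
                               computers=COMPUTERS,
                               bits_per_bp=BITS_PER_BASE_PAIR):
    """Storage each machine would need if the base pairs were split evenly."""
    base_pairs_each = total_bp / computers
    return base_pairs_each * bits_per_bp / 8  # 8 bits per byte


per_machine = implied_bytes_per_computer()
print(f"{per_machine / 1e15:.2f} PB per computer")
```

Under that assumption, each of the 10^21 machines would have to hold on the order of 13 petabytes.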

Fortunately, no one has been asked to manage the DNA for our entire planet’s biomass, but the research points to a very real challenge. With this rising tide of data comes the need for more computational resources, and the question of whether to build or buy infrastructure comes into play. A recent article in The Platform takes a fascinating dive into how the genomics community is using life sciences clouds. In her article, author Nicole Hemsoth (@NicoleHemsoth) observes that “what life sciences companies are missing is a management system for dealing with petabytes of data and billions of objects.”

She goes on to discuss how, as data management, storage, compute, security, and compliance features mature and harden, bursting into the cloud becomes the most efficient way to use local and cloud resources together. While most large-scale genome centers have their own local infrastructure, their workloads tend to occur in spikes. To avoid overprovisioning, genome centers are finding their sweet spot by combining in-house infrastructure with bursting into the cloud. And with the advent of genomics cloud solutions such as DNAnexus, there are ways to seamlessly extend workloads into the cloud.
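The bursting idea can be illustrated with a minimal sketch. Everything here is hypothetical: the demand figures, the capacity number, and the `split_workload` helper are made up for illustration and do not describe any DNAnexus API or any real genome center.

```python
def split_workload(demand, local_capacity):
    """Split one period's demand (core-hours) into (local, burst-to-cloud)."""
    local = min(demand, local_capacity)
    burst = max(0, demand - local_capacity)
    return local, burst


# HYPOTHETICAL weekly demand in core-hours, with two sequencing spikes.
weekly_demand = [400, 350, 1200, 300, 2500, 450]
LOCAL_CAPACITY = 500  # HYPOTHETICAL in-house capacity per week

plan = [split_workload(d, LOCAL_CAPACITY) for d in weekly_demand]
total_burst = sum(burst for _, burst in plan)

# Capacity that would sit idle if the cluster were sized for the peak week.
idle_if_sized_for_peak = sum(max(weekly_demand) - d for d in weekly_demand)

print("burst to cloud:", total_burst, "core-hours")
print("idle if sized for peak:", idle_if_sized_for_peak, "core-hours")
```

In this toy scenario the local cluster handles the routine weeks, and only the excess from the two spike weeks goes to the cloud, rather than buying hardware sized for the busiest week.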

Another notable trend we’re seeing is life science companies like Regeneron Genetics Center moving to a 100% cloud-based solution. Instead of making the heavy investment that comes with managing and maintaining IT infrastructure, companies can invest in intellectual property, shifting their focus to R&D to accelerate medical discovery.

While freeing up the bandwidth otherwise spent building and maintaining local hardware is a big part of the cloud’s appeal, the real reason institutions decide to go with DNAnexus is its genomics platform’s state-of-the-art compliance and security measures. It’s true that Amazon Web Services offers a lot of built-in features to ensure security and privacy, and that potentially any skilled engineer can spin up Amazon EC2 instances themselves, but when handling personal health information it’s just not that simple. DNAnexus has invested a tremendous amount of resources in creating a genomics platform that complies with ISO 27001 international security standards and provides data provenance to certify that all operations can be tracked and reported for up to 6 years.

Just as there isn’t one way to genotype, there isn’t one way to take your data to the cloud. The field is constantly evolving, which means you’re constantly doing variant call bake-offs, working with many different tools to assess whether you are getting the correct variants of interest. The important question to ask is how you will manage all these diverse data and research requirements. Do you want to do it yourself, or go with a proven genomics platform that offers a complete set of systems already in place to control and manage your data? DNAnexus can help. Drop us a line when you’re ready.

DNAnexus Expands its Global Network for Genomic Medicine to China

It’s official – DNAnexus is expanding its cloud platform to China. A $15 million strategic investment and alliance with WuXi PharmaTech will bring cloud-based genomics to China and Chinese genomics to the world. We’ve been steadily building the global network for genomic medicine, and now we can serve the Chinese life science market.

With our leading genome informatics and data management cloud technology, DNAnexus is connecting WuXi NextCODE’s sequence data analysis suite and WuXi PharmaTech’s global, open access drug discovery and development services on a single platform. You can read more about our announcement in:

 

Why China?

China currently holds more than 20% of the world’s sequencing capacity. With a population approaching 1.4 billion and an ability to test and enhance new ideas, it represents the largest market in the world for next-generation sequencing. DNAnexus customers are global in scope and needed a China cloud solution to support their efforts. Pharmaceutical customers using contract research services in China will now have a seamless, end-to-end, HIPAA-compliant platform to expand clinical research with collaborations and datasets and to speed the development and delivery of DNA-based diagnostics.

This strategic alliance provides not just a China solution, but also a global solution. It unites the leading technologies to enable any company or institution to store and interpret their sequence data and collaborate with colleagues around the world through a single platform. For the first time, users will be able to use their genomic data seamlessly in tandem with the open access capability and technology (e.g. diagnostic test validation or FDA submission services) that WuXi offers to the global pharmaceutical and medical device industry.

Empowering Virtual Diagnostics

China is just one piece of the puzzle, albeit a big one. We envision this alliance fueling innovation, transforming large-scale sequence data business models, and laying the groundwork for virtual diagnostics enterprises to develop and deploy clinical and companion diagnostics in the cloud. By connecting the compliant, cloud-based DNAnexus bioinformatics platform with WuXi’s leading genomics and R&D services, companies are able to focus on their intellectual property and on the development of algorithms and pipelines. WuXi and DNAnexus will facilitate groundbreaking virtual test development and deployment on a global basis, without the need for capital investment in test development and compute infrastructure.

Yes, we are scaling up our engineering team!

We are looking for smart, motivated people. Leave your lab coat at home. Our core is building great software, the technology that powers our genomics data platform. Learn more about career opportunities at DNAnexus.

We are at the forefront of precision medicine, bringing together proven informatics for population-scale genome sequence data, the latest secure cloud technology, and the global reach of the Internet.

Obama’s Precision Medicine Initiative: DNAnexus is There

Last week, President Obama held a meeting unveiling details about the Precision Medicine Initiative, an audacious research effort to revolutionize how we practice medicine and ultimately improve human health. At the center of this bold new initiative lies a huge new biobank containing electronic medical records and genetic information on more than a million Americans. Our very own Chief Medical Officer, David Shaywitz, joined the other personalized medicine stakeholders at the White House to weigh in as President Obama made the historic announcement. You can read David’s own firsthand account of his visit to the White House here.

Developing cures for complex diseases is incredibly complicated, and the President’s initiative requires long-term vision. Already, the underlying sentiment seems to be that genomic medicine is a reality today, at least in the case of cancer, where targeted therapies are becoming increasingly common. But the realization of a more complete understanding of human genetics, one that will drive discovery and improve human health, requires deep, accurate, and accessible integration of genomic and phenotypic data from millions of people.

The development of a US biobank will require three distinct executional elements: creating, integrating, and analyzing complex data sets. Each of these elements presents unique and difficult challenges, but experience tells us that none is impossible. President Obama’s Precision Medicine Initiative calls for national implementation of solutions very similar to those developed by DNAnexus in collaboration with Regeneron Genetics Center, Geisinger Health System, Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE), Baylor College of Medicine’s Human Genome Sequencing Center (HGSC), and other partners.

 

Robert Plenge, Heidi Rehm, David Ledbetter, Robert Nussbaum, waiting to enter White House (Photo: D. Shaywitz)

A cloud-based genome informatics and data management platform like DNAnexus combines state-of-the-art security with fluid data sharing among researchers, providing a collaborative environment that facilitates and promotes insight and discovery. And that’s exactly what the White House is betting on. These are exciting times, and we are thrilled to be participating, alongside our partners, on the front lines of innovation and policy.