Once in a Blue Moon Competition: precisionFDA Truth Challenge

The FDA, the Global Alliance for Genomics and Health (GA4GH), and the National Institute of Standards and Technology (NIST) recently teamed up to create a once-in-a-blue-moon challenge for genomic scientists! In the precisionFDA Truth Challenge, genomic innovators were invited to test their informatics pipelines on two datasets: the well-characterized Genome in a Bottle (GiaB) reference sample NA12878 (HG001), and a new reference sample, HG002, for which the results were unknown.


PrecisionFDA is an online, cloud-based, virtual research space where members of the genomics community can experiment, share data and tools, collaborate, and define standards for evaluating analytical pipelines. Community members span academia, industry, healthcare organizations, and government. All of these organizations are working together to further innovation and develop regulatory science around NGS tests. The community currently includes more than 1,500 users across 600 organizations, with more than 10 terabytes of genetic data stored.

This is the second challenge issued through precisionFDA, following the precisionFDA Consistency Challenge. The Truth Challenge assesses the consistency and accuracy of informatics pipelines when they analyze a human sample whose truth data is unknown. NIST and GiaB released the truth data on May 26, 2016, after the close of the challenge.

What makes this challenge so exciting?

In 2014, NIST, in collaboration with GiaB and the FDA, released NA12878, the first gold-standard whole human reference genome. Since then, it has arguably become one of the most studied biospecimens. Researchers from around the world use NA12878 as training data for assessing pipeline performance.

Many pipelines use some form of machine learning to decide whether a reported mutation is real, so the difficulty is ensuring that a pipeline doesn’t overfit the training data. A pipeline can be tuned to maximize performance on the training dataset, and if the test data happens to be similar to the training data, the pipeline will appear more consistent and accurate than it really is. A great resource for understanding why scientists split data into training and test sets to assess the accuracy, reliability, and credibility of their predictive models (the algorithms that go into a pipeline) can be found here.
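To make the overfitting concern concrete, here is a minimal sketch in Python. It is not taken from any real pipeline: the quality scores, the two samples, and the helper names `accuracy` and `tune_threshold` are all hypothetical, and a single quality cutoff stands in for the far richer models that real variant callers use.

```python
def accuracy(threshold, calls):
    """Fraction of candidate calls classified correctly when every call
    with quality >= threshold is reported as a real variant."""
    correct = sum((quality >= threshold) == is_real for quality, is_real in calls)
    return correct / len(calls)


def tune_threshold(calls):
    """Pick the quality cutoff that maximizes accuracy on the given calls."""
    candidates = sorted({quality for quality, _ in calls})
    return max(candidates, key=lambda t: accuracy(t, calls))


# Hypothetical (quality score, is the variant real?) pairs for two samples.
training_sample = [(10, False), (20, False), (30, True), (40, True), (50, True)]
held_out_sample = [(10, False), (20, False), (30, False), (35, True), (50, True)]

# Tune on the training sample only, then score both samples.
threshold = tune_threshold(training_sample)
train_acc = accuracy(threshold, training_sample)
test_acc = accuracy(threshold, held_out_sample)

print(f"cutoff={threshold} training accuracy={train_acc} held-out accuracy={test_acc}")
```

In this toy setup the cutoff that looks perfect on the training sample makes a mistake on the held-out sample, which is the gap that an independent test sample is meant to expose.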

To test the performance of pipelines in real-life conditions, scientists needed a second reference sample and associated truth callset on which NGS pipelines have not been trained. This is exactly what NIST and GiaB have provided in reference sample HG002.

Scientists can now evaluate algorithms using test data that is separate from the training data, a separation that is broadly accepted as fundamental to sound evaluation methodology. Moreover, unlike NA12878, the new reference sample HG002 is male, which poses new challenges to algorithms since there is only one copy of the X chromosome, and brings new opportunities for evaluating NGS methods along this dimension.
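As a rough illustration of what evaluating a pipeline against a truth callset involves, the sketch below counts agreements and disagreements between a set of called variants and a truth set, then derives precision, recall, and F-score. Everything in it is an assumption made for illustration: the function name `compare_to_truth`, the `(chromosome, position, ref, alt)` tuple representation, and the example variants are hypothetical, and real benchmarking tools also have to reconcile different representations of the same variant and restrict the comparison to confidently characterized regions.

```python
def compare_to_truth(called, truth):
    """Compare called variants to a truth set by exact match and return
    counts plus precision, recall, and F-score."""
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # called and present in the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # in the truth set but missed by the pipeline
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f_score": f_score}


# Hypothetical variants as (chromosome, position, ref allele, alt allele).
truth_calls = [("chr1", 1000, "A", "G"), ("chr1", 2000, "C", "T"),
               ("chr2", 1500, "G", "A"), ("chrX", 3000, "T", "C")]
pipeline_calls = [("chr1", 1000, "A", "G"), ("chr1", 2000, "C", "T"),
                  ("chr2", 1500, "G", "A"), ("chr2", 9000, "A", "C")]

metrics = compare_to_truth(pipeline_calls, truth_calls)
print(metrics)
```

Here the hypothetical pipeline finds three of the four true variants and reports one call that is not in the truth set, so precision and recall both come out at 0.75.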

The winners

As the clock struck midnight EST on May 25, 2016, the precisionFDA Truth Challenge closed with 36 entries across 21 teams, spanning 5 countries: truly an international competition of epic proportions!

The winners of the Truth Challenge will be announced at the upcoming Festival of Genomics in Boston on June 29th at 8:45am EST by Elizabeth Mansfield, PhD, Deputy Director for Personalized Medicine in FDA's Center for Devices and Radiological Health's Office of In Vitro Diagnostics and Radiological Health.

Want to recreate the Truth Challenge for yourself? Join the precisionFDA community today and evaluate a pipeline of your choice against HG002.

Frost & Sullivan Recognizes DNAnexus as the Enabling Technology Leader in the Global Genomics Industry

At DNAnexus, we’re honored to be recognized by Frost & Sullivan as one of the most significant enabling technologies driving industry-changing innovation within the genomics community globally. Frost & Sullivan research involves extensive primary and secondary research across the entire value chain of specific products. Against the backdrop of this research, Frost & Sullivan evaluated DNAnexus on two key factors (Technology Leverage and Customer Impact) and compared DNAnexus to other industry players. DNAnexus’ overall rating was significantly higher than the other industry players, resulting in the Best Practice recognition.


Solve for Scale

As next-generation sequencing (NGS) projects take off with the goal of sequencing tens or hundreds of thousands of genomes, there is a crucial need for a solution that can handle this tremendous amount of data being produced.

As DNAnexus’ own VP of Science, Andrew Carroll, puts it, “Life science companies are missing a management system for dealing with petabytes of data and billions of objects. There are challenges of operating at scale – it’s not that difficult to do something that will work once or a hundred times, but when it comes to have the same system work hundreds of thousands or millions of times, there are a lot of random errors and other lower-level problems that turn out to be a big deal.” You can read more on why life sciences and genomics markets are finding cloud approaches more appealing here.

These companies are in need of an efficient way to upload, store, share, and analyze the increasingly massive amounts of data being generated. At DNAnexus, we’re harnessing the power of the cloud to do just that. We are committed to innovation in this area, powering industry and academia as they continually aim to take on more genomic data.

Secure Solutions

Because it’s uniquely tied to an individual, genomic data can be regarded as some of the most sensitive data collected. As organizations seek to make advancements in this field, they need to ensure the sensitivity of genomic information is preserved.

DNAnexus offers layers of platform features and accreditation to support an exceptionally strong security and compliance profile. DNAnexus is continually audited and certified by an independent third party for compliance with ISO 27001, an internationally recognized Information Security Management System and accompanying controls, enabling DNAnexus to address a broad range of global compliance regimes. These include HIPAA, CAP/CLIA, GCP/GLP, dbGaP, FedRAMP, and European data privacy laws.

Collaborative Community

Frost & Sullivan acknowledges the collaborative nature of our platform as one of its key strengths. A dataset is only as good as the ability to access it. Additionally, as sequencing projects scale, so can the number of organizations working together to draw conclusions.

DNAnexus gives a network of collaborators in the genomics community the ability to share, transfer, access, integrate, and analyze this data, all securely and compliantly within the cloud. DNAnexus is the platform of choice for leading academic core labs and consortia, including the Stanford Center for Genomics and Personalized Medicine, ENCODE, 3000 Rice Genomes Project, precisionFDA, and more.

Poised for Success

Frost & Sullivan recognizes our commitment to solve some of the biggest challenges facing the field of genomics. DNAnexus, and its broad network of partners, provides pharmaceutical and biotech companies, global diagnostic test providers, genome centers, and sequencing service providers secure and compliant infrastructure and scientific support to solve today’s genomic challenges faster and more effectively than ever before. The company is the platform supporting some of the largest genomic sequencing projects in the world, such as Regeneron Genetics Center, and the underlying platform for precisionFDA, the FDA’s forward-thinking initiative to evaluate the accuracy and reproducibility of next-generation sequencing bioinformatics pipelines.

We’ve been at the forefront of employing the cloud as the ideal platform for genomics and scientific computing and we’re excited to continue our work to create the global network for genomics. You can read the full Frost & Sullivan report here.

precisionFDA: Why It Matters

I hadn’t intended to write about precisionFDA going live – this post by Dr. Taha Kass-Hout and Elaine Johanson of the FDA provides a terrific summary, and this post by Angela Anderson of DNAnexus offers valuable additional context. However, I found myself today so excited by this project and what it represents that I can’t resist offering a few additional thoughts about what makes this initiative so special.

First, it addresses an important problem in the field: the analytic validity of NGS tests. The ready availability of relatively inexpensive sequencing has enabled us to contemplate diagnostic sequencing at a scale that would have been difficult to imagine even a decade ago. At the same time, the drive to apply sequencing in different clinical contexts raises a critically important question: do I trust this test? A key starting point for clinical interpretation of DNA data is to agree on the sequence itself. If your procedure and analysis report that a particular sequence in a DNA sample is “GATCGATC” and my procedure and analysis of the same DNA say the sequence is “GATTGATC,” then we’ve got a problem. precisionFDA will allow users to compare approaches, figure out what’s working, and determine where refinements might be needed.
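The disagreement described above can be located with a few lines of Python. This is only a toy sketch: the function name `mismatch_positions` is hypothetical, it assumes the two reported sequences have the same length and are already aligned, and real comparisons between pipelines work on variant calls rather than raw strings.

```python
def mismatch_positions(seq_a, seq_b):
    """Return (1-based position, base in seq_a, base in seq_b) for every
    position where two equal-length sequences disagree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [(i + 1, a, b)
            for i, (a, b) in enumerate(zip(seq_a, seq_b))
            if a != b]


# The two reported sequences from the example in the text.
differences = mismatch_positions("GATCGATC", "GATTGATC")
print(differences)  # [(4, 'C', 'T')]
```

The two results agree everywhere except the fourth base, where one procedure reports C and the other reports T.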

Second, precisionFDA represents a novel and forward-thinking approach to regulation. Rather than envisioning governmental regulators as the folks who will define and then impose a specific set of performance standards, precisionFDA instead sees the government as providing the platform that will enable the NGS community to evolve the standards on their own — organically and transparently.

Finally, the ability to design, refine, and deploy this platform in such a rapid and agile fashion reflects in part the value of well-conceptualized public-private partnerships, in this case between the FDA and DNAnexus. By intentionally leveraging the skills and capabilities of a company like ours, the FDA was able to implement and realize their exciting and ambitious vision.

The ultimate success of the precisionFDA platform will of course depend upon how well it serves the community it is intended to support. However, it’s hard to think of a more auspicious beginning, and my hope would be that success here will encourage more leaders to evaluate the potential of public/private partnerships to deploy platforms that leverage the power of a distributed innovation community to address important shared challenges.