Sentieon on DNAnexus: License-Free Access Through April 7th

Test drive Sentieon’s pipelines on DNAnexus and see how you can achieve faster and more cost-efficient results with equivalent or improved accuracy and consistency

Editor’s Note: This blog post is written by Brendan Gallagher, Business Development Director at Sentieon.

We are excited to officially announce our collaboration with DNAnexus and the availability of our tools on the scalable, cloud-based DNAnexus Platform. At Sentieon, we are enabling precision data for precision medicine by improving upon industry-leading bioinformatics tools. Our suite of secondary analysis tools produces equivalent or improved results while running much faster and more cost-efficiently than the GATK, MuTect, and MuTect2 bioinformatics tools.

For a two-month period, DNAnexus users will have license-free access to test, validate, and use:

  • DNAseq (BWA + GATK 3.5) – From FASTQ / BAM to VCF
  • TNseq (BWA + MuTect or MuTect2) – From FASTQ to VCF
  • Rapid DNAseq (BWA + GATK 3.5) – From FASTQ to VCF in 1 hour

Register today to take advantage of this time-limited and license-free opportunity to access Sentieon pipelines on DNAnexus! General commercial availability begins April 8th.

Sentieon tools produce results equivalent to those of the GATK/MuTect/MuTect2 suite of tools by using mathematical methods identical to those of the Broad Institute’s Best Practice Workflow pipeline. Sentieon has improved the efficiency of the computation algorithms and engineered robust software implementations to speed up the pipeline while providing equivalent or improved accuracy and 100% consistency.

Delivering Better Results

On the same compute infrastructure, Sentieon software is an order of magnitude faster in terms of core-hours, while producing 100% consistent results with no run-to-run difference. Sentieon software does not down-sample in high-coverage regions, enabling rigorous analysis for deep-coverage sequence data. By removing the down-sampling and other run-to-run error sources, Sentieon tools also improve the accuracy and quality of the results.

SPEED

Sentieon DNAseq processes a 30x NA12878 genome from FASTQ to gVCF in ~6 hours and 15 minutes on a single 32-virtual-core instance on the DNAnexus Platform. Sentieon and DNAnexus are also offering a rapid-turnaround distributed version of the app that can complete a 30x genome in approximately 1 hour while adding minimal additional compute cost.
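To express that single-instance runtime in the core-hour terms used earlier in this post, here is a rough Python sketch. The function name `core_hours` is purely illustrative and is not part of any Sentieon or DNAnexus tool. It also assumes all 32 virtual cores are busy for the whole run, which is a simplification.

```python
def core_hours(wall_clock_hours: float, cores: int) -> float:
    """Core-hours used by a job occupying `cores` cores for `wall_clock_hours` hours."""
    return wall_clock_hours * cores


# Figures quoted in this post: ~6 hours 15 minutes on one 32-virtual-core instance.
# Assumption: every core is occupied for the full wall-clock time.
single_instance = core_hours(6 + 15 / 60, 32)
print(f"Approximately {single_instance:.0f} core-hours per 30x genome")
```

The same function can be applied to any other pipeline's wall-clock time and core count to compare runs on an equal footing.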

In our 2015 white paper comparing DNAseq to then-current GATK version 3.3, we showed that Sentieon DNAseq has a runtime improvement of 20-50x while producing identical results. Download the white paper.

The figure below shows the performance of our current version matching BWA-GATK3.5 on the DNAnexus Platform from FASTQ to VCF:

CONSISTENCY

Sentieon is 100% consistent and has no run-to-run variation. You will get the same result every time you run an individual sample.

Sentieon Concordance to GATK – Identical within GATK’s run-to-run difference

The above figure shows the concordance analysis in our white paper. Over 99.8% of the variant calls produced by GATK 3.3 and Sentieon DNAseq were identical. After removing the differences from GATK downsampling, the variant calls were over 99.99% concordant. Learn more.

The same concordance performance is maintained in our current DNAseq version as compared to GATK3.5.
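For readers who want to run a similar comparison on their own call sets, the following is a minimal Python sketch of one way to compute concordance. It is an assumption, not the method used in the white paper: it treats each call as a `(chromosome, position, ref, alt)` tuple and defines concordance as shared calls divided by the union of calls. The function name and the toy variants are made up for illustration.

```python
from typing import Iterable, Set, Tuple

# A variant call reduced to its identifying fields (illustrative representation).
Variant = Tuple[str, int, str, str]  # (chromosome, position, ref allele, alt allele)


def concordance(calls_a: Iterable[Variant], calls_b: Iterable[Variant]) -> float:
    """Fraction of calls shared by both sets, relative to all distinct calls."""
    a: Set[Variant] = set(calls_a)
    b: Set[Variant] = set(calls_b)
    union = a | b
    if not union:
        # Two empty call sets agree trivially.
        return 1.0
    return len(a & b) / len(union)


# Toy, made-up call sets; not real results from either pipeline.
pipeline_a = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"), ("chr2", 42, "G", "GA")]
pipeline_b = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"), ("chr2", 77, "T", "C")]

# 2 shared calls out of 4 distinct calls.
print(concordance(pipeline_a, pipeline_b))
```

A real comparison would also need to normalize variant representation and account for genotype, which this sketch deliberately leaves out.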

COST

With Sentieon, you can process a 30x genome from FASTQ to VCF for the previous price of an exome. DNAseq costs 5x less for whole exome sequencing (WES) and 7x less for whole genome sequencing (WGS) than GATK3 run on the DNAnexus Platform. For cost comparison purposes, please contact sales@dnanexus.com.

But don’t take my word for it.

Sentieon has been a top performer in many independent studies, and last year, we were recognized for accuracy and consistency in the precisionFDA community challenges: precisionFDA Consistency Challenge and precisionFDA Truth Challenge.

Furthermore, Sentieon software enables large cohort joint calls with tens of thousands of whole genomes without intermediate file merging, enabling much easier and much more efficient population-scale studies.

So go ahead, register here to try it out, and let us know what you think. 

Email me anytime at brendan.gallagher@sentieon.com or talk to the DNAnexus team. 

This is the start of a nice partnership as DNAnexus and Sentieon will continue to collaborate on the acceleration and improvement of genomic analysis by providing our customers with the most accurate and cost-conscious tools. We look forward to expanding the tools available on DNAnexus in the future.

Bringing Together Genomics and Patient Data in the Cloud

Please join us Tuesday, February 7, at 10am PT (1pm ET) to hear leading genetics expert, Dr. Jeffrey Reid, Executive Director and Head of Genome Informatics at the Regeneron Genetics Center (RGC), discuss RGC’s integrated approach across genetic trait architectures and phenotypes, the underlying cloud infrastructure that makes the center’s collaboration with multiple institutions possible, and key lessons learned from RGC’s pioneering genomic sequencing study.

Webinar Details
Title: Beyond 100,000 Exomes: Insights & Lessons from Large-Scale Sequencing in the Cloud
Speaker: Jeffrey Reid, Ph.D., Executive Director, Head of Genome Informatics, Regeneron Genetics Center
Date: Tuesday, February 7, 2017
Time: 10:00 AM PT, 1:00 PM ET

Despite growing investment in biopharma research and development, the number of new drugs is not increasing. It is estimated that more than 90% of drugs that enter Phase I clinical trials fail. Among failures in Phase II clinical trials, 51% are due to lack of efficacy and 19% are due to toxicity. These statistics suggest that pre-clinical models may be poor predictors of benefit and, together with data on genetically informed development programs, indicate that human genetics data can substantially improve the likelihood of success for new therapeutics.

Regeneron has a long history of commitment to genetics-based science, and a track record of integrating human genetics into successful development programs, delivering new medicines to patients. Therefore, the company has made substantial investment in the Regeneron Genetics Center, a cloud-based large-scale sequencing and analysis effort supporting Regeneron development programs. The RGC is a natural extension of this decades-long commitment to genetics at Regeneron, integrating large-scale, diverse data types and fostering collaboration with a wide array of stakeholders, including biopharma, healthcare providers, research institutes, and patient advocacy groups.

The Regeneron Genetics Center has sequenced more than 120,000 people so far, and has created one of the world’s most comprehensive genetics databases pairing sequence data and de-identified electronic health records. The RGC research program involves trait architectures and phenotype collaboration across a network of more than 30 research and healthcare provider institutions. Securely and easily sharing data and tools at scale with so many partners is a major challenge. In order to enable frictionless collaboration across these disparate labs, Regeneron selected DNAnexus to provide the cloud-based bioinformatics platform necessary to securely share large-scale sequencing data and tools.

In this presentation, Dr. Reid will explain the RGC vision for genetics-driven drug development, describe the automation and uniquely enabling infrastructure of the RGC, and discuss in detail some of the informatics innovations and early biological insights that have already come out of the RGC’s collaborative efforts.

Leading Genome Research Center Migrates to DNAnexus on Azure

Today we announced that the trusted DNAnexus genome informatics and data management platform is now also available on Microsoft Azure, Microsoft’s open, flexible, enterprise-grade cloud computing platform. Leveraging Azure, DNAnexus provides organizations a single, secure, scalable, and collaborative platform to accelerate the application of genomics within healthcare and research. The Stanford Center for Genomics and Personalized Medicine (SCGPM) is the first organization to access DNAnexus on Azure.

A key advantage to conducting genomic research in the cloud is the enhanced collaboration facilitated by data accessibility, consistency, and scalability. SCGPM researchers already collaborate on the DNAnexus Platform hosted by Amazon Web Services; extending their adoption to DNAnexus on Azure means that researchers can collaborate even more widely. By leveraging DNAnexus on Azure’s powerful data-handling capabilities, a distributed network of scientists and researchers has secure access to terabytes of data through a common user interface.

DNAnexus and Microsoft are both valued partners to Stanford’s core sequencing facility. SCGPM and David Heckerman, distinguished scientist and director of Microsoft Genomics, have been in close collaboration for years. By extending the DNAnexus Platform to Azure, it is now easier for SCGPM researchers to work closely with David’s team. We believe we are just seeing the tip of the iceberg in terms of the potential for medical discovery.

DNAnexus is proud to support SCGPM on its mission to translate genomics into patient-centered medicine, and we look forward to enabling the discoveries that unfold.


Innovation Through Collaboration

Through additional partnerships, Microsoft recently developed computational methods to accelerate the best practices pipeline for genome resequencing sevenfold. By improving the efficiency of the Burrows-Wheeler Aligner (BWA) and Genome Analysis Toolkit (GATK), researchers and medical professionals are able to get actionable results in just four hours, compared to the previous twenty-eight. This is critical for medical professionals to accelerate diagnosis and treatment for patients.
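The sevenfold figure follows directly from the two runtimes quoted above. A minimal Python sketch of the arithmetic, where the helper name `speedup` is illustrative only:

```python
def speedup(baseline_hours: float, accelerated_hours: float) -> float:
    """How many times faster the accelerated run is than the baseline run."""
    if accelerated_hours <= 0:
        raise ValueError("accelerated_hours must be positive")
    return baseline_hours / accelerated_hours


# Runtimes quoted in this post: 28 hours before, 4 hours after.
print(speedup(28, 4))
```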

Genomic sequencing and analysis have become key components of the diagnosis and treatment of cancer and other genetic conditions. This effort has both relied on and stimulated innovative technologies. At DNAnexus, we firmly believe that in order to continue innovating and further break down the technical barriers to disease, community collaboration is essential. The sharing of data and ideas between organizations – and even industries – spurs the innovation critical to medical breakthroughs. Microsoft is a global leader in technological innovation, and by partnering with leading research centers, universities, and the private sector, it is poised to make great contributions to the genomics revolution.

The DNAnexus Platform sits at the forefront of cloud-based data security, compliance, and controlled access. By co-developing with DNAnexus, Microsoft will be able to deploy its tools into an investigative environment while leveraging extensive research experience. We are excited to be collaborating with Microsoft and to make these cutting-edge bioinformatics tools available to the genomics community via the DNAnexus Platform in the future.

Facilitating Collaboration on DNAnexus

The need for enhanced collaboration is a trend in the genomics industry we have been following for a while. DNAnexus equips end-users with out-of-the-box clinical compliance and streamlines communication between healthcare providers, reducing information silos for more efficient collaboration.

However, this notion of partnership goes deeper than groups of scientists working together to parse through datasets. Innovation and exploration are best served through collaboration, so successful innovation in the genomics industry also relies on disparate industries working together towards a common goal. By tapping into the genomics network, members of the community are able to learn from one another to advance research, leading to accelerated medicine and tailored patient care.

DNAnexus is excited about the opportunity to partner with Microsoft, given their commitment to advancing the field of genomics, and their depth and breadth of experience offering solutions to the healthcare industry.