Integrating Multiple Data Sources to Power Discovery and Analysis

Precision Medicine World Conference (PMWC) took place in January in Mountain View, California, and offered attendees the opportunity to learn about innovative technologies, initiatives, and clinical case studies that are catalyzing the adoption of precision medicine in the clinic. DNAnexus was pleased to host a panel to discuss scalable infrastructure/platforms integrating next-generation sequencing (NGS) and other data (e.g. phenotypic) to power discovery and analysis in pharma and the clinic. Learn more below and watch the panel discussion.

Moderator: DNAnexus – Brady Davis, Chief Strategy Officer


AstraZeneca/MedImmune – David Fenstermacher, VP BioInformatics

Sutter Health – Greg Tranah, Director, CPMC Research Institute, Adjunct Professor Dept. of Epidemiology & Biostatistics, UCSF

Carol Franc Buck Breast Cancer Center at UCSF – Laura Esserman, Director

City of Hope – Sorena Nadaf, SVP & CIO



Health care providers increasingly require multi-omic datasets, including phenotypic data informed by genomic data. Such data need to be obtained in an economically sustainable way and made available on an agile, user-friendly platform so that they can inform clinical care and lead to health improvements.

Pharmaceutical companies are interested in obtaining datasets containing phenotypic/clinical and genomic information generated from patient cohorts of specific disease areas. Such datasets can help pharma researchers identify drug targets or find biomarkers, validate hypotheses related to the interaction of genomics with disease or with specific therapies, and identify candidate populations for future clinical trials. Payers are also interested in the outcomes related to new discoveries and therapies in order to reimburse for these treatments.

This discussion focuses on how healthcare provider organizations, pharmas and payers are working toward solving these complex and challenging problems from a technical and business model perspective.



Partnership with St. Jude and Microsoft – Let’s talk about it at HIMSS 2018

We’re partnering in an exciting new collaboration with St. Jude Children’s Research Hospital and Microsoft to analyze and store half a petabyte of pediatric cancer genomic data. This collaboration will accelerate discoveries and treatments to cure pediatric cancer and other rare diseases by giving researchers and clinicians the ability to collaborate globally and enabling the rapid generation and analysis of genomic data.

DNAnexus, deployed on Microsoft Azure, provides a secure and agile ecosystem in the cloud while simultaneously eliminating security, storage and speed limitations – all of which will enable St. Jude researchers to focus on complex problems on a collaborative, global scale.  

DNAnexus’ strength comes from its agile co-development process. We partner with our customers to solve new big data problems that are continuously evolving. Our team works closely with the St. Jude and Microsoft teams to determine the specific requirements and translate them into tailored solutions. From kick-off meeting to production deployment, it’s a seamless process that helps our customers and collaborators achieve their goals, no matter how ambitious.

With our secure, cloud-based infrastructure and complementary tools, researchers will be able to integrate a multitude of disparate datasets, develop their own tools, and collaborate in a secure environment, enhancing data sharing and accelerating discoveries.

You can read more about how we’ve joined forces to fuel scientific discovery in a joint press release from St. Jude. Microsoft has also written a great blog post where you can learn more about Microsoft Genomics Service and the partnership.

We’ll be at the HIMSS 2018 Conference in Las Vegas, Nevada, from March 5th to March 9th. You can find us at Microsoft’s booth #3832, as part of the larger Microsoft patient journey providing solutions that enable more precise treatment and better patient outcomes.

Visit us at Microsoft booth #3832 and schedule a meeting with our team – email us at

Countdown to AGBT 2018

We can’t wait for the annual Advances in Genome Biology and Technology (AGBT) meeting, taking place February 12-15 in Orlando! We are excited to join hundreds of industry leaders in the sunshine to exchange ideas about the latest advances in DNA sequencing technologies, new approaches to leveraging multi-omic datasets, and their widespread applications in healthcare.

If you’re headed to AGBT, join us for coffee (or a cocktail!) and discussion on how the DNAnexus team can work with you to integrate multi-omic data into your research, discovery, and development pipeline. Email us to schedule a meeting.

DNAnexus Events

Passport Party

Tuesday, February 13th 9:00pm-11:00pm  
Hilton Suite #1865  

Stop by our suite Tuesday night during the Passport Party to celebrate novel scientific developments at DNAnexus from deep learning applications to leveraging multi-omic data in research and development.


Poster #117: Dot: A New Interactive Dot Plot Viewer for Comparative Genomics
Presenter: Maria Nattestad, PhD, Scientific Visualization Lead
Presentation Time: Tuesday, February 13th, 1:00pm-2:30pm

Comparing genome assemblies to genomes of related species is crucial to understanding differences between organisms across the tree of life. Dot plots are excellent tools to visualize genome-genome alignments; however, traditional dot plots are static images that limit detailed investigation. We are excited to present Dot, our interactive dot plot viewer that enables scientists to visualize genome-genome alignments in order to evaluate new assemblies and perform exploratory comparative genomics.
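To make the idea of a dot plot concrete, here is a minimal sketch in Python. It is not how Dot works: Dot visualizes genome-genome alignments, while this toy simply marks every position where two short sequences share an exact k-mer on the forward strand. The function names `kmer_dot_plot` and `render`, and the example sequences, are made up for illustration.

```python
from collections import defaultdict


def kmer_dot_plot(seq_a, seq_b, k=3):
    """Return (i, j) pairs where seq_a[i:i+k] == seq_b[j:j+k] (illustrative only)."""
    # Index every k-mer start position in seq_b.
    positions = defaultdict(list)
    for j in range(len(seq_b) - k + 1):
        positions[seq_b[j:j + k]].append(j)

    # Look up each k-mer of seq_a in that index.
    dots = []
    for i in range(len(seq_a) - k + 1):
        for j in positions.get(seq_a[i:i + k], []):
            dots.append((i, j))
    return dots


def render(dots, len_a, len_b, k):
    """Draw the dots as text: columns follow seq_a, rows follow seq_b."""
    cols = len_a - k + 1
    rows = len_b - k + 1
    grid = [["."] * cols for _ in range(rows)]
    for i, j in dots:
        grid[j][i] = "*"
    return "\n".join("".join(row) for row in grid)


# Hypothetical toy sequences: "GTT" is shared but sits at different offsets.
seq_a = "ACGTT"
seq_b = "GTTAC"
dots = kmer_dot_plot(seq_a, seq_b, k=2)
print(render(dots, len(seq_a), len(seq_b), k=2))
```

In the printed grid, a shared segment appears as a diagonal run of marks, and a segment that has moved appears as a mark away from that diagonal. A static image of such a grid is what an interactive viewer improves on.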

Poster #105: How Well Can We Phase the Diploid Human Genome Using FALCON-Unzip?
Presenter: Chai Fungtammasan, PhD, Scientist
Presentation Time: Tuesday, February 13th, 1:00pm-2:30pm

Long-read sequencing technology has allowed researchers to create de novo assemblies with impressive contiguity, which has increased the number of reference genomes available. As a roadmap to personal genome assembly and phasing, we assess the phasing accuracy of FALCON-Unzip in humans using trio information. We performed a de novo assembly of the son in the Ashkenazi trio using data from the Genome in a Bottle Consortium. We conclude that the FALCON-Unzip algorithm can be used to create long and accurate haplotypes for human genomes, and we characterize the regions where it underperforms to guide future improvement.

Poster: A CLIA NGS Visual Process Monitoring for QC and Analytical Evaluations
Presenter: David Ross, CareDx

Visual data analytics (VDA) are powerful tools to efficiently develop, implement and monitor processes. VDA can rapidly provide deep analytic capabilities. We detail how VDA tools help evaluate and monitor our clinical-grade NGS assay. This assay, AlloSure®, is for the detection of donor-derived cell-free DNA (dd-cfDNA) to measure transplanted organ injury. A higher percentage of dd-cfDNA is indicative of active rejection of the allograft. The bioinformatic pipeline for the analysis is uploaded from MiSeqs into DNAnexus, and then result files are downloaded to local database systems for QC, result presentation to the CLIA lab and further cross-functional analysis. Much of the cross-functional analysis and data exploration is accomplished in Tableau. The Tableau workbooks facilitate analysis of development QC and assay status. The flow of information is immediate, visual and purpose-built for the user group and individual.

Talks Featuring the DNAnexus Platform

Joint Variant Calling on >200,000 Exome Sequences with GLnexus  
Presenter: Mike Lin, PhD, VP Research & Development, DNAnexus

Date/Time: Tuesday, February 13th, 7:30pm-7:50pm
Track: Computational Biology


The vast human cohorts now being sequenced present increasing opportunities for improved genetic variant discovery by leveraging information from a whole cohort to refine conclusions about each individual. We consider the problem of joint calling, where Genomic Variant Call Format (gVCF) data, representing initial variant calls for single samples, are evaluated together to generate a multi-sample project VCF (pVCF). The pVCF provides a matrix of refined and harmonized variant calls for the whole cohort, informed by allele frequencies and error patterns observed therein. In contrast to initial gVCF generation, which is readily parallelized across samples, joint calling into pVCF presents acute scale and representation challenges for modern population sequencing projects. We address these challenges with GLnexus, our new system for joint calling on large cohorts.
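For readers new to the gVCF-to-pVCF idea, the shape of the output can be sketched in a few lines of Python. This is a toy under stated assumptions, not GLnexus: it only lays per-sample calls out as a site-by-sample matrix and skips the allele unification and genotype refinement that real joint calling performs. The function name `merge_to_matrix`, the dictionary layout, and the sample data are all hypothetical.

```python
def merge_to_matrix(samples):
    """Lay out per-sample calls as a site-by-sample matrix (illustrative only).

    samples maps a sample name to a dict with:
      "variants":   {(chrom, pos, ref, alt): genotype string}
      "ref_blocks": [(chrom, start, end)] inclusive ranges called as reference
    """
    # Every site that is variant in at least one sample becomes a row.
    sites = sorted({site for s in samples.values() for site in s["variants"]})
    names = sorted(samples)

    matrix = []
    for site in sites:
        chrom, pos, _ref, _alt = site
        row = []
        for name in names:
            sample = samples[name]
            if site in sample["variants"]:
                row.append(sample["variants"][site])  # sample's own call
            elif any(c == chrom and start <= pos <= end
                     for c, start, end in sample["ref_blocks"]):
                row.append("0/0")  # covered and called as reference
            else:
                row.append("./.")  # no information at this site
        matrix.append((site, row))
    return names, matrix


# Hypothetical three-sample cohort.
cohort = {
    "s1": {"variants": {("chr1", 100, "A", "G"): "0/1"},
           "ref_blocks": [("chr1", 1, 99), ("chr1", 101, 300)]},
    "s2": {"variants": {("chr1", 200, "C", "T"): "1/1"},
           "ref_blocks": [("chr1", 1, 150)]},
    "s3": {"variants": {}, "ref_blocks": []},
}
names, matrix = merge_to_matrix(cohort)
for site, row in matrix:
    print(site, dict(zip(names, row)))
```

Each sample's input can be prepared independently, but every row of the matrix needs information from all samples at once, which hints at why this step is harder to scale than per-sample calling.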

Structural Variation Across Human Populations and Families in >23,000 Whole-Genomes 
Presenter: Will Salerno, PhD, Human Genome Sequencing Center, Baylor College of Medicine  
Date/Time: Tuesday, February 13th, 3:00pm-3:20pm
Track: Plenary Session


While the impact of small variation in well-characterized genomic regions is still being realized, it is clear that clinical-quality understanding of the full spectrum of genetic disease requires accurate assessment of large, complex variants across the entire genome for populations that span phenotypic space, including gender and ethnicity. Such structural variants (SVs) pose specific challenges with respect to detection accuracy, validation, allele reconciliation and the cost of these methods. Here we address these challenges and present the aggregation of multiple SV methods applied to whole-genome sequencing across a large human population and families.

dd-cfDNA, a Transplant Biomarker in Clinical Diagnostics – From Discovery to Clinical Practice  
Presenter: Marica Grskovic, Associate Director, R&D, CareDx
Date/Time: Thursday, February 15th, 3:30pm-3:50pm
Track: Plenary Session