Updated DNAnexus Impact Assessment for Cloudbleed: No evidence of exploitation.

As described in our February 27, 2017 blog post regarding the Cloudflare information leak (“Cloudbleed”), a bug within the code running on Cloudflare edge servers was discovered by a Google security researcher.

Upon further investigation into our use of Cloudflare, we found on February 27th at 2:39 PM PST that, contrary to what we had indicated in our blog post, HTTP requests to platform.dnanexus.com served by Cloudflare edge servers in some cases included session tokens with authentication information. We revoked all customer session tokens at 5:06 PM PST that same day, at which point all requests to DNAnexus required re-authentication. All existing tokens were unusable after this time.

On February 23rd Cloudflare provided their most recent update and stated that there was no evidence of exploitation; there have been no updates since that deviate from this information. Additionally, Cloudflare has completed analysis of edge server log data, and on March 3rd confirmed that platform.dnanexus.com was not found to have been impacted.

Our CDN usage design has been reviewed and we continue to believe no customer has been impacted by the incident. Any potential new exposure has been eliminated and there continues to be no evidence of exploitation.

We know how critical information security is to our customers, so if you have any questions about your account, please do not hesitate to contact our customer support team at support@dnanexus.com.

Case Study: Trio Analysis with Sentieon Rapid DNAseq on DNAnexus

Editor’s Note: This blog post is written by Don Freed, Bioinformatics Scientist at Sentieon. Email him at don.freed@sentieon.com. 

Introduction

At Sentieon we work hard to create the most efficient, accurate and robust tools for variant calling. Thanks to our partnership with DNAnexus, we are sharing the benefits of this hard work with you. Through April 7th, we are offering license-free access to Sentieon pipelines on DNAnexus. Request access today to see how Sentieon DNAseq can give you results identical to GATK at a fraction of the cost. In addition, Sentieon’s variant calling is deterministic: given identical input data, Sentieon will always call the same set of variants. Utilizing Sentieon’s tools on the DNAnexus Platform, clinicians and researchers can perform accurate and cost-effective analysis of petabyte-scale datasets with ease, seamlessly running analyses of an arbitrary number of samples simultaneously in the cloud.

Many of our customers use Sentieon tools to call variants from human samples. The typical human genome contains some 4.5 million variants relative to the human reference genome. While almost all of these variants are inherited, every individual has approximately 50 de novo variants, which occur uniquely in their genome. De novo variants are some of the most interesting genetic variants to study: they frequently cause rare sporadic diseases such as KBG syndrome and have been implicated in complex disorders such as autism.

In this post, we’ll demonstrate the power of running Sentieon tools on DNAnexus by performing alignment with BWA, duplicate removal, base-quality score recalibration, indel realignment, haplotype-based variant calling and joint genotyping of a 30x whole-genome trio. Using these data, it is possible to identify de novo variants, determine the parental origin of some interesting inherited mutations, and examine the carrier status of the child in the trio for rare recessive mutations. With the Rapid DNAseq app on DNAnexus, processing an entire trio takes about an hour. Whether you have a cohort of three or 3,000, by leveraging the power of the DNAnexus Platform and the scalability of the cloud, any size cohort can be processed incredibly fast.

Running analyses on DNAnexus

For this trio analysis, we used data from the Illumina Platinum Genomes dataset for individuals NA12878, NA12891, and NA12892, downsampled to 30x. The original FASTQ files can be found at the European Nucleotide Archive. To process the data, we used the Sentieon Rapid DNAseq app on DNAnexus. We called variants in gVCF mode and input the gVCF files into the Sentieon GVCFtyper, resulting in a single multi-sample VCF file for the entire trio. We easily accomplished this by using the DNAnexus workflow shown below.

In total the analysis took just 73 minutes.

We performed the same analysis with the original 50x dataset in one hour and 46 minutes. Runtimes scale approximately linearly with the input coverage.
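As a quick check on the scaling claim, the short calculation below compares the two runtimes reported in this post (73 minutes at 30x, 106 minutes at 50x). The function name is ours, chosen for illustration, and two data points are only a rough indication of a trend.

```python
def minutes_per_x(runtime_minutes, coverage):
    """Runtime divided by coverage depth (minutes per 1x of coverage)."""
    return runtime_minutes / coverage

# Runtimes reported in this post: coverage -> total minutes.
runs = {30: 73, 50: 106}

coverage_ratio = 50 / 30            # how much more data the 50x run holds
runtime_ratio = runs[50] / runs[30] # how much longer the 50x run took
strictly_linear_50x = runs[30] * coverage_ratio  # prediction if perfectly proportional

print(f"coverage ratio: {coverage_ratio:.2f}")
print(f"runtime ratio:  {runtime_ratio:.2f}")
print(f"minutes per x:  {minutes_per_x(73, 30):.2f} (30x), {minutes_per_x(106, 50):.2f} (50x)")
print(f"strictly linear prediction for 50x: {strictly_linear_50x:.0f} minutes")
```

The 50x run finished somewhat faster than strict proportionality would predict (about 122 minutes), which is consistent with "approximately linear" scaling.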

We identified 2,458 de novo mutations in NA12878, well above the expected 50. This excess has previously been attributed to somatic mutations in the primary cells or to mutations introduced during immortalization and subsequent passage of the sequenced cells. We can also see that NA12878 is heterozygous for both rs2472297 and rs6968865, which have been associated with increased coffee consumption.
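To illustrate the idea behind trio-based de novo detection, here is a minimal sketch. It is not Sentieon's method, only the basic Mendelian check: a child genotype that cannot be assembled from one maternal and one paternal allele is flagged as a candidate. The genotype tuples (0 for the reference allele, 1 for an alternate allele), the site names, and the function names are all hypothetical.

```python
def is_mendelian_consistent(child, mother, father):
    """True if the child's two alleles can come one from each parent."""
    a, b = child
    return (a in mother and b in father) or (a in father and b in mother)

def de_novo_candidates(sites):
    """Return names of sites where the child's genotype violates Mendelian inheritance.

    `sites` maps a site name to (child, mother, father) genotype tuples.
    """
    return [
        name
        for name, (child, mother, father) in sites.items()
        if not is_mendelian_consistent(child, mother, father)
    ]

# Hypothetical genotypes, for illustration only.
example_sites = {
    "site_A": ((0, 1), (0, 0), (0, 0)),  # alternate allele absent from both parents
    "site_B": ((0, 1), (0, 1), (0, 0)),  # alternate allele inherited from the mother
    "site_C": ((1, 1), (0, 1), (0, 1)),  # one alternate allele from each parent
}

print(de_novo_candidates(example_sites))  # ['site_A']
```

A check like this flags genotyping errors as well as true de novo events, so candidates from a real call set are normally filtered further, for example on read depth and genotype quality.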

Utilizing the DNAnexus cloud-based platform and Sentieon tools, our Rapid DNAseq and joint genotyping runtimes easily scale to thousands of samples. You can view everything we ran in this public project: Rapid trio genotyping.

Register here for a free trial of the Rapid DNAseq tool.

DNAnexus Not Impacted by Cloudflare Information Leak (“Cloudbleed”)

A serious bug within the code running on Cloudflare edge servers may have leaked sensitive data from a large number of websites over many months. First, and most importantly, the DNAnexus Platform has not been impacted by this incident and no DNAnexus user data has been leaked.

Cloudflare provides Content Distribution Network (CDN) services, which enable providers of web content to enhance user experience by caching web content on edge servers geographically proximate to the web client. As part of a shared service, each edge server presents web content from multiple Cloudflare customers.

The bug caused the edge servers to return content entirely unrelated to the requested web content. That leaked content contained unencrypted private information such as HTTP cookies, authentication tokens, HTTP POST bodies, and other sensitive data. For example, a web request to a ride sharing service could have returned leaked content from a dating service. Search engines subsequently crawled and cached this leaked content, making it searchable.

DNAnexus uses the Cloudflare CDN service only to accelerate serving of public web content, such as website images, help text, and HTML/CSS. DNAnexus does not serve any credentials, tokens, or user data via the CDN. DNAnexus users are therefore not impacted by this bug, and no DNAnexus user information has been leaked.

DNAnexus users do not need to change their DNAnexus password, unless they use similar passwords for other websites that were affected. We strongly recommend that users always choose a unique password for their DNAnexus account and that they configure their account to use two-factor authentication as described in the DNAnexus wiki documentation.

If you have any questions about your account, please contact our customer support team at support@dnanexus.com.