DNAnexus Introduces Faster Cloud Options

Spring has arrived at DNAnexus, ushering in important updates! We are excited to announce that, starting May 1, 2014, your analyses on DNAnexus will be faster, thanks to new instance types.

What does that really mean? Here’s an example before we dive into all the details: a specific exome pipeline (e.g., BWA-MEM, GATK-Lite) now runs in less than 4 hours! Previously, the run would have taken nearly 6 hours, so that is roughly a one-third reduction in run time.

New instance types

We believe, and hope you do too, that DNAnexus is the best choice for expanding your genomic analysis infrastructure. Unlike local equipment, which starts collecting dust in your server room from day one while technological advances pile up, the cloud stays at the forefront of computing technology as newer, faster hardware becomes available.

These new hardware options come in the form of new instance types (virtual computer configurations) on which your cloud analyses can run. Thanks to the flexibility and reproducibility of the DNAnexus platform, you can start using these new instance types right away: simply launch your existing analyses on one of the new instance types (e.g., using the “--instance-type <…>” option of our “dx run” command-line tool) and enjoy a completely effortless hardware upgrade!

The new instance types are built on high-frequency Intel® processors of the Sandy Bridge and Ivy Bridge microarchitectures, support the Intel® Advanced Vector Extensions (Intel® AVX), and have solid-state drive (SSD) local storage technology for fast I/O performance.

The following table summarizes these new instance types. For a given column (which represents a certain number of cores and local storage capacity), there are up to three different instance types to choose from (with different amounts of memory). Overall these new instance types span a large spectrum, starting at 2 cores, 32 GB SSD, and 3.8 GB RAM, all the way to 32 cores, 640 GB SSD, and 244 GB RAM:

[Table: summary of new instance types]

In an effort to be more informative and transparent, we have also come up with a new, easy to remember, and consistent naming scheme:

  • The prefix (mem1, mem2, or mem3) denotes the memory capacity per core;
  • the infix (ssd1) denotes that these instances have solid-state drive technology;
  • the suffix (x2 through x32) denotes the number of cores.
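
To make the scheme concrete, here is a minimal Python sketch that splits an instance type name into its three parts. It rests on assumptions not stated above: that the parts are joined by underscores (as in "mem1_ssd1_x2"), and that the helper names `parse_instance_type` and `dx_run_args`, as well as the applet name "my_exome_pipeline", are purely illustrative and not part of any DNAnexus tool. The only piece taken from the platform itself is the "--instance-type" option of "dx run" mentioned earlier.

```python
import re
from typing import List, NamedTuple

# Assumption: prefix, infix, and suffix are joined by underscores,
# e.g. "mem1_ssd1_x2". The infix may be "ssd1" (new instance types)
# or "hdd2" (existing instance types under the new naming scheme).
_NAME_PATTERN = re.compile(
    r"^mem(?P<mem>[1-3])_(?P<storage>ssd1|hdd2)_x(?P<cores>\d+)$"
)


class InstanceType(NamedTuple):
    memory_tier: int  # 1, 2, or 3: memory capacity per core (prefix)
    storage: str      # "ssd1" or "hdd2": local storage technology (infix)
    cores: int        # number of cores (suffix)


def parse_instance_type(name: str) -> InstanceType:
    """Split an instance type name into prefix, infix, and suffix."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognized instance type name: {name!r}")
    return InstanceType(
        memory_tier=int(match.group("mem")),
        storage=match.group("storage"),
        cores=int(match.group("cores")),
    )


def dx_run_args(executable: str, instance_type: str) -> List[str]:
    """Build (but do not execute) a "dx run" argument list.

    The name is validated first so that a typo is caught locally.
    """
    parse_instance_type(instance_type)
    return ["dx", "run", executable, "--instance-type", instance_type]


if __name__ == "__main__":
    print(parse_instance_type("mem1_ssd1_x2"))
    # "my_exome_pipeline" is a hypothetical applet name.
    print(" ".join(dx_run_args("my_exome_pipeline", "mem3_ssd1_x32")))
```

The sketch only builds the argument list; passing it to something like `subprocess.run` would launch the analysis, provided the dx toolkit is installed and you are logged in.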

New names for existing instance types

We liked the convenient new naming scheme so much that we have applied it to existing instance types as well, as shown in the following table.

Compared to the new instance types mentioned earlier, the existing instance types are distinguished by a different storage infix (hdd2), given their regular hard disk drive technology. More information is available on our wiki page, which explains the new naming conventions and includes a detailed list of all instance types.

[Table: new names for existing instance types]

To ease the transition, existing instances can currently be called by either their original name or the new name; the DNAnexus system understands both. However, we encourage you to adopt the new names in a timely manner to avoid any future interruption.

We are very excited to announce these important updates, and we cannot wait to hear your success stories. Drop us a note at support@dnanexus.com if you’d like to get in touch with us.

2014: The Year of the Cloud


The Chinese New Year is almost upon us, and the Year of the Horse has us thinking about what 2014 will bring to the world of DNA sequencing. We believe that this will turn out to be the year of cloud computing. Here are a few of the trends that we’re watching:

Availability of large-scale genome studies. At one point, the 1000 Genomes Project was operating on a scale all its own. Today, many organizations are participating in large-scale sequencing projects to study thousands or even millions of people. As that data makes its way into the public realm, the demand for computational resources will soar. Accessing, querying, and manipulating these data sets will present a real challenge to IT teams with bursty episodes of unusually high demand mixed with the regular stream they normally see. That’s precisely the kind of environment where cloud computing makes the most sense: having unlimited on-demand compute resources allows IT teams to meet any infrastructure needs without having to spend the money on scaling up internal resources.

The new human reference genome. The Genome Reference Consortium has released build 38 of the human genome (known as GRCh38). This is a major improvement over the last build. Once the reference has been fully annotated, scientists around the world will want to dust off their existing human data sets and realign them to the updated reference to see if there are any new insights to be had. That will mean a short-term, high-intensity spike in demand for computational resources as these massive alignments are processed — in other words, the perfect occasion to try out cloud computing. It’s the cheapest possible way to add extensive computational resources without the long-term commitment to on-premises infrastructure.

Sequencing costs keep falling. The massive genomic studies underway have all been enabled by the rapidly falling cost of DNA sequencing — a trend that promises to continue, thanks to Illumina’s recent announcement and efforts from startups still working to commercialize innovative new methods for sequencing. As sequencing a genome gets ever more affordable, demand for the resources to process and analyze that data will grow at a faster and faster pace. Trying to keep up with this demand will be an uphill battle for IT teams focused only on internal infrastructure, so we see this leading to interest in how cloud computing can help relieve the pressure from those teams to add boxes and storage components.

Growing number of analysis apps. The ecosystem of available tools for performing specific steps or types of DNA analysis is expanding rapidly. As scientists and bioinformaticians find a growing need to build pipelines utilizing a number of these tools, the ease of doing so in a cloud environment will make this option even more appealing.

Here at DNAnexus, we’re eager for what’s to come in 2014. We have a number of collaborations underway with academic and commercial R&D organizations, and we look forward to sharing details about them with our blog readers in the months ahead. Here’s to the Year of the Cloud and a great and productive year for the biomedical community!

Keep Your HIPAA-Protected Data Safer with Cloud Computing

If you’ve been considering the implications of cloud computing when it comes to HIPAA compliance, a new article in Healthcare IT News is worth a read.

The article, penned by our own general counsel Lee Bendekgey, is entitled “Cloud computing reduces HIPAA compliance risk in managing genomic data.” In it, Lee looks at the massive computational infrastructure required for handling new health data, such as genome sequences. “The resources required to process, analyze, and manage petabytes of genomic information represent a huge burden for even the largest academic research facility or healthcare institution,” Lee writes.

While it may seem counterintuitive, he adds, moving data to a cloud environment can actually improve data security. Lee considers HIPAA requirements and historic breaches of HIPAA-secured data, looking at what factors may have improved security in those situations where personal health information was put at risk.

Breaches tend to occur on items that are portable — flash drives and laptops, for instance — so keeping data in the cloud means that sensitive data never actually lives on one of these easily stolen or lost devices. Cloud computing providers routinely encrypt data while it’s in transit and at rest, adding to high-grade security. Medical organizations considering this route should ensure that a cloud provider guarantees security audits, certifications, and assessments associated with HIPAA compliance.

“By using a cloud-based service with an appropriate security and compliance infrastructure, an organization can significantly reduce its compliance risk,” Lee writes.