2014: The Year of the Cloud

 

The Chinese New Year is almost upon us, and the Year of the Horse has us thinking about what 2014 will bring to the world of DNA sequencing. We believe that this will turn out to be the year of cloud computing. Here are a few of the trends that we’re watching:

Availability of large-scale genome studies. At one point, the 1000 Genomes Project was operating on a scale all its own. Today, many organizations are participating in large-scale sequencing projects to study thousands or even millions of people. As that data makes its way into the public realm, the demand for computational resources will soar. Accessing, querying, and manipulating these data sets will present a real challenge to IT teams, who will face bursty episodes of unusually high demand on top of the regular workload they normally see. That’s precisely the kind of environment where cloud computing makes the most sense: having on-demand compute resources that scale with the workload allows IT teams to meet peak infrastructure needs without having to spend the money on scaling up internal resources.

The new human reference genome. The Genome Reference Consortium has released build 38 of the human genome (known as GRCh38). This is a major improvement over the last build. Once the reference has been fully annotated, scientists around the world will want to dust off their existing human data sets and realign them to the updated reference to see if there are any new insights to be had. That will mean a short-term, high-intensity spike in demand for computational resources as these massive alignments are processed. In other words, it is the perfect occasion to try out cloud computing, which lets a team add extensive computational resources for a short period without the long-term commitment to on-premises infrastructure.

Sequencing costs keep falling. The massive genomic studies underway have all been enabled by the rapidly falling cost of DNA sequencing, a trend that promises to continue thanks to Illumina’s recent announcement and efforts from startups still working to commercialize innovative new sequencing methods. As sequencing a genome gets ever more affordable, demand for the resources to process and analyze that data will grow at an accelerating pace. Keeping up with this demand will be an uphill battle for IT teams focused only on internal infrastructure, so we expect growing interest in how cloud computing can relieve the pressure on those teams to add servers and storage.

Growing number of analysis apps. The ecosystem of available tools for performing specific steps or types of DNA analysis is expanding rapidly. As scientists and bioinformaticians find a growing need to build pipelines utilizing a number of these tools, the ease of doing so in a cloud environment will make this option even more appealing.

Here at DNAnexus, we’re eager for what’s to come in 2014. We have a number of collaborations underway with academic and commercial R&D organizations, and we look forward to sharing details about them with our blog readers in the months ahead. Here’s to the Year of the Cloud and a great and productive year for the biomedical community!

Keep Your HIPAA-Protected Data Safer with Cloud Computing

If you’ve been considering the implications of cloud computing when it comes to HIPAA compliance, a new article in Healthcare IT News is worth a read.

The article, penned by our own general counsel Lee Bendekgey, is entitled “Cloud computing reduces HIPAA compliance risk in managing genomic data.” In it, Lee looks at the massive computational infrastructure required for handling new health data, such as genome sequences. “The resources required to process, analyze, and manage petabytes of genomic information represent a huge burden for even the largest academic research facility or healthcare institution,” Lee writes.

While it may seem counterintuitive, he adds, moving data to a cloud environment can actually improve data security. Lee considers HIPAA requirements and historic breaches of HIPAA-secured data, looking at what factors may have improved security in those situations where personal health information was put at risk.

Breaches tend to involve portable items such as flash drives and laptops, so keeping data in the cloud means that sensitive data never actually lives on one of these easily stolen or lost devices. Cloud computing providers routinely encrypt data both in transit and at rest, which adds a further layer of protection. Medical organizations considering this route should ensure that a cloud provider guarantees the security audits, certifications, and assessments associated with HIPAA compliance.

“By using a cloud-based service with an appropriate security and compliance infrastructure, an organization can significantly reduce its compliance risk,” Lee writes.

Cloud Computing Insight with Omar Serang

Omar Serang, our new Chief Cloud Officer, came to us from Amazon Web Services, where he formed the Enterprise IT Cloud Transformation consulting practice. Before that, he served as Amazon’s EC2 operations engineering manager. We chatted with Omar to get an expert’s view on cloud computing — from how it started to new innovations on their way and where DNAnexus fits in.

Q: What were the key concerns people had about cloud computing when it first began?

A: You really can’t look at any part of cloud computing without looking at Amazon as being the vanguard. When Amazon first came out with their cloud computing offering, it was perceived as a huge retail company that had excess capacity on its platform mainly because they have to engineer for very peaky volumes such as Cyber Monday. They got an idea to sell this capacity instead of having it just sit there. Early days were quite challenging for cloud: stability was certainly more of an issue five years ago, and security and compliance were probably the biggest barriers for people putting financial data or other sensitive data in the cloud.

Q: How did Amazon Web Services overcome those initial concerns and lead to a more accepted version of cloud computing?

A: Amazon made some very significant inroads in a number of ways. First was service quality; they are constantly learning to minimize failures. The excellence they’ve brought to bear has ended up demonstrating itself through increased platform stability and performance.

Also, they got certified for a huge raft of compliance regimes: some related to the federal government, plus a broad range of certifications around SOC 1, SOC 2, and ISO 27001. These were certifications both for operational excellence and for security best practices, and they went a long way toward opening people’s eyes to AWS as an option. They capped that off by doing marquee projects like FinQloud with NASDAQ, which is all about financial information security. That really served to knock down barriers to cloud adoption.

Finally, Amazon did a lot of work to expose the total cost of ownership so people could compare their on-premise versus cloud costs for the same thing. A study came out in Germany that showed a company could run an SAP cluster for 70 percent less cost than their on-premise dedicated infrastructure.

Q: Did any particular event help clear up security concerns around data in the cloud?

A: In late 2011, the CIA technology chief stated at an AWS conference that the cloud is more secure than traditional approaches. When that happened, IT managers started taking a serious look at how they could create reliable and secure infrastructures in the cloud that would actually surpass the reliability and security of their on-premise infrastructures.

Q: How is the cloud evolving right now?

A: The macro model that’s emerged includes infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS). Infrastructure as a service is pretty well accepted. There are still some holdouts — people who think it’s better to do a dedicated setup — but I really feel like that is going to become a dying breed in three to five years. The idea that people can do it by themselves better than it can be done in the cloud is starting to recede.

What I’m seeing emerge now is this concept of managed services that sit on top of the infrastructure as a service layer. This is where DNAnexus comes in: providing a PaaS that runs on top of infrastructure as a service, and offers a really high degree of value in software and expertise. We’re finding that the real attraction is this concept of a managed service which brings to bear compute, software, and analytics along with the expertise that we have in genetic science, bioinformatics, and cloud computing. It’s not just a bunch of computers running in the cloud. It’s a group of people and a platform that they’ve developed based on their genomic analysis expertise that allows customers to take advantage of the cloud in a way that’s frictionless, secure and compliant, and enabling for their collaborative efforts.

Q: What are some upcoming innovations that will affect this space?

A: I think that global reach will increase for cloud providers. Storage — specifically block storage — is also going to see some major advances. I expect to see some advances in direct-attached storage instead of network-attached storage. Another will be long-haul networking; that will evolve in some very interesting ways.

Q: What drew you to DNAnexus?

A: The real key here is that there are two disruptive technologies coming together: next-generation sequencing and cloud computing. They need each other because of the bursty nature of genomic analysis, but you also need a wrapper of HIPAA and CLIA compliance and PHI protection all around it. It’s a really compelling story and it’s what makes DNAnexus really exciting: it’s at the confluence of these two game-changing technologies.